Bug 471289 - automount/x-systemd-automount nfs resource changes inode on every mount and so reindexes is triggered
Status: RESOLVED WORKSFORME
Alias: None
Product: frameworks-baloo
Classification: Frameworks and Libraries
Component: Engine
Version: 5.102.0
Platform: openSUSE Linux
Importance: NOR major
Target Milestone: ---
Assignee: baloo-bugs-null
Reported: 2023-06-21 11:39 UTC by Diego Ercolani
Modified: 2024-08-06 03:46 UTC
Description Diego Ercolani 2023-06-21 11:39:21 UTC
I want to index files that are shared over NFS by a TrueNAS appliance.
According to the log file, Baloo reindexes files that are already indexed every day. This has two drawbacks:

1. the index file size is exploding
2. the reindex is triggered without a "real" need.

This is what I'm talking about.
This is the state after a file has been correctly reindexed:

diego@pc-diego:~: balooshow -x "/net/fileserver4/Segreteria/XXXX.pdf"
1879000000fb 251 6265 /net/fileserver4/Segreteria/XXXX.pdf
        Mtime: 1583398981 2020-03-05T10:03:01
        Ctime: 1672915819 2023-01-05T11:50:19
        Cached properties:
                Autore: xxx.yyy
                Titolo: Spett
                Documento generato da: Microsoft® Word 2016
                Conto delle pagine: 1
                Data di creazione: 2020-03-05T09:03:01.000Z

Informazioni interne
Termini: x x x x
pageCount: 1
generator: 2016 microsoft word ®
author: xxx yyy
title: spett
creationDate: 2020-03-05T09:03:01Z

After a while, systemd unmounts the directory.
If I ask Baloo again once it has been remounted:
diego@pc-diego:~: balooshow -x "/net/fileserver4/Segreteria/XXXX.pdf"
18790010000c 1048588 6265 /net/fileserver4/Segreteria/XXXX.pdf: nessuna informazione trovata nell'indice

The message means "no information found in the index", which tells me that Baloo has completely forgotten the file.
But if I run:
baloosearch -i "XXXX.pdf"
Baloo finds the file correctly:
187900000084 /net/fileserver4/Segreteria/XXXX.pdf
187900100010 /net/fileserver4/Segreteria/XXXX.pdf
18790010000f /net/fileserver4/Segreteria/XXXX.pdf
1879000000fb /net/fileserver4/Segreteria/XXXX.pdf
1879000000eb /net/fileserver4/Segreteria/XXXX.pdf
1879000000d8 /net/fileserver4/Segreteria/XXXX.pdf
1879000000d4 /net/fileserver4/Segreteria/XXXX.pdf
1879000000d3 /net/fileserver4/Segreteria/XXXX.pdf
1879000000c0 /net/fileserver4/Segreteria/XXXX.pdf

but, as you can see, it also finds several other IDs for the same path.
Comment 1 tagwerk19 2023-06-21 12:25:15 UTC
(In reply to Diego Ercolani from comment #0)
> ...
> 187900000084 /net/fileserver4/Segreteria/XXXX.pdf
> 187900100010 /net/fileserver4/Segreteria/XXXX.pdf
> 18790010000f /net/fileserver4/Segreteria/XXXX.pdf
> 1879000000fb /net/fileserver4/Segreteria/XXXX.pdf
> ...
That looks like the inode is OK but the device number is changing each reboot/remount. This is something that's been affecting BTRFS mounts, interesting to see that it's catching NFS as well.
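
The split can be read straight off the IDs quoted above. A minimal Python sketch, assuming the layout suggested by the balooshow output in comment #0 (where 1879000000fb is printed next to 251 and 6265): the lower 32 bits of the ID hold the device number and the bits above them hold the inode. `split_doc_id` is a hypothetical helper for illustration, not part of Baloo.

```python
def split_doc_id(doc_id_hex: str) -> tuple[int, int]:
    """Split a hex document ID into (inode, device).

    Assumption: low 32 bits = device number, remaining high bits = inode,
    as inferred from the balooshow output in this report.
    """
    doc_id = int(doc_id_hex, 16)
    device = doc_id & 0xFFFFFFFF  # lower 32 bits
    inode = doc_id >> 32          # everything above them
    return inode, device


# IDs that baloosearch -i returned for the same path.
reported_ids = ["187900000084", "187900100010", "18790010000f", "1879000000fb"]

for hex_id in reported_ids:
    inode, device = split_doc_id(hex_id)
    print(f"{hex_id}: inode={inode} device={device}")
```

Under that assumption, all four IDs decode to inode 6265 and differ only in the device part (132, 1048592, 1048591 and 251).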

There is a merge request drafted, specifically for the BTRFS case, that might also deal with this:
    https://invent.kde.org/frameworks/baloo/-/merge_requests/131

I might wave a flag for a better (but likely too difficult) solution: servers index the content they host locally, and clients that mount such resources forward search queries to the hosts for resolution there. I can see that this would need too many pieces to be in place and working together.
Comment 2 Tomas Trnka 2023-07-04 08:35:46 UTC
(In reply to tagwerk19 from comment #1)
> That looks like the inode is OK but the device number is changing each
> reboot/remount. This is something that's been affecting BTRFS mounts,
> interesting to see that it's catching NFS as well.

Yes, of the most common filesystems on Linux, Btrfs, NFS, and CIFS (a.k.a. SMB) all use dynamically allocated device numbers, so they are all affected the same way.

> There is a merge request drafted, specifically for the BTRFS case, that
> might also deal with this:
>     https://invent.kde.org/frameworks/baloo/-/merge_requests/131

It will probably help, depending on server configuration. The Linux NFS server will by default use the FSID of the underlying filesystem when presenting an export to clients, so the Linux NFS client should expose a unique FSID. However, the server could be running something like XFS as the underlying filesystem (which does not have a stable FSID), or the exported FSID can be overridden in server configuration, possibly making it non-unique across servers. If server A exports a custom fsid=123 and server B exports a different filesystem with the same custom fsid, a client mounting both filesystems will see two unrelated trees with the same FSID.
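
To make the collision concrete, here is a small Python sketch. It assumes a document ID is built by packing the inode above a 32-bit device identifier (a layout inferred from the IDs in comment #0, not taken from Baloo's source); `make_doc_id` and the inode values are hypothetical.

```python
DEVICE_BITS = 32  # assumption: the device identifier occupies the low 32 bits


def make_doc_id(inode: int, device_id: int) -> int:
    """Pack an inode and a device identifier into one integer ID (hypothetical)."""
    return (inode << DEVICE_BITS) | (device_id & 0xFFFFFFFF)


# Two unrelated exports that both carry the custom fsid=123 from the example.
# If two different files happen to share an inode number, their IDs are equal.
id_on_server_a = make_doc_id(inode=6265, device_id=123)
id_on_server_b = make_doc_id(inode=6265, device_id=123)

collides = id_on_server_a == id_on_server_b
print(f"{id_on_server_a:x} vs {id_on_server_b:x}: collision={collides}")
```

With distinct FSIDs the two IDs would differ even when the inode numbers match, which is why uniqueness of the FSID across servers matters here.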
Comment 3 Nate Graham 2023-09-07 15:52:51 UTC
Git commit c735faf5a6a3ef3d29882552ad0a9264a294e038 by Nate Graham, on behalf of Tomáš Trnka.
Committed on 07/09/2023 at 17:52.
Pushed by ngraham into branch 'master'.

Use the FSID as the device identifier where possible

The device number returned by stat() in st_dev is not persistent in many
cases. Btrfs subvolumes or partitions on NVMe devices are assigned
device numbers dynamically, so the resulting device ID is typically
different after every reboot, forcing Baloo to repeatedly reindex all
files.

Fortunately, filesystems like Btrfs or ext4 return a persistent
unique filesystem ID as f_fsid from statvfs(), so we can use that when
available. Other filesystems like XFS derive the FSID from the device
number of the underlying block device, so switching to the FSID does not
change anything.
Related: bug 402154

M  +21   -1    src/engine/idutils.h

https://invent.kde.org/frameworks/baloo/-/commit/c735faf5a6a3ef3d29882552ad0a9264a294e038
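
The idea in the commit message can be sketched in Python for experimenting on a mounted path. This is not the code from the commit: `device_identifier` and `describe` are hypothetical helpers, and treating an f_fsid of 0 as "not available" is an assumption made for this sketch.

```python
import os


def device_identifier(path: str) -> int:
    """Prefer the filesystem ID from statvfs(), fall back to st_dev from stat()."""
    fsid = os.statvfs(path).f_fsid
    if fsid != 0:  # assumption: 0 means the filesystem reports no usable FSID
        return fsid
    return os.stat(path).st_dev


def describe(path: str) -> dict:
    """Collect the values that matter for identifying a file across remounts."""
    st = os.stat(path)
    return {
        "st_dev": st.st_dev,
        "st_ino": st.st_ino,
        "f_fsid": os.statvfs(path).f_fsid,
        "chosen": device_identifier(path),
    }


if __name__ == "__main__":
    # Run once before and once after a remount and compare the values.
    print(describe("/"))
```

Running `describe()` on a file under the automounted share before and after systemd unmounts it should show whether st_dev changes while f_fsid stays the same on a given setup.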
Comment 4 Bug Janitor Service 2023-09-07 15:54:47 UTC
A possibly relevant merge request was started @ https://invent.kde.org/frameworks/baloo/-/merge_requests/169
Comment 5 Nate Graham 2023-09-07 15:56:45 UTC
Git commit 7a1c09ed1b7aa9a7a093f8d715b42a4aedd0f7b6 by Nate Graham, on behalf of Tomáš Trnka.
Committed on 07/09/2023 at 17:54.
Pushed by ngraham into branch 'kf5'.

Use the FSID as the device identifier where possible

The device number returned by stat() in st_dev is not persistent in many
cases. Btrfs subvolumes or partitions on NVMe devices are assigned
device numbers dynamically, so the resulting device ID is typically
different after every reboot, forcing Baloo to repeatedly reindex all
files.

Fortunately, filesystems like Btrfs or ext4 return a persistent
unique filesystem ID as f_fsid from statvfs(), so we can use that when
available. Other filesystems like XFS derive the FSID from the device
number of the underlying block device, so switching to the FSID does not
change anything.
Related: bug 402154


(cherry picked from commit c735faf5a6a3ef3d29882552ad0a9264a294e038)

M  +21   -1    src/engine/idutils.h

https://invent.kde.org/frameworks/baloo/-/commit/7a1c09ed1b7aa9a7a093f8d715b42a4aedd0f7b6
Comment 6 Maximilian Böhm 2023-09-11 13:24:40 UTC
Can this important fix be backported to Frameworks 5.x?
Comment 7 tagwerk19 2024-07-07 15:09:26 UTC
I am assuming the dust has settled here after:
    https://invent.kde.org/frameworks/baloo/-/merge_requests/131
and cherry-picked for KF5
    https://invent.kde.org/frameworks/baloo/-/merge_requests/169
These were aimed at BTRFS but seemingly they'll address the NFS issues. Thank you Tomas!

There have also been a couple of other useful patches that I reference for completeness:
     https://invent.kde.org/frameworks/baloo/-/merge_requests/121
and
     https://invent.kde.org/frameworks/baloo/-/merge_requests/148

I'll set this to "WaitingForInfo" for the interim. If you are still hitting trouble with NFS, add a comment and reset the status.
Comment 8 Bug Janitor Service 2024-07-22 03:46:21 UTC
Dear Bug Submitter,

This bug has been in NEEDSINFO status with no change for at least
15 days. Please provide the requested information as soon as
possible and set the bug status as REPORTED. Due to regular bug
tracker maintenance, if the bug is still in NEEDSINFO status with
no change in 30 days the bug will be closed as RESOLVED > WORKSFORME
due to lack of needed information.

For more information about our bug triaging procedures please read the
wiki located here:
https://community.kde.org/Guidelines_and_HOWTOs/Bug_triaging

If you have already provided the requested information, please
mark the bug as REPORTED so that the KDE team knows that the bug is
ready to be confirmed.

Thank you for helping us make KDE software even better for everyone!
Comment 9 Bug Janitor Service 2024-08-06 03:46:34 UTC
🐛🧹 This bug has been in NEEDSINFO status with no change for at least 30 days. Closing as RESOLVED WORKSFORME.