I want to index files that are shared over NFS by a TrueNAS appliance. According to the log file, Baloo triggers a reindex of already-indexed files every day. This has two drawbacks:
1. the index file size is exploding
2. the reindex is triggered without any "real" need

This is what I'm talking about. Right after a file has been correctly reindexed:

diego@pc-diego:~: balooshow -x "/net/fileserver4/Segreteria/XXXX.pdf"
1879000000fb 251 6265 /net/fileserver4/Segreteria/XXXX.pdf
        Mtime: 1583398981 2020-03-05T10:03:01
        Ctime: 1672915819 2023-01-05T11:50:19
        Cached properties:
                Author: xxx.yyy
                Title: Spett
                Document Generated By: Microsoft® Word 2016
                Page Count: 1
                Creation Date: 2020-03-05T09:03:01.000Z
        Internal Info
        Terms: x x x x
        pageCount: 1
        generator: 2016 microsoft word ®
        author: xxx yyy
        title: spett
        creationDate: 2020-03-05T09:03:01Z

After a while the directory is unmounted by systemd. If I then ask Baloo again:

diego@pc-diego:~: balooshow -x "/net/fileserver4/Segreteria/XXXX.pdf"
18790010000c 1048588 6265 /net/fileserver4/Segreteria/XXXX.pdf: No index information found

This suggests that Baloo has completely forgotten the file. However, if I run:

baloosearch -i "XXXX.pdf"

Baloo does find it:

187900000084 /net/fileserver4/Segreteria/XXXX.pdf
187900100010 /net/fileserver4/Segreteria/XXXX.pdf
18790010000f /net/fileserver4/Segreteria/XXXX.pdf
1879000000fb /net/fileserver4/Segreteria/XXXX.pdf
1879000000eb /net/fileserver4/Segreteria/XXXX.pdf
1879000000d8 /net/fileserver4/Segreteria/XXXX.pdf
1879000000d4 /net/fileserver4/Segreteria/XXXX.pdf
1879000000d3 /net/fileserver4/Segreteria/XXXX.pdf
1879000000c0 /net/fileserver4/Segreteria/XXXX.pdf

but, as you can see, it also finds the same file under several other IDs.
(In reply to Diego Ercolani from comment #0)
> ...
> 187900000084 /net/fileserver4/Segreteria/XXXX.pdf
> 187900100010 /net/fileserver4/Segreteria/XXXX.pdf
> 18790010000f /net/fileserver4/Segreteria/XXXX.pdf
> 1879000000fb /net/fileserver4/Segreteria/XXXX.pdf
> ...

That looks like the inode is OK but the device number is changing on each reboot/remount. This is something that's been affecting BTRFS mounts; interesting to see that it's catching NFS as well.

There is a merge request drafted, specifically for the BTRFS case, that might also deal with this:
    https://invent.kde.org/frameworks/baloo/-/merge_requests/131

I might wave a flag for a better (but likely too difficult) solution: servers index the content they host locally, and clients that mount such resources forward search queries to the hosts for resolution there. I can see this would need too many bits to be in place and working together.
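To illustrate the point about the inode staying put while the device number moves, the IDs from comment #0 can be unpacked. This is only a sketch: it assumes the layout implied by the balooshow output itself (device number in the low 32 bits, inode in the bits above), and the function name split_doc_id is made up for this example.

```python
from typing import Tuple


def split_doc_id(hex_id: str) -> Tuple[int, int]:
    """Split a document ID printed by balooshow into (device, inode).

    Assumption: low 32 bits hold the device number, the remaining
    high bits hold the inode, as the balooshow lines in comment #0 imply.
    """
    value = int(hex_id, 16)
    device = value & 0xFFFFFFFF  # low 32 bits
    inode = value >> 32          # everything above
    return device, inode


# The two IDs balooshow reported for the same file, before and after the remount.
for hex_id in ("1879000000fb", "18790010000c"):
    device, inode = split_doc_id(hex_id)
    print(f"{hex_id}: device={device} inode={inode}")
```

Both IDs decode to inode 6265, matching what balooshow printed, while the device part differs (251 versus 1048588), so the same file ends up with a new ID after the remount.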
(In reply to tagwerk19 from comment #1)
> That looks like the inode is OK but the device number is changing each
> reboot/remount. This is something that's been affecting BTRFS mounts,
> interesting to see that it's catching NFS as well.

Yes, of the most common filesystems on Linux, Btrfs, NFS, and CIFS (a.k.a. SMB) all use dynamically allocated device numbers, so they are all affected the same way.

> There is a merge request drafted, specifically for the BTRFS case, that
> might also deal with this:
> https://invent.kde.org/frameworks/baloo/-/merge_requests/131

It will probably help, depending on server configuration. The Linux NFS server will by default use the FSID of the underlying filesystem when presenting an export to clients, so the Linux NFS client should expose a unique FSID. However, the server could be running something like XFS as the underlying filesystem (which does not have a stable FSID), or the exported FSID can be overridden in the server configuration, possibly making it non-unique across servers. If server A exports a custom fsid=123 and server B exports a different filesystem with the same custom fsid, a client mounting both filesystems will see two unrelated trees with the same FSID.
Git commit c735faf5a6a3ef3d29882552ad0a9264a294e038 by Nate Graham, on behalf of Tomáš Trnka.
Committed on 07/09/2023 at 17:52.
Pushed by ngraham into branch 'master'.

Use the FSID as the device identifier where possible

The device number returned by stat() in st_dev is not persistent in many cases. Btrfs subvolumes or partitions on NVMe devices are assigned device numbers dynamically, so the resulting device ID is typically different after every reboot, forcing Baloo to repeatedly reindex all files.

Fortunately, filesystems like Btrfs or ext4 return a persistent unique filesystem ID as f_fsid from statvfs(), so we can use that when available. Other filesystems like XFS derive the FSID from the device number of the underlying block device, so switching to the FSID does not change anything.

Related: bug 402154

M +21 -1 src/engine/idutils.h

https://invent.kde.org/frameworks/baloo/-/commit/c735faf5a6a3ef3d29882552ad0a9264a294e038
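For readers who want to see the idea in the commit message in a few lines, here is a rough Python sketch. It is not the code from the commit: the function names are hypothetical, and both the fallback to st_dev when f_fsid is zero and the folding of the FSID down to 32 bits are assumptions made for this illustration. os.stat() and os.statvfs() (with f_fsid, Python 3.7 and later, Unix only) are the only real calls used.

```python
import os


def choose_device_id(st_dev: int, f_fsid: int) -> int:
    """Prefer the filesystem ID over the device number.

    Assumptions of this sketch: a zero f_fsid means "not available",
    and a wider FSID is folded to 32 bits by XOR-ing its two halves.
    """
    if f_fsid:
        return (f_fsid ^ (f_fsid >> 32)) & 0xFFFFFFFF
    return st_dev & 0xFFFFFFFF


def device_id_for(path: str) -> int:
    """Look up st_dev and f_fsid for a path and pick the identifier."""
    st = os.stat(path)
    vfs = os.statvfs(path)
    return choose_device_id(st.st_dev, vfs.f_fsid)


def make_doc_id(device: int, inode: int) -> int:
    """Combine device and inode the way the IDs in comment #0 appear to be laid out."""
    return (inode << 32) | (device & 0xFFFFFFFF)


if __name__ == "__main__":
    path = "."
    print(f"{path}: device id {device_id_for(path)}, inode {os.stat(path).st_ino}")
```

If the FSID is stable across remounts, an identifier derived from it stays the same, and so does the combined ID, which is what stops the repeated reindexing described above.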
A possibly relevant merge request was started @ https://invent.kde.org/frameworks/baloo/-/merge_requests/169
Git commit 7a1c09ed1b7aa9a7a093f8d715b42a4aedd0f7b6 by Nate Graham, on behalf of Tomáš Trnka.
Committed on 07/09/2023 at 17:54.
Pushed by ngraham into branch 'kf5'.

Use the FSID as the device identifier where possible

The device number returned by stat() in st_dev is not persistent in many cases. Btrfs subvolumes or partitions on NVMe devices are assigned device numbers dynamically, so the resulting device ID is typically different after every reboot, forcing Baloo to repeatedly reindex all files.

Fortunately, filesystems like Btrfs or ext4 return a persistent unique filesystem ID as f_fsid from statvfs(), so we can use that when available. Other filesystems like XFS derive the FSID from the device number of the underlying block device, so switching to the FSID does not change anything.

Related: bug 402154

(cherry picked from commit c735faf5a6a3ef3d29882552ad0a9264a294e038)

M +21 -1 src/engine/idutils.h

https://invent.kde.org/frameworks/baloo/-/commit/7a1c09ed1b7aa9a7a093f8d715b42a4aedd0f7b6
Can this important fix be backported to Frameworks 5.x?
I am assuming the dust has settled here after:
    https://invent.kde.org/frameworks/baloo/-/merge_requests/131
which was cherry-picked for KF5 in:
    https://invent.kde.org/frameworks/baloo/-/merge_requests/169

These were aimed at BTRFS but seemingly they'll address the NFS issues as well. Thank you Tomas!

There have also been a couple of other useful patches that I reference for completeness:
    https://invent.kde.org/frameworks/baloo/-/merge_requests/121
and
    https://invent.kde.org/frameworks/baloo/-/merge_requests/148

I'll set this to "WaitingForInfo" in the interim. If you are still hitting trouble with NFS, add a comment and reset the status.
Dear Bug Submitter, This bug has been in NEEDSINFO status with no change for at least 15 days. Please provide the requested information as soon as possible and set the bug status as REPORTED. Due to regular bug tracker maintenance, if the bug is still in NEEDSINFO status with no change in 30 days the bug will be closed as RESOLVED > WORKSFORME due to lack of needed information. For more information about our bug triaging procedures please read the wiki located here: https://community.kde.org/Guidelines_and_HOWTOs/Bug_triaging If you have already provided the requested information, please mark the bug as REPORTED so that the KDE team knows that the bug is ready to be confirmed. Thank you for helping us make KDE software even better for everyone!
🐛🧹 This bug has been in NEEDSINFO status with no change for at least 30 days. Closing as RESOLVED WORKSFORME.