SUMMARY
Baloo seems to index my files more than once.

STEPS TO REPRODUCE
My home directory has 2.4 M files in it:

$ find /home/rainer/ -type f | wc -l
2402092

But baloo thinks I have 6 M files:

$ balooctl status
kf.i18n: KLocalizedString: Using an empty domain, fix the code. msgid: "Unknown" msgid_plural: "" msgctxt: ""
kf.i18n: KLocalizedString: Using an empty domain, fix the code. msgid: "Idle" msgid_plural: "" msgctxt: ""
Baloo File Indexer is running
Indexer state: Idle
Total files indexed: 6,396,166
Files waiting for content indexing: 0
Files failed to index: 100
Current size of index is 46.45 GiB

And if I run "baloosearch something", it outputs each result three or four times.

SOFTWARE/OS VERSIONS
Linux: OpenSuse Tumbleweed 20221112
KDE Plasma Version: 5.26.3
KDE Frameworks Version: 5.99.0
Qt Version: 5.15.7
Have a look at: https://bugs.kde.org/show_bug.cgi?id=402154#c12

It seems that BTRFS with multiple subvols gives "varying" device numbers, ones that are not stable from reboot to reboot. Baloo relies on a combination of the device number and the inode to provide an "id" for the file, and expects a one-to-one relation between the "id" and the filename. If you are using BTRFS with multiple subvols, this breaks down. This catches OpenSuse / Tumbleweed :-(

Marking this as Confirmed for now; will probably flag it as a duplicate of Bug 402154 in due course.
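To illustrate why an unstable device number multiplies index entries, here is a minimal Python sketch. It is not Baloo's code: the (device, inode) tuple is only a stand-in for whatever combined "id" Baloo builds, and the function names, the sample path, and the device numbers 50, 52 and 55 are all made up for the example.

```python
import os
from collections import defaultdict


def file_id(path):
    """Return a (device number, inode) pair for a path.

    Stand-in for an id built from device number and inode;
    the real encoding used by the indexer may differ.
    """
    st = os.stat(path)
    return (st.st_dev, st.st_ino)


def simulate_index(observations):
    """Record every id under which each filename was seen.

    observations: iterable of (device, inode, filename) tuples,
    one per scan of the file.
    """
    index = defaultdict(set)
    for dev, ino, name in observations:
        index[name].add((dev, ino))
    return index


# Hypothetical scans of ONE file over four boots: the inode never
# changes, but the device number does (assumed values).
scans = [
    (50, 1234, "/home/user/notes.txt"),
    (52, 1234, "/home/user/notes.txt"),
    (50, 1234, "/home/user/notes.txt"),  # same number as first boot: no new id
    (55, 1234, "/home/user/notes.txt"),
]

index = simulate_index(scans)
entries = sum(len(ids) for ids in index.values())
print(f"{len(index)} file on disk, {entries} ids in the index")
```

In this toy model the single file ends up under three distinct ids, which would match the symptom of a search result being listed three or four times. Running `file_id()` on the same path before and after a reboot is one way to check whether the device number is actually changing on a given system.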
Also, see the earlier Bug 401863.
*** This bug has been marked as a duplicate of bug 401863 ***
I don't have btrfs, I'm on ext4. But I had to copy my home directory once to another device, due to a failing disk. This would explain duplicate entries, but not the same file occurring three or four times. However, I'm on OpenSuse Tumbleweed, with new versions of kernels, systemd, etc. every few days, so maybe devices are getting enumerated differently from time to time, and device IDs might not be as stable as one might assume, even for non-btrfs volumes. So I think this is indeed a duplicate!