Bug 461820 - Baloo seems to index my files more than once.
Status: RESOLVED DUPLICATE of bug 401863
Alias: None
Product: frameworks-baloo
Classification: Frameworks and Libraries
Component: general
Version: 5.99.0
Platform: openSUSE Linux
Importance: NOR normal
Target Milestone: ---
Assignee: baloo-bugs-null
Reported: 2022-11-14 14:13 UTC by Rainer Sabelka
Modified: 2022-11-21 09:37 UTC
Description Rainer Sabelka 2022-11-14 14:13:12 UTC
SUMMARY
Baloo seems to index my files more than once.

STEPS TO REPRODUCE
My home directory has 2.4 M files in it:
$ find /home/rainer/ -type f | wc -l
2402092

But baloo thinks I have 6 M files:

$ balooctl status
kf.i18n: KLocalizedString: Using an empty domain, fix the code. msgid: "Unknown" msgid_plural: "" msgctxt: ""
kf.i18n: KLocalizedString: Using an empty domain, fix the code. msgid: "Idle" msgid_plural: "" msgctxt: ""
Baloo File Indexer is running
Indexer state: Idle
Total files indexed: 6,396,166
Files waiting for content indexing: 0
Files failed to index: 100
Current size of index is 46.45 GiB

And if I run "baloosearch something", it outputs each result three or four times.

SOFTWARE/OS VERSIONS
Linux: openSUSE Tumbleweed 20221112
KDE Plasma Version: 5.26.3
KDE Frameworks Version: 5.99.0
Qt Version: 5.15.7
Comment 1 tagwerk19 2022-11-20 17:55:00 UTC
Have a look at:
    https://bugs.kde.org/show_bug.cgi?id=402154#c12

It seems that BTRFS with multiple subvols gives "varying" device numbers, ones that are not stable from reboot to reboot.

Baloo relies on a combination of the device number and the inode to provide an "id" of the file, and expects a one-to-one relation between the "id" and the filename. If you are using BTRFS and multiple subvols, this breaks down. This catches OpenSuse / Tumbleweed :-(
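To illustrate why an unstable device number leads to repeated entries, here is a minimal Python sketch. It is not Baloo's code: the `file_id` helper, the `dev` override and the dictionary standing in for the index are all hypothetical, and how Baloo actually packs the two numbers into its "id" is not shown here. The sketch only assumes what is described above, namely that the id is derived from the device number and the inode.

```python
import os
import tempfile


def file_id(path, dev=None):
    """Return a (device, inode) pair for path.

    The optional dev argument is a hypothetical override used only to
    simulate the device number changing between reboots.
    """
    st = os.stat(path)
    return (st.st_dev if dev is None else dev, st.st_ino)


def demo():
    """Index one file three times, each time under a different device number."""
    index = {}  # stand-in for the index: id -> filename
    with tempfile.NamedTemporaryFile() as f:
        real_dev = os.stat(f.name).st_dev
        # "Boot" 1: the file is indexed under its real device number.
        index[file_id(f.name)] = f.name
        # "Boots" 2 and 3: same file, same inode, but an assumed
        # different device number, so the id no longer matches.
        index[file_id(f.name, dev=real_dev + 1)] = f.name
        index[file_id(f.name, dev=real_dev + 2)] = f.name
    return index


if __name__ == "__main__":
    entries = demo()
    print(len(entries), "index entries for", len(set(entries.values())), "file")
```

Under these assumptions, the same file ends up with one entry per device number it has been seen under, which would match search results being repeated.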

Marking this as Confirmed for now; I will probably flag it as a duplicate of Bug 402154 in due course.
Comment 2 tagwerk19 2022-11-20 23:01:21 UTC
Also, there is an earlier one: Bug 401863
Comment 3 tagwerk19 2022-11-20 23:02:16 UTC

*** This bug has been marked as a duplicate of bug 401863 ***
Comment 4 Rainer Sabelka 2022-11-21 09:37:04 UTC
I don't have btrfs, I'm on ext4.
But I had to copy my home directory once to another device, due to a failing disk. This would explain duplicate entries, but not the same file occurring three or four times.
However, I'm on openSUSE Tumbleweed, with new versions of the kernel, systemd, etc. every few days, so maybe devices are getting enumerated differently from time to time, and device IDs might not be as stable as one might assume, even for non-btrfs volumes.
So I think this is indeed a duplicate!
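
One way to check the guess about unstable device IDs would be to note the device number of the home directory after each reboot and compare. A minimal sketch in Python, using only the standard `os` module; the `device_of` helper is a hypothetical name, and whether the numbers actually change on a given system is exactly what this would have to show:

```python
import os


def device_of(path):
    """Return the (major, minor) device number of the filesystem holding path."""
    st = os.stat(path)
    return os.major(st.st_dev), os.minor(st.st_dev)


if __name__ == "__main__":
    home = os.path.expanduser("~")
    major, minor = device_of(home)
    # Run this after each reboot or kernel update and compare the values.
    print(f"{home}: device {major}:{minor}")
```

If the printed pair differs between boots, files that were already indexed would presumably be seen under a new "id" and indexed again.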