Created attachment 127049 dolphin search duplicates When searching for a file in the Dolphin search engine I get a ton of duplicates. The 32.png in the attachment, for example, exists in only about 5 folders, but the search shows 34 results. Something is not right here.
What does running `baloosearch 32.png` in a terminal window show you?
It basically shows 2-3 duplicates for each entry.
Moving to Baloo, then. There seems to be an issue with your database.
Steps to reproduce: 1. Enter any directory 2. `touch baloon` 3. `baloosearch baloon` 4. `rm baloon` 5. `baloosearch baloon` Repeat an infinite number of times to create an infinite number of duplicates :)
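To make the duplicates from the steps above easier to spot, here is a small sketch that lists every path appearing more than once in search output. The function name `dup_paths` is made up for illustration, and the filter assumes result lines are absolute paths (starting with `/`) while anything else, such as a timing summary, can be ignored.

```shell
# Hypothetical helper: print each path that occurs more than once on stdin.
# Assumption: result lines are absolute paths; all other lines are skipped.
dup_paths() {
    grep '^/' | sort | uniq -d
}
```

Usage would be something like `baloosearch baloon | dup_paths`; empty output means no path was reported twice.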
I can confirm this bug. Operating System: openSUSE Leap 15.2 KDE Plasma Version: 5.18.5 KDE Frameworks Version: 5.71.0 Qt Version: 5.12.7 Kernel Version: 5.3.18-lp152.50-default OS Type: 64-bit Processors: 4 × Intel® Core™ i3-7100U CPU @ 2.40GHz Memory: 7.2 GiB
*** Bug 429283 has been marked as a duplicate of this bug. ***
Every single file is duplicated twice in Dolphin search and in baloosearch. KDE Frameworks 5.78.0 Qt 5.15.2 (built against 5.15.2)
Do you see this on openSUSE or another distribution with BTRFS and multiple subvolumes? See Bug 402154; specifically, check whether the device number of your home directory changes on reboot. If it does, it seems that baloo reindexes your files and shows multiple hits...
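One way to check this might be the sketch below, which records the device number of a directory and compares it on the next run (for example, after a reboot). It assumes GNU `stat`, where `-c '%d'` prints `st_dev` in decimal. The function names `dev_of` and `check_dev` and the state file location are made up for illustration.

```shell
# Print the device number (st_dev, decimal) for a path. Assumes GNU stat.
dev_of() {
    stat -c '%d' -- "$1"
}

# Hypothetical helper: compare the current device number of a path with the
# value stored in a state file by an earlier run, then store the current one.
check_dev() {
    path=$1
    state=$2
    now=$(dev_of "$path") || return 2
    if [ -f "$state" ]; then
        before=$(cat "$state")
        if [ "$now" = "$before" ]; then
            echo "unchanged: $now"
        else
            echo "changed: $before -> $now"
        fi
    else
        echo "recorded: $now"
    fi
    printf '%s\n' "$now" > "$state"
}
```

Running `check_dev "$HOME" "$HOME/.home-dev-number"` once before and once after a reboot should report `changed:` if the device number is not stable on your setup.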
It's interesting that while Dolphin and Baloo search show duplicate files, Milou, the Plasma search widget, doesn't. Does it deduplicate the search results?
(In reply to David Palacio from comment #9) > It's interesting that while Dolphin and Baloo search show duplicate files, > Milou, the Plasma search widget, doesn't. Does it deduplicate the search > results? I can say that I just get the one hit with krunner/search widget, seems likely some sanitising is happening...
It could be nice to have confirmation by the OP that this problem occurs with btrfs, but it seems highly likely that this is a duplicate of bug 402154. Since the latter is not very findable (misleading title, long discussion), I guess this could stay open. A few comments / considerations: - I can confirm the file search in the menu is not affected by duplicated results, so there should at least be a way to fix the appearance in Dolphin even if duplicates are present in the index; - I am not up to date on how btrfs adoption is evolving in the wild, but with major distros such as openSUSE and Fedora on board the userbase is becoming pretty large. A warning message in the File Search config module about support for btrfs being "experimental" could be welcome, but I am not sure what the KDE policy is about this kind of thing; - would it be possible to add a "sanitization" routine for the file index, dedicated to detecting / eliminating duplicate file entries?
(In reply to Massimiliano L from comment #11) > It could be nice to have confirmation by the OP that this problem occurs > with btrfs, but it seems highly likely that this is a duplicate of bug > 402154. Since the latter is not very findable (misleading title, long > discussion), I guess this could stay open. It's certainly the case that the openSUSE config, BTRFS with multiple subvols, causes this symptom (and still does). However, I know I've also encountered this elsewhere. It happened frequently for me a couple of years back but quite rarely now; that was with Fedora (BTRFS) but also, I suspect, with Neon (ext4).
Assuming BTRFS since everything else fits. *** This bug has been marked as a duplicate of bug 401863 ***
Actually I am on ext4, and I am pretty sure I was on ext4 back then, too. However, I do not use Plasma anymore, so I cannot comment further on the issue. Trying my repro steps from https://bugs.kde.org/show_bug.cgi?id=419302#c4 might indicate whether the bugs are connected :)
In one of those bug reports we already established that this is not just about BTRFS. This is about kernel device major:minor numbers not guaranteed to be stable in various circumstances. We discussed all of this before. I even asked kernel developers. See here: https://bugs.kde.org/show_bug.cgi?id=438434#c14 In there I wrote: Neil Brown clearly said that no userspace component can rely on device numbers since kernel 2.4. Luckily he recommended an alternative: "That is really hard to provide in general. Possibly the best approach is to use the statfs() systemcall to get the "f_fsid" field. This is 64bits. It is not supported uniformly well by all filesystems, but I think it is at least not worse than using the device number. For a lot of older filesystems it is just an encoding of the device number. For btrfs, xfs, ext4 it is much much better." https://lore.kernel.org/linux-block/1769070.0rzTUBzp5V@ananda/T/#m28b8c889c9289ad1ec76cbf040938ea883e3f375
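For anyone who wants to compare the two identifiers on their own system, here is a rough sketch that prints both for a given path. It assumes GNU `stat`: `-c '%d'` prints the device number (`st_dev`) in decimal, and `-f -c '%i'` prints the filesystem ID from `statfs()` in hex. The function name `fs_ids` is made up for illustration. This only shows the values and says nothing about how Baloo does or should store them.

```shell
# Hypothetical helper: print the device number and the filesystem ID of the
# filesystem that holds a path. Assumes GNU stat.
fs_ids() {
    dev=$(stat -c '%d' -- "$1") || return 1
    fsid=$(stat -f -c '%i' -- "$1") || return 1
    printf 'st_dev=%s f_fsid=%s\n' "$dev" "$fsid"
}
```

Running `fs_ids "$HOME"` before and after a reboot should show whether `f_fsid` stays the same in cases where `st_dev` changes.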