Bug 474973 - baloosearch (as well as Dolphin and other apps using baloo) do not find anything when searching files inside the Downloads directory
Status: REPORTED
Alias: None
Product: frameworks-baloo
Classification: Frameworks and Libraries
Component: general
Version: 5.110.0
Platform: Fedora RPMs Linux
Importance: NOR normal
Target Milestone: ---
Assignee: baloo-bugs-null
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2023-09-28 08:44 UTC by Marco
Modified: 2024-07-07 15:56 UTC
CC List: 7 users

See Also:
Latest Commit:
Version Fixed In:
Sentry Crash Report:


Attachments
Results from attempts to search for a specific file using baloosearch, and outputs of balooinfo for that file and for two others that can be found normally (2.72 KB, text/plain)
2023-10-07 19:27 UTC, John Kizer
Double HOME entries (156.33 KB, image/png)
2023-10-13 07:12 UTC, Marco
Baloo example "index" (132.00 KB, application/octet-stream)
2023-11-14 21:36 UTC, tagwerk19

Description Marco 2023-09-28 08:44:44 UTC
SUMMARY
When I run the following in a terminal, e.g.:
baloosearch --directory /home/<myusername> <keyword>
I do not get any results related to files in the Downloads directory. Even when I specify the full path /home/<user>/Downloads, I get the same result.

However, if I run:
baloosearch <keyword>
I am able to find the files matching the keyword in my Downloads directory.

The same happens with Dolphin.

Note: Before you ask, yes: baloo is running, and the home directory is being indexed


STEPS TO REPRODUCE
1.  Open a terminal
2. Run baloosearch --directory /home/<username> <keyword> with a keyword matching some file name in your Downloads directory

OBSERVED RESULT
No results matching the keyword for files in the Downloads directory

EXPECTED RESULT
Getting the list of files in the Downloads directory matching the keyword

SOFTWARE/OS VERSIONS
Linux/KDE Plasma: Fedora 38
KDE Plasma Version: 5.27.8
KDE Frameworks Version: 5.110.0
Qt Version: 5.15.10

ADDITIONAL INFORMATION
This issue was present also in older versions of Plasma.
Comment 1 tagwerk19 2023-10-02 07:31:40 UTC
(In reply to Marco from comment #0)
> Note: Before you ask, yes: baloo is running, and the home directory is being indexed
:-)

I'll deal with this one first just in case anyone else has followed a path to this bug report.

By default Fedora indexes a subset of directories, ~/Documents, ~/Pictures, ~/Music and ~/Videos (with Baloo). That is, it doesn't index your home directory by default. If you want to search for anything with Dolphin, starting "from home", Dolphin will use its own "there and then" search, find its way down through the subdirectories (including ~/Downloads) and give you any matches. The command line baloosearch won't find any matches as baloo hasn't indexed ~/Downloads.

There's a description of the differences between Dolphin's own search and baloo here: Bug 463830

The above does not apply in your case as you've set Fedora to index everything.

A second possibility is that you have a symlink to your ~/Downloads from somewhere under your home folder, perhaps in your ~/Desktop folder (this is how it caught me). It could also be that your Downloads folder is on another disc. Symlinks can get messy and baloo generally avoids following them when it's doing its indexing.

What is tricky to work out without additional information is why:
    baloosearch <keyword>
works and 
    baloosearch --directory /home/<myusername> <keyword>
doesn't. For one of the files that the first finds and the second does not:
    balooshow -x <thecuriousfile>
Have a look for your filename and a different filename within [ square brackets ]

There's a description of just how tangled symlinks can get here: Bug 447119

... I'm wondering about case "2a" as this seems to best match your symptoms.
Comment 2 John Kizer 2023-10-07 19:27:41 UTC
Created attachment 162150
Results from attempts to search for a specific file using baloosearch, and outputs of balooinfo for that file and for two others that can be found normally
Comment 3 John Kizer 2023-10-07 19:28:15 UTC
(In reply to tagwerk19 from comment #1)
> (In reply to Marco from comment #0)
> > Note: Before you ask, yes: baloo is running, and the home directory is being indexed
> :-)
> 
> I'll deal with this one first just in case anyone else has followed a path
> to this bug report.
> 
> By default Fedora indexes a subset of directories, ~/Documents, ~/Pictures,
> ~/Music and ~/Videos (with Baloo). That is, it doesn't index your home
> directory by default. If you want to search for anything with Dolphin,
> starting "from home", Dolphin will use it's own "there and then" search,
> find it's way down through the subdirectories (including ~/Downloads) and
> give you any matches. The command line baloosearch won't find any matches as
> baloo hasn't indexed ~/Downloads.
> 
> There's a description of the differences between Dolphin's own search and
> baloo here: Bug 463830
> 
> The above does not apply in your case as you've set Fedora to index
> everything.
> 
> Second option might be that you have a symlink to your ~/Downloads, from
> somewhere under your home folder, maybe (this is how it caught me), in your
> ~/Desktop folder (could also be that you have your Downloads folder on
> another disc?). Symlinks can get messy and generally baloo avoids following
> them when it's doing its indexing.
> 
> What is tricky to work out without additional information is why:
>     baloosearch <keyword>
> works and 
>     baloosearch --directory /home/<myusername> <keyword>
> doesn't. For one of the files that the first finds and the second does not:
>     balooshow -x <thecuriousfile>
> Have a look for your filename and a different filename within [ square
> brackets ]
> 
> There's a description of just how tangled symlinks can get here: Bug 447119
> 
> ... I'm wondering about case "2a" as this seems to best match your symptoms.

Hi - Does the attachment I just added show what might be helpful in this case?
Comment 4 tagwerk19 2023-10-08 08:38:37 UTC
(In reply to John Kizer from comment #3)
> Hi - Does the attachment I just added show what might be helpful in this case?

Snip...
> $ baloosearch -d /home/johnkizer/Documents/ Erika-9th
> Elapsed: 0.316861 msecs
> 
> $ baloosearch Erika-9th
> /home/johnkizer/Documents/Erika-9th-birthday-planning-2023.ods
> /home/johnkizer/Documents/Erika-9th-birthday-planning-2023.ods
> /home/johnkizer/Documents/Erika-9th-birthday-planning-2023.ods
> /home/johnkizer/Documents/Erika-9th-birthday-planning-2023.ods
> /home/johnkizer/Documents/Erika-9th-birthday-planning-2023.ods
See what
    $ baloosearch -i Erika-9th
gives you...

The "-i" tells baloosearch to include the ID it has for the file in its index. This is based on the minor device number (more or less) and inode. What you are seeing is typical for openSUSE (that uses BTRFS intensively) and, up until recently, Fedora. The problem is that you can get a different minor device number each reboot and baloo will reindex the file each time - my guess is that you'll see each "hit" given with a different ID. 

I've not seen this issue affect "baloosearch -d" filtering, but anything is possible.

There's a patch coming for this with Frameworks 5.111, middle of this month:
    https://bugs.kde.org/show_bug.cgi?id=402154#c62

After the patch it would be sensible to clear and rebuild the index (with a "balooctl purge" and some patience). You don't have to but it would clear out all the old records which is probably good in your case.
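
As a rough way to check the "each hit has a different ID" guess, the output of "baloosearch -i" can be grouped by path. This is only a sketch: the function name count_ids_per_path is made up for illustration, and it assumes every result line has the form "<hex id> <path>" as shown in this report.

```shell
# Count how many distinct index IDs each path has in `baloosearch -i` output.
# Assumption: result lines look like "<hex id> <path>"; anything else
# (for example the trailing "Elapsed: ..." line) is ignored.
count_ids_per_path() {
    awk '$1 ~ /^[0-9a-fA-F]+$/ {
        id = $1
        path = substr($0, index($0, " ") + 1)   # keeps paths with spaces intact
        if (!((id, path) in seen)) { seen[id, path] = 1; n[path]++ }
    }
    END { for (p in n) print n[p], p }' | sort -rn
}
```

Usage would be something like "baloosearch -i Erika-9th | count_ids_per_path"; a count above 1 for a path suggests the file has been indexed more than once under different IDs.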
Comment 5 John Kizer 2023-10-08 21:24:29 UTC
(In reply to tagwerk19 from comment #4)
> See what
>     $ baloosearch -i Erika-9th
> gives you...
> 
> The "-i" tells baloosearch to include the ID it has for the file in its
> index. This is based on the minor device number (more or less) and inode.
> What you are seeing is typical for openSUSE (that uses BTRFS intensively)
> and, up until recently, Fedora. The problem is that you can get a different
> minor device number each reboot and baloo will reindex the file each time -
> my guess is that you'll see each "hit" given with a different ID. 
> 
> I've not seen this issue affect the issue affecting "baloosearch -d"
> filtering, but it anything is possible.
> 
> There's a patch coming for this with Frameworks 5.111, middle of this month:
>     https://bugs.kde.org/show_bug.cgi?id=402154#c62
> 
> After the patch it would be sensible to clear and rebuild the index (with a
> "balooctl purge" and some patience). You don't have to but it would clear
> out all the old records which is probably good in your case.

Thanks, yes that shows as you mentioned:

baloosearch -i Erika-9th
412da300000029 /home/johnkizer/Documents/Erika-9th-birthday-planning-2023.ods
412da30000002a /home/johnkizer/Documents/Erika-9th-birthday-planning-2023.ods
412da30000002b /home/johnkizer/Documents/Erika-9th-birthday-planning-2023.ods
412da30000002d /home/johnkizer/Documents/Erika-9th-birthday-planning-2023.ods
412da30000002f /home/johnkizer/Documents/Erika-9th-birthday-planning-2023.ods
4032ae0000002a /home/johnkizer/Documents/Erika-9th-birthday-planning-2023.ods
4019b80000002a /home/johnkizer/Documents/Erika-9th-birthday-planning-2023.ods
357b0f0000002a /home/johnkizer/Documents/Erika-9th-birthday-planning-2023.ods
357b0f0000002b /home/johnkizer/Documents/Erika-9th-birthday-planning-2023.ods
357b0f0000002c /home/johnkizer/Documents/Erika-9th-birthday-planning-2023.ods
357b0f0000002d /home/johnkizer/Documents/Erika-9th-birthday-planning-2023.ods
357b0f0000002f /home/johnkizer/Documents/Erika-9th-birthday-planning-2023.ods
3578fa0000002c /home/johnkizer/Documents/Erika-9th-birthday-planning-2023.ods
30c9f70000002c /home/johnkizer/Documents/Erika-9th-birthday-planning-2023.ods
Elapsed: 0.299239 msecs

So probably just a "fixed in a future version" item?

Thanks!
Comment 6 tagwerk19 2023-10-09 06:20:43 UTC
(In reply to John Kizer from comment #5)
> ...
> 412da30000002d /home/johnkizer/Documents/Erika-9th-birthday-planning-2023.ods
> 412da30000002f /home/johnkizer/Documents/Erika-9th-birthday-planning-2023.ods
> 4032ae0000002a /home/johnkizer/Documents/Erika-9th-birthday-planning-2023.ods
> 4019b80000002a /home/johnkizer/Documents/Erika-9th-birthday-planning-2023.ods
> 357b0f0000002a /home/johnkizer/Documents/Erika-9th-birthday-planning-2023.ods
> 357b0f0000002b /home/johnkizer/Documents/Erika-9th-birthday-planning-2023.ods
> ...
> 
> So probably just a "fixed in a future version" item?
Probably or Possibly :-)

You can see the different Minor Device numbers that Baloo has seen, the 2a, 2b, 2d, 2f etc. What's interesting is that the inode has also jumped around. What filesystem are you using? A remote mount? Something encrypted?

The original author was on Fedora, that should already have the patch....
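
To make the device/inode reading of those IDs easier to follow, here is a small sketch that splits a hexadecimal ID into two halves. It assumes the low 32 bits hold the device number and the high 32 bits hold the inode, which matches how the IDs above are being read but is not a statement about every Baloo version; split_docid is a made-up helper name.

```shell
# Split a hexadecimal document ID from `baloosearch -i` into two 32-bit halves.
# Assumption: low 32 bits = device number, high 32 bits = inode.
split_docid() {
    local id=$(( 0x$1 ))
    printf 'inode=%d device=%d\n' \
        "$(( (id >> 32) & 0xFFFFFFFF ))" \
        "$(( id & 0xFFFFFFFF ))"
}

# Example: split_docid 412da300000029
```

Running it over a list of IDs for the same file shows at a glance whether only the device half changes or the inode half changes as well.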
Comment 7 Marco 2023-10-09 14:19:27 UTC
I have no symlink to my Downloads directory anywhere in my home, and the files I am trying to search for do not have any square brackets in their names. Simply put, when I ask Dolphin to search over "my files", it is able to find them. If I ask it to search "from here", it does not, which is precisely the same behaviour baloosearch shows from the command line.
Comment 8 John Kizer 2023-10-09 16:28:40 UTC
(In reply to tagwerk19 from comment #6)
> > So probably just a "fixed in a future version" item?
> Probably or Possibly :-)
> 
> You can see the different Minor Device numbers that Baloo has seen, the 2a,
> 2b, 2d, 2f etc. What interesting is that the inode has also jumped around. 
> What filesystem are you using? A remote mount? Something encrypted?
> 
> The original author was on Fedora, that should already have the patch....

My filesystem setup is, I believe, the default one offered by openSUSE Tumbleweed on install:

$ lsblk -f
NAME        FSTYPE FSVER LABEL FSAVAIL FSUSE% MOUNTPOINTS
nvme0n1
├─nvme0n1p1 vfat   FAT32        495.8M     3% /boot/efi
├─nvme0n1p2 btrfs               181.9G    62% /var
│                                             /srv
│                                             /usr/local
│                                             /root
│                                             /opt
│                                             /home
│                                             /boot/grub2/x86_64-efi
│                                             /boot/grub2/i386-pc
│                                             /.snapshots
│                                             /
└─nvme0n1p3 swap   1                          [SWAP]

It looked from that comment that the fix went into 5.111, which I don't think is in either Fedora or openSUSE yet? (at least based on https://packages.fedoraproject.org/pkgs/kf5-baloo/kf5-baloo/ and https://build.opensuse.org/package/show/openSUSE:Factory/baloo5)
Comment 9 tagwerk19 2023-10-09 16:51:12 UTC
(In reply to Marco from comment #7)
> Simply, when I ask dophin to search over "my files", it is able to find
> them. If ask it to search "from here" it does not, precisely the same
> behaviour baloosearch has from the command line.
Can you run a:
    baloosearch -i xxx
where xxx is a file in your Downloads folder. If I understand it, baloosearch works here...

Repeat with:
    baloosearch -d ~/Downloads -i xxx
where baloosearch fails...

then try a:
    balooshow -x ~/Downloads/xxx
and copy the first lines of output into this bug...

Second question, what happens if you create a new file in your Downloads folder, something really minimal like:
    echo "Hello Penguin" > ~/Downloads/testfile.txt
can baloosearch find a new/simple file?
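
The three checks above can be bundled into one helper so the outputs land together. This is a sketch only: baloo_compare is a made-up name, and BALOOSEARCH / BALOOSHOW are hypothetical override variables that exist purely so the function can be dry-run with stand-in commands. By default it calls the real baloosearch and balooshow with the same flags as above.

```shell
# Run the unfiltered search, the directory-restricted search and the
# index dump for one file name, in that order.
# BALOOSEARCH and BALOOSHOW are hypothetical overrides for dry runs.
baloo_compare() {
    local dir=$1 name=$2
    printf '%s\n' '== unfiltered search =='
    "${BALOOSEARCH:-baloosearch}" -i "$name"
    printf '%s\n' "== search restricted to $dir =="
    "${BALOOSEARCH:-baloosearch}" -d "$dir" -i "$name"
    printf '%s\n' '== index record =='
    "${BALOOSHOW:-balooshow}" -x "$dir/$name"
}

# Example: baloo_compare ~/Downloads testfile.txt
```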
Comment 10 tagwerk19 2023-10-09 16:56:16 UTC
(In reply to John Kizer from comment #8)
> My filesystem setup is, I believe, the default one offered by openSUSE
> Tumbleweed on install:
I have a tumbleweed I keep an eye on, let's see how it behaves when it gets 5.111

> It looked from that comment that the fix went into 5.111, which I don't
> think is in either Fedora or openSUSE yet? 
Fedora seems to have "cherry picked" the change:
    https://invent.kde.org/frameworks/baloo/-/merge_requests/131#note_712705
Nice of them to do some beta testing :-)
Comment 11 Nate Graham 2023-10-11 21:29:48 UTC
*** Bug 414805 has been marked as a duplicate of this bug. ***
Comment 12 Marco 2023-10-12 07:37:21 UTC
(In reply to tagwerk19 from comment #9)
> (In reply to Marco from comment #7)
> > Simply, when I ask dophin to search over "my files", it is able to find
> > them. If ask it to search "from here" it does not, precisely the same
> > behaviour baloosearch has from the command line.
> Can you run a:
>     baloosearch -i xxx
> where xxx is a file in your Downloads folder. If I understand it,
> baloosearch works here...
> 
> Repeat with:
>     baloosearch -d ~/Downloads -i xxx
> where baloosearch fails...
> 
> then try a:
>     balooshow -x ~/Downloads/xxx
> and copy the first lines of output into this bug...
> 
> Second question, what happens if you create a new file in your Downloads
> folder, something really minimal like:
>     echo "Hello Penguin" > ~/Downloads/testfile.txt
> can baloosearch find a new/simple file?

Here is the result of running the balooshow -x command after searching in both ways (which, as you also expected, worked without the -d argument, but did not with it); I searched for the file "clickhouse" inside the Downloads directory (which is translated to Scaricati in Italian).

2ad424f528e2274 1385046644 44909135 /home/marco/Scaricati/clickhouse
        Mtime: 1697027838 2023-10-11T14:37:18
        Ctime: 1697027838 2023-10-11T14:37:18

Internal informations
Filename terms: Fclickhouse 
XAttr terms: 
Simple text terms: 
Property terms: Mapplication Moctet Mstream 

(It seems baloo has been translated into multiple languages, and I was getting output in Italian, so I tried to translate the output of balooshow into English).

I also tried creating a simple file "testfile.txt" in the Downloads directory, but the same issue occurs. If it helps, I have disabled file content indexing to keep the index small and to avoid overloading my system, so the content of the file should not matter.
Comment 13 tagwerk19 2023-10-12 10:11:02 UTC
(In reply to Marco from comment #12)
> 2ad424f528e2274 1385046644 44909135 /home/marco/Scaricati/clickhouse
So:

    balooshow -x ~/Downloads/clickhouse 

gave you:

    2ad424f528e2274 1385046644 44909135 /home/marco/Scaricati/clickhouse

That doesn't make a lot of sense. It sounds like there's too much (or too little) translation going on (*). Does:

    balooshow -x ~/Scaricati/clickhouse

work? and do you get the problem just with the Downloads/Scaricati directory or also on other "well known" directories (Images/Immagini?).

What happens if filtering on the Italian directory name:

    baloosearch -d ~/Scaricati -i clickhouse

Could you have a Downloads directory as well as a Scaricati directory? Sorry, loads of questions :-)

*)  The baloo_file indexer sees the file under ~/Scaricati but balooshow and baloosearch consider that it is under ~/Downloads. Is it actually under ~/Downloads?
Comment 14 Marco 2023-10-12 14:39:17 UTC
(In reply to tagwerk19 from comment #13)
> (In reply to Marco from comment #12)
> > 2ad424f528e2274 1385046644 44909135 /home/marco/Scaricati/clickhouse
> So:
> 
>     balooshow -x ~/Downloads/clickhouse 
> 
> gave you:
> 
>     2ad424f528e2274 1385046644 44909135 /home/marco/Scaricati/clickhouse
> 
> That doesn't make a lot of sense. It sounds like there's too much (or too
> little) translation going on (*). Does:
> 
>     balooshow -x ~/Scaricati/clickhouse
> 
> work? and do you get the problem just with the Downloads/Scaricati directory
> or also on other "well known" directories (Images/Immagini?).
> 
> What happens if filtering on the Italian directory name:
> 
>     baloosearch -d ~/Scaricati -i clickhouse
> 
> Could you have a Downloads directory as well as a Scariati directory? Sorry,
> loads of questions :-)
> 
> *)  The baloo_file indexer sees the file under ~/Scaricati but balooshow and
> baloosearch consider that it is under ~/Downloads. Is it actually under
> ~/Downloads?

No, I think you misunderstood. The command I executed was: balooshow -x ~/Scaricati/clickhouse, that is, on the actual directory I have in my home. What I translated for you is the output message of the command, which was originally given in Italian.

My home does not contain any "Downloads" or "Images" directory. All my directories are just the Italian-named ones, which *are not* symlinks to English-named ones. They are simply set up like this when a new user is created. I guess the mapping is obtained using the freedesktop facilities, such as the command: xdg-user-dir DOWNLOAD.

So, my Scaricati directory is a plain directory like any other in my home. Nothing special about that. The only thing is that if you ask what is the "Download" directory of my current user using:
xdg-user-dir DOWNLOAD
you will get /home/marco/Scaricati as expected.

I tried using the Immagini directory, which is the Italian Pictures directory (indeed, xdg-user-dir PICTURES returns /home/marco/Immagini), to search with baloosearch and I get the same issue: I can find files in the Immagini directory if I search "my files", but nothing shows up if I search "from here".
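
To avoid any confusion between English and localized directory names, the restricted search can take its path straight from xdg-user-dir. A minimal sketch follows; search_in_xdg_dir is a made-up name, and XDG_USER_DIR_CMD / BALOOSEARCH are hypothetical override variables used only for dry runs.

```shell
# Search inside a "well known" directory, whatever its localized name is.
# The first argument is an xdg-user-dir key such as DOWNLOAD or PICTURES.
# XDG_USER_DIR_CMD and BALOOSEARCH are hypothetical overrides for dry runs.
search_in_xdg_dir() {
    local kind=$1 keyword=$2 dir
    dir=$("${XDG_USER_DIR_CMD:-xdg-user-dir}" "$kind") || return 1
    "${BALOOSEARCH:-baloosearch}" -d "$dir" -i "$keyword"
}

# Example: search_in_xdg_dir DOWNLOAD clickhouse
```

On my system this would be equivalent to running baloosearch -d /home/marco/Scaricati -i clickhouse.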
Comment 15 Marco 2023-10-12 14:50:00 UTC
P.S. I never executed the commands on a "Downloads" directory, and nowhere does the output of the commands I executed mention "Downloads". I have never used such a directory, as it does not exist. I have just been calling this directory "Downloads" as we are all speaking English here :).
Comment 16 tagwerk19 2023-10-12 15:19:23 UTC
(In reply to Marco from comment #14)
> No, I think you misunderstood.
Probably good that that's the case :-)

So:

    balooshow -x ~/Scaricati/clickhouse

gave you:

    2ad424f528e2274 1385046644 44909135 /home/marco/Scaricati/clickhouse

And

    baloosearch -i clickhouse

works but

    baloosearch -d ~/Scaricati -i clickhouse

doesn't - and you see the same problem for the other "well known" directories.

It's still weird and I'm not so sure what the next step is - what distribution are you using?
Comment 17 Marco 2023-10-12 15:23:04 UTC
(In reply to tagwerk19 from comment #16)
> (In reply to Marco from comment #14)
> > No, I think you misunderstood.
> Probably good that that's the case :-)
> 
> So:
> 
>     balooshow -x ~/Scaricati/clickhouse
> 
> gave you:
> 
>     2ad424f528e2274 1385046644 44909135 /home/marco/Scaricati/clickhouse
> 
> And
> 
>     baloosearch -i clickhouse
> 
> works but
> 
>     baloosearch -d ~/Scaricati -i clickhouse
> 
> doesn't - and you see the same problem for the other "well known"
> directories.
> 
> It's still wierd and I'm not so sure what the next step is - what
> distribution are you using?

Fedora 38. Plasma 5.27.8, KDE Frameworks 5.110.0, Qt 5.15.10
Comment 18 tagwerk19 2023-10-12 17:21:12 UTC
(In reply to Marco from comment #17)
> Fedora 38. Plasma 5.27.8, KDE Frameworks 5.110.0, Qt 5.15.10
I've set up the same, I think, F38 with the it_IT locale, but don't get to see the same issue...

Maybe you can attach a couple of configs?

    .config/baloofilerc
    .config/user-dirs.dirs

This is really "just in case" they give a clue.
Comment 19 Marco 2023-10-13 07:00:03 UTC
(In reply to tagwerk19 from comment #18)
> (In reply to Marco from comment #17)
> > Fedora 38. Plasma 5.27.8, KDE Frameworks 5.110.0, Qt 5.15.10
> I've set up the same, I think, F38 with the it_IT locale, but don't get to
> see the same issue...
> 
> Maybe you can attach a couple of configs?
> 
>     .config/baloofilerc
>     .config/user-dirs.dirs
> 
> This is really "just in case" they give a clue.

cat .config/baloofilerc:

[General]
dbVersion=2
exclude filters version=5
exclude folders[$d]
folders[$e]=$HOME/,$HOME/
only basic indexing=true

cat .config/user-dirs.dirs:

# This file is written by xdg-user-dirs-update
# If you want to change or add directories, just edit the line you're
# interested in. All local changes will be retained on the next run.
# Format is XDG_xxx_DIR="$HOME/yyy", where yyy is a shell-escaped
# homedir-relative path, or XDG_xxx_DIR="/yyy", where /yyy is an
# absolute path. No other format is supported.
# 
XDG_DESKTOP_DIR="$HOME/Scrivania"
XDG_DOWNLOAD_DIR="$HOME/Scaricati"
XDG_TEMPLATES_DIR="$HOME/"
XDG_PUBLICSHARE_DIR="$HOME/"
XDG_DOCUMENTS_DIR="$HOME/Documenti"
XDG_MUSIC_DIR="$HOME/Musica"
XDG_PICTURES_DIR="$HOME/Immagini"
XDG_VIDEOS_DIR="$HOME/Video"
Comment 20 Marco 2023-10-13 07:05:13 UTC
Might this be related: https://bugs.kde.org/attachment.cgi?id=152991&action=edit ?

My configuration looks similar.
Comment 21 Marco 2023-10-13 07:12:31 UTC
Created attachment 162267
Double HOME entries
Comment 22 Marco 2023-10-13 07:12:56 UTC
UPDATE: It seemed that the baloofilerc config file had two occurrences of $HOME in the included folders. I tried to remove one of the entries, purged the baloo index, and restarted indexing. It seems now Dolphin (and baloosearch as well) is able to find files in the Download directory starting "from here".

I think this messed-up config file has to do with the fact that Fedora does not index the home directory by default and, for some reason, when adding the home folder explicitly (at least via the GUI), it always gives me two entries for the home directory: one is marked as being indexed, and the other is not.

I think this is still some kind of bug on the KDE side, because if I open the GUI now, although my config file has only one entry for the HOME folder, it still shows two entries, as you can see in the new image I attached to this thread: https://bugs.kde.org/attachment.cgi?id=162267
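
For anyone who wants to check their own configuration for the same duplication, here is a small sketch. It assumes the included folders are stored on a single "folders[$e]=" line, comma separated, as in the baloofilerc I posted above; duplicate_index_folders is a made-up name and the default path is an assumption.

```shell
# Print every folder that appears more than once in the "folders[$e]=" entry
# of a baloofilerc file. Assumed default location: ~/.config/baloofilerc.
duplicate_index_folders() {
    local rc=${1:-$HOME/.config/baloofilerc}
    sed -n 's/^folders\[\$e]=//p' "$rc" | tr ',' '\n' | sort | uniq -d
}

# Example: duplicate_index_folders ~/.config/baloofilerc
```

With the file I posted, this would print $HOME/ once, meaning that entry is duplicated. In my case, removing the extra entry, purging the index and letting baloo reindex made the restricted search work again.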
Comment 23 tagwerk19 2023-10-13 16:05:29 UTC
(In reply to Marco from comment #22)
> UPDATE: It seemed that the baloofilerc config file had two occurrences of
> $HOME in the included folders. I tried to remove one of the entries, purged
> the baloo index, and restarted indexing. It seems now Dolphin (and
> baloosearch as well) is able to find files in the Download directory
> starting "from here".
That is good news...

I know that in Fedora you can "get stuck" with $HOME being both included and excluded. Seems that the issue has been kicked around in Bug 429910 and Bug 439403. I remember a comment elsewhere that Fedora did something extra to support its default of indexing Documents, Pictures, Music and Videos. Alas I cannot find that comment.

Alas though I cannot reproduce the issue in my installed-for-the-test Fedora.

Can we say this is now OK to close?
Comment 24 Marco 2023-10-16 06:27:47 UTC
(In reply to tagwerk19 from comment #23)
> (In reply to Marco from comment #22)
> > UPDATE: It seemed that the baloofilerc config file had two occurrences of
> > $HOME in the included folders. I tried to remove one of the entries, purged
> > the baloo index, and restarted indexing. It seems now Dolphin (and
> > baloosearch as well) is able to find files in the Download directory
> > starting "from here".
> That is good news...
> 
> I know that in Fedora you can "get stuck" with $HOME being both included and
> excluded. Seems that the issue has been kicked around in Bug 429910 and Bug
> 439403. I remember a comment elsewhere that Fedora did something extra to
> support it's default of indexing Documents, Pictures, Music and Videos. Alas
> I cannot find that comment.
> 
> Alas though I cannot reproduce the issue in my installed-for-the-test Fedora.
> 
> Can we say this is now OK to close?

To be honest, I do not see where Fedora is interfering here, besides some custom configurations. Now that I changed the configuration manually, I was expecting the GUI to show only one home directory being indexed. Isn't this a bug in KDE? (unless Fedora patches some code)
Comment 25 Marco 2023-10-16 06:33:03 UTC
(In reply to tagwerk19 from comment #23)
> (In reply to Marco from comment #22)
> > UPDATE: It seemed that the baloofilerc config file had two occurrences of
> > $HOME in the included folders. I tried to remove one of the entries, purged
> > the baloo index, and restarted indexing. It seems now Dolphin (and
> > baloosearch as well) is able to find files in the Download directory
> > starting "from here".
> That is good news...
> 
> I know that in Fedora you can "get stuck" with $HOME being both included and
> excluded. Seems that the issue has been kicked around in Bug 429910 and Bug
> 439403. I remember a comment elsewhere that Fedora did something extra to
> support it's default of indexing Documents, Pictures, Music and Videos. Alas
> I cannot find that comment.
> 
> Alas though I cannot reproduce the issue in my installed-for-the-test Fedora.
> 
> Can we say this is now OK to close?

Moreover, in the bug you linked, they use Manjaro, so, it does not seem to be related to Fedora specifically. Maybe we can set my report as a duplicate?
Comment 26 Kristen McWilliam 2023-11-01 19:28:10 UTC
src/tools/baloosearch/main.cpp


Number of arguments: 2
arg[0] = /home/merritt/Development/kde/build/baloo/bin/baloosearch6
arg[1] = avatar.jpg
query.toJSON(): {"searchString":"avatar.jpg"}
query.toSearchUrl(): baloosearch:?json=%7B%22searchString%22:%22avatar.jpg%22%7D
/home/merritt/Downloads/balootest.txt
/home/merritt/Downloads/avatar.jpg
Elapsed: 1.93772 msecs


Number of arguments: 4
arg[0] = /home/merritt/Development/kde/build/baloo/bin/baloosearch6
arg[1] = -d
arg[2] = /home/merritt/Downloads
arg[3] = avatar.jpg
query.toJSON(): {"includeFolder":"/home/merritt/Downloads","searchString":"avatar.jpg"}
query.toSearchUrl(): baloosearch:?json=%7B%22includeFolder%22:%22/home/merritt/Downloads%22,%22searchString%22:%22avatar.jpg%22%7D
Elapsed: 19.443 msecs


*****


I tried to debug the actual implementations, especially in `SearchStore::constructQuery`, `SearchStore::exec`, and `Transaction::postingIterator`; but the code there made my brain melt so all I have is some values extracted from baloosearch.

This does show the error as I am experiencing it: I have an `avatar.jpg` in my ~/Downloads folder, and I created a text file `~/Downloads/balootest.txt` with the content `avatar.jpg` for the purposes of the test. We can see a plain search finds both, and a search restricted to the Downloads directory finds nothing.

Popping the `query.toSearchUrl` into Dolphin reveals the same thing, as expected.
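
For reading those `query.toSearchUrl()` values without squinting at the percent-escapes, a small bash sketch can decode the `json=` part back into the JSON shown by `query.toJSON()`. The function name decode_baloosearch_url is made up, and this relies on bash features (pattern substitution and `printf '%b'` with `\xHH`), so it is not portable to plain sh.

```shell
#!/usr/bin/env bash
# Percent-decode the json= part of a "baloosearch:?json=..." URL.
# Bash-specific: uses ${var//pattern/replacement} and printf '%b' with \xHH.
decode_baloosearch_url() {
    local encoded=${1#*json=}            # drop everything up to and including "json="
    printf '%b\n' "${encoded//%/\\x}"    # turn each %HH into \xHH and let printf expand it
}

# Example:
# decode_baloosearch_url 'baloosearch:?json=%7B%22searchString%22:%22avatar.jpg%22%7D'
```

Comparing the decoded JSON of the two URLs makes it clear that the only difference between the working and failing query is the `includeFolder` key.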
Comment 27 tagwerk19 2023-11-01 22:51:31 UTC
Maybe I'm a step further. I have a system where I see the same behaviour, something I consider a remarkable step forward 8-/

Bad news is it's a "bashed about" Neon Unstable, not something I can reproducibly rebuild. Further bad news is that if I leave the configuration and test files untouched, do a "balooctl6 purge" and reindex, the behaviour goes away.

That would help explain why the issue is so slippery. It also means that when you have the issue, copy your .local/share/baloo/index somewhere safe so you can go back to it.

It's also quite possible that this is one cause of the problem rather than "the" cause, but what I see is:

    $ baloosearch -i test
    144044ed0da2dd /home/test/testfiles/test.txt
    14ba610000fc01 /home/test/testfiles
    14ba61ed0da2dd /home/test/testfiles
    Elapsed: 0.297092 msecs

    $ baloosearch -i -d /home/test/testfiles test
    14ba610000fc01 /home/test/testfiles
    Elapsed: 0.62894 msecs
    
    $ baloosearch -i -d /home/test test
    14ba610000fc01 /home/test/testfiles
    Elapsed: 0.284228 msecs

    $ baloosearch -i -d /home/ test
    14ba610000fc01 /home/test/testfiles
    Elapsed: 0.278637 msecs

    $ balooctl6 purge

    $ baloosearch -i test
    144044ed0da2dd /home/test/testfiles/test.txt
    14ba61ed0da2dd /home/test/testfiles
    Elapsed: 0.410179 msecs

    $ baloosearch -i -d /home/test/testfiles test
    144044ed0da2dd /home/test/testfiles/test.txt
    14ba61ed0da2dd /home/test/testfiles
    Elapsed: 0.307979 msecs

The index has two records for the /home/test/testfiles folder with different DocIDs and the "test.txt" file was created with the new format DocID and it was not found in the search. After the reindex, it was found.
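
To spot this kind of mismatch more quickly, the two outputs can be saved to files and compared. The sketch below is illustrative only: missing_from_filtered is a made-up name, and it assumes both files hold unindented "baloosearch -i" output with lines of the form "<hex id> <path>".

```shell
# List entries that the unfiltered search returned under DIR but that the
# "-d DIR" search did not return.
# Arguments: DIR, file with unfiltered output, file with "-d DIR" output.
missing_from_filtered() {
    local dir=${1%/}
    grep -Fxv -f "$3" "$2" | awk -v dir="$dir" '
        $1 ~ /^[0-9a-f]+$/ {                       # skip "Elapsed:" and blank lines
            path = substr($0, index($0, " ") + 1)
            if (path == dir || index(path, dir "/") == 1) print
        }'
}

# Example:
#   baloosearch -i test > all.txt
#   baloosearch -i -d /home/test/testfiles test > filtered.txt
#   missing_from_filtered /home/test/testfiles all.txt filtered.txt
```

Any line it prints is a record that exists in the index under that directory yet is dropped by the "-d" filter, which is the symptom reported here.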
Comment 28 tagwerk19 2023-11-02 09:05:15 UTC
Installing baloo-checkdb.py
    https://bugs.kde.org/show_bug.cgi?id=472197#c2

Gives
    $ ./baloo-checkdb.py
    Loading DB from /home/test/.local/share/baloo/index...
    Unique documents: 3
    Unique terms: 14
    Checking connectivity of IdTreeDB...
    Checking whether parent[children[docid]] == docid...
    Checking whether docid is in children[parent[docid]]...
    Checking if documents are present in all the databases...
    WARNING: following documents were not found in docterms:
     - 5618512689996509 (/home)
     - 5618508712967169 (/home)
     - 5713487301812957 (/home/test)
     - 5713483324783617 (/home/test)
    WARNING: following documents were not found in docfilenameterms:
     - 5618512689996509 (/home)
     - 5618508712967169 (/home)
     - 5713487301812957 (/home/test)
     - 5713483324783617 (/home/test)
    Checking whether posting[docterms[docid]] contains docid (can take some time)...
    Checking for consistency of PostingDB (can take some time)...
    WARNING: term  points to unknown document 5698695752383489 (/???)
Comment 29 Kristen McWilliam 2023-11-02 15:03:33 UTC
Interesting....


❯ baloosearch -i avatar.jpg
1c73f77ac33fdc9 /home/merritt/Downloads/balootest.txt
1c44735ac33fdc9 /home/merritt/Downloads/avatar.jpg
Elapsed: 1.09867 msecs
❯ baloosearch -i -d /home/merritt/Downloads avatar.jpg
Elapsed: 51.9425 msecs



❯ ./baloo-checkdb.py
Loading DB from /home/merritt/.local/share/baloo/index...
Unique documents: 3
Unique terms: 296550
Checking connectivity of IdTreeDB...
ERROR: 162704729950978818 (/???/pubspec.lock) has dangling parent 162704712771109634 (/???)
ERROR: 162704734245946114 (/???/pubspec.yaml) has dangling parent 162704712771109634 (/???)
ERROR: 162704760015749890 (/???/build) has dangling parent 162704712771109634 (/???)
ERROR: 163843793932583682 (/???/linux) has dangling parent 162704712771109634 (/???)
ERROR: 164981427690078978 (/???/macos) has dangling parent 162704712771109634 (/???)
ERROR: 182410404977246978 (/???/gutter) has dangling parent 182410190228882178 (/???)
ERROR: 182410417862148866 (/???/performance-white.png) has dangling parent 182410190228882178 (/???)
ERROR: 182410422157116162 (/???/template_new_project.png) has dangling parent 182410190228882178 (/???)
ERROR: 182410426452083458 (/???/memory_dashboard@2x.png) has dangling parent 182410190228882178 (/???)
ERROR: 182410430747050754 (/???/template_new_module.png) has dangling parent 182410190228882178 (/???)
ERROR: 182410435042018050 (/???/general) has dangling parent 182410190228882178 (/???)
ERROR: 182410563891036930 (/???/trackwidget-dgrey.png) has dangling parent 182410190228882178 (/???)
ERROR: 182410568186004226 (/???/baselines-white.png) has dangling parent 182410190228882178 (/???)
ERROR: 182410572480971522 (/???/custom) has dangling parent 182410190228882178 (/???)
ERROR: 182410645495415554 (/???/cancel@2x.png) has dangling parent 182410190228882178 (/???)
ERROR: 182410649790382850 (/???/trackwidget-white.png) has dangling parent 182410190228882178 (/???)
ERROR: 182410654085350146 (/???/reload_debug.png) has dangling parent 182410190228882178 (/???)
ERROR: 182410658380317442 (/???/guidelines-dgrey.png) has dangling parent 182410190228882178 (/???)
ERROR: 182410662675284738 (/???/images-dgrey.png) has dangling parent 182410190228882178 (/???)
ERROR: 182410666970252034 (/???/template_new_plugin.png) has dangling parent 182410190228882178 (/???)
ERROR: 182410671265219330 (/???/Icon-192.png) has dangling parent 182410190228882178 (/???)
ERROR: 182410675560186626 (/???/flutter_13.png) has dangling parent 182410190228882178 (/???)
ERROR: 182410679855153922 (/???/slow-white.png) has dangling parent 182410190228882178 (/???)
ERROR: 182410684150121218 (/???/flutter_inspect.png) has dangling parent 182410190228882178 (/???)
ERROR: 182410688445088514 (/???/flutter_64.png) has dangling parent 182410190228882178 (/???)
ERROR: 182410692740055810 (/???/bazel_run@2x.png) has dangling parent 182410190228882178 (/???)
ERROR: 182410697035023106 (/???/images-lgrey.png) has dangling parent 182410190228882178 (/???)
ERROR: 182410701329990402 (/???/widget-select-white.png) has dangling parent 182410190228882178 (/???)
ERROR: 182410705624957698 (/???/flutter@2x.png) has dangling parent 182410190228882178 (/???)
ERROR: 182410709919924994 (/???/cancel.png) has dangling parent 182410190228882178 (/???)
ERROR: 182410714214892290 (/???/flutter_test@2x.png) has dangling parent 182410190228882178 (/???)
ERROR: 182410718509859586 (/???/repaints-white.png) has dangling parent 182410190228882178 (/???)
ERROR: 182410722804826882 (/???/template_new_package.png) has dangling parent 182410190228882178 (/???)
ERROR: 182410727099794178 (/???/perf) has dangling parent 182410190228882178 (/???)
ERROR: 182411010567635714 (/???/actions) has dangling parent 182410190228882178 (/???)
ERROR: 182411027747504898 (/???/refresh-white.png) has dangling parent 182410190228882178 (/???)
ERROR: 182411032042472194 (/???/guidelines-white.png) has dangling parent 182410190228882178 (/???)
ERROR: 182411036337439490 (/???/hot-restart@2x.png) has dangling parent 182410190228882178 (/???)
ERROR: 182411040632406786 (/???/phone.png) has dangling parent 182410190228882178 (/???)
ERROR: 182411044927374082 (/???/flutter.png) has dangling parent 182410190228882178 (/???)
ERROR: 182411049222341378 (/???/flutter_64@2x.png) has dangling parent 182410190228882178 (/???)
ERROR: 182411053517308674 (/???/inspector) has dangling parent 182410190228882178 (/???)
ERROR: 182411384229790466 (/???/feedback.png) has dangling parent 182410190228882178 (/???)
ERROR: 182411388524757762 (/???/widget-select-dgrey.png) has dangling parent 182410190228882178 (/???)
ERROR: 182411392819725058 (/???/observatory_overflow.png) has dangling parent 182410190228882178 (/???)
ERROR: 182411397114692354 (/???/timeline.png) has dangling parent 182410190228882178 (/???)
ERROR: 182411401409659650 (/???/observatory@2x.png) has dangling parent 182410190228882178 (/???)
ERROR: 182411405704626946 (/???/reload_both.png) has dangling parent 182410190228882178 (/???)
ERROR: 182411409999594242 (/???/repaints-lgrey.png) has dangling parent 182410190228882178 (/???)
ERROR: 182411414294561538 (/???/baselines-dgrey.png) has dangling parent 182410190228882178 (/???)
ERROR: 182411418589528834 (/???/reload_run@2x.png) has dangling parent 182410190228882178 (/???)
ERROR: 182411422884496130 (/???/feedback@2x.png) has dangling parent 182410190228882178 (/???)
ERROR: 182411427179463426 (/???/widget-select-lgrey.png) has dangling parent 182410190228882178 (/???)
ERROR: 182411431474430722 (/???/slow-dgrey.png) has dangling parent 182410190228882178 (/???)
ERROR: 182411435769398018 (/???/trackwidget-lgrey.png) has dangling parent 182410190228882178 (/???)
ERROR: 182411440064365314 (/???/guidelines-lgrey.png) has dangling parent 182410190228882178 (/???)
ERROR: 182411444359332610 (/???/bazel_run.png) has dangling parent 182410190228882178 (/???)
ERROR: 182411448654299906 (/???/baselines-lgrey.png) has dangling parent 182410190228882178 (/???)
ERROR: 182411452949267202 (/???/debug_banner@2x.png) has dangling parent 182410190228882178 (/???)
ERROR: 182411457244234498 (/???/reload_run.png) has dangling parent 182410190228882178 (/???)
ERROR: 182411461539201794 (/???/hot-restart-white.png) has dangling parent 182410190228882178 (/???)
ERROR: 182411465834169090 (/???/assets) has dangling parent 182410052789928706 (/???)
ERROR: 182411693467435778 (/???/FontManifest.json) has dangling parent 182410052789928706 (/???)
ERROR: 182411697762403074 (/???/index.html) has dangling parent 182409919645942530 (/???)
ERROR: 182411702057370370 (/???/dartdoc) has dangling parent 182409915350975234 (/???)
ERROR: 212232982843425538 (/???/Compose) has dangling parent 212232978548458242 (/???)
ERROR: 213925066814063362 (/???/windows) has dangling parent 162704712771109634 (/???)
Checking whether parent[children[docid]] == docid...
Traceback (most recent call last):
  File "/home/merritt/Downloads/./baloo-checkdb.py", line 265, in <module>
    db.check()
  File "/home/merritt/Downloads/./baloo-checkdb.py", line 242, in check
    self._check_parent_of_children()
  File "/home/merritt/Downloads/./baloo-checkdb.py", line 166, in _check_parent_of_children
    print(f"ERROR: parent of {self._docrepr(curchildid)} is unknown!")
                                            ^^^^^^^^^^
NameError: name 'curchildid' is not defined. Did you mean: 'curchild'?
Comment 30 tagwerk19 2023-11-14 21:36:08 UTC
Created attachment 163168 [details]
Baloo example "index"

The stripped-down "example" Baloo database file related to Comment 27 and Comment 28 (reconstructed, so create/modified times may differ).
Comment 31 Stefan Brüns 2023-11-16 00:52:57 UTC
(In reply to tagwerk19 from comment #27)
> Maybe I'm a step further. I have a system where I see the same behaviour,
> something I consider a remarkable step forward 8-/
> 
> Bad news is it's a "bashed about" Neon Unstable, not something I can
> reproduceably rebuild. Further bad news is that if I leave the configuration
> and test files untouched, do a "balooctl6 purge" and reindex, the behaviour
> goes away.
> 
> That would help explain why the issue is so slippery. It also means that
> when you have the issue, copy your .local/share/baloo/index somewhere safe
> so you can go back to it.
> 
> It's also quite possible that this is one cause of the problem rather then
> "the" cause, but what I see is:
> 
>    $ baloosearch -i  test
>     144044ed0da2dd /home/test/testfiles/test.txt
>     14ba610000fc01 /home/test/testfiles
>     14ba61ed0da2dd /home/test/testfiles
>     Elapsed: 0.297092 msecs
> 
>   [...]
>
>     $ baloosearch -i test
>     144044ed0da2dd /home/test/testfiles/test.txt
>     14ba61ed0da2dd /home/test/testfiles
>     Elapsed: 0.410179 msecs
> 
>     $ baloosearch -i -d /home/test/testfiles test
>     144044ed0da2dd /home/test/testfiles/test.txt
>     14ba61ed0da2dd /home/test/testfiles
>     Elapsed: 0.307979 msecs
> 
> The index has two records for the /home/test/testfiles folder with different
> DocIDs and the "test.txt" file was created with the new format DocID and it
> was not found in the search. After the reindex, it was found.

("WARNING: term  points to unknown document 5698695752383489 (/???)" - that is 143eef0000fc01 in hex, so it points to the same problem)
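The decimal-to-hex conversion above can be checked with a few lines of Python. This is only a sketch: the decimal IDs are copied from the baloo-checkdb.py output in Comment 28, and the labels and the suffix check are illustrative choices, not part of either tool.

```python
# Decimal document IDs as printed by baloo-checkdb.py in Comment 28
checkdb_ids = [
    ("/home", 5618512689996509),
    ("/home", 5618508712967169),
    ("/home/test", 5713487301812957),
    ("/home/test", 5713483324783617),
    ("unknown document", 5698695752383489),
]

for label, docid in checkdb_ids:
    docid_hex = format(docid, "x")  # same notation that baloosearch -i prints
    # IDs whose lower 32 bits are 0000fc01 match the "14ba610000fc01" style record
    style = "0000fc01 suffix" if docid_hex.endswith("0000fc01") else "other suffix"
    print(f"{docid} -> {docid_hex} ({label}, {style})")
```

Each path reported by the checker appears once with each suffix, and the "unknown document" carries the 0000fc01 suffix.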

@tagwerk19 Thanks for investigating this.

I think the conclusion that this is caused by the change to the FSID generation (https://invent.kde.org/frameworks/baloo/-/merge_requests/131) is spot on, and it shows that the MR is missing the required safety measures (although the MR mentioned this problem explicitly).

The change was obviously not tested or analyzed for possible side effects. Unfortunately, my reservations were simply ignored and the change was merged anyway, without prior notice (24 minutes from removing the Draft status, "resolving" all comments and merging it).

After the change, the DB contents are semantically garbage, and the DB has to be purged (you have noticed this before, I know). This should have been done automatically, but nobody implemented it. Nobody announced the breakage to the broader public.
Comment 32 tagwerk19 2023-11-21 12:24:10 UTC
(In reply to Stefan Brüns from comment #31)
> ... the DB contents are semantically garbage ...
Whups...
    ... I replied here https://bugs.kde.org/show_bug.cgi?id=477068#c1
I think the change has highlighted something that was there for a while.

> ... Nobody announced the breakage to the broader public. ...
I did post something to discuss.kde.org:
    https://discuss.kde.org/t/baloo-and-frameworks-5-111/6348/1
Enough? Probably not 8-)
Comment 33 tagwerk19 2024-01-12 17:48:32 UTC
Maybe:
    https://invent.kde.org/frameworks/baloo/-/merge_requests/131#note_848831
This seems not to affect an index rebuilt from scratch (a 'balooctl purge' seems to resolve the issue), but if the index holds both old and new document IDs then 'baloosearch -i -d folder ...' can fail.
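One way to spot an index that holds both old and new document IDs is to look for paths listed under more than one DocID in 'baloosearch -i' output. The following is a minimal sketch, assuming the "DocID path" line format shown earlier in this thread; `duplicate_paths` is a hypothetical helper and the sample text is copied from Comment 27.

```python
from collections import defaultdict


def duplicate_paths(output):
    """Return {path: [docids]} for paths listed under more than one DocID."""
    ids_by_path = defaultdict(set)
    for line in output.splitlines():
        parts = line.strip().split(maxsplit=1)
        # Skip blank lines and the trailing "Elapsed: ..." line
        if len(parts) != 2 or not parts[1].startswith("/"):
            continue
        docid, path = parts
        ids_by_path[path].add(docid)
    return {path: sorted(ids) for path, ids in ids_by_path.items() if len(ids) > 1}


# Sample output copied from Comment 27
sample = """\
144044ed0da2dd /home/test/testfiles/test.txt
14ba610000fc01 /home/test/testfiles
14ba61ed0da2dd /home/test/testfiles
Elapsed: 0.297092 msecs
"""

print(duplicate_paths(sample))
```

An empty result does not prove the index is clean, since a search only lists matching entries. A non-empty result is a hint that copying the index somewhere safe and then purging may be worth trying.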