Bug 435383

Summary: baloosearch in dolphin doesn't work in a symlinked directory
Product: [Frameworks and Libraries] frameworks-baloo Reporter: Sadi <sadiyumusak>
Component: general Assignee: Stefan Brüns <stefan.bruens>
Status: RESOLVED DUPLICATE    
Severity: minor CC: baloo-bugs-null, jan.rathmann, nate, nivaca, tagwerk19
Priority: NOR    
Version: unspecified   
Target Milestone: ---   
Platform: Other   
OS: Linux   
See Also: https://bugs.kde.org/show_bug.cgi?id=442786
https://bugs.kde.org/show_bug.cgi?id=446715
https://bugs.kde.org/show_bug.cgi?id=447119
Latest Commit: Version Fixed In:
Sentry Crash Report:
Attachments: dolphin search failure in symlinked folders

Description Sadi 2021-04-05 13:02:28 UTC
SUMMARY
When I want to perform a search in a symlinked directory, e.g. "~/.local/share/bin", which points to "~/.local/bin", I can't get any results.

STEPS TO REPRODUCE
1. Create a symlink pointing to a directory containing files 
2. Start a search in that symlinked directory for a pattern that must have a match
3. Repeat the search in the real directory

OBSERVED RESULT
Baloo shows no matches in the symlinked directory, although it does show them in the real directory.

EXPECTED RESULT
Baloo should be able to see that this is a symlinked directory, and look for the match pattern in the real directory it points to.

SOFTWARE/OS VERSIONS
Operating System: KDE neon 5.21
KDE Plasma Version: 5.21.3
KDE Frameworks Version: 5.80.0
Qt Version: 5.15.2
Comment 1 Jan Rathmann 2021-04-13 10:44:18 UTC
I'm seeing the same in Kubuntu 21.04 (Plasma 5.21.4, Dolphin 20.12.3). I'm not sure if this is more a problem of Baloo or a problem of Dolphin.

This is a regression compared with Kubuntu 20.10 (Plasma 5.19, Dolphin 20.08.2).

My personal use case for this is that there are a couple of symlinked dirs in my home folder pointing to the mount of an additional bigger HDD (e.g. ~/photos --> /mnt/big-hdd/photos) where doing a file search in Dolphin now gives no results.

For me, the following is a helpful workaround to make the search for those symlinked dirs work again:
1. In Systemsettings -> Search, explicitly include all directory symlinks to be indexed.
2. Run 'balooctl purge' afterwards.
(I haven't managed to get Baloo to actually include the content of the symlinked dirs without purging the index, for reasons unknown. Tried 'balooctl check' and 'balooctl index SYMLINK', this didn't work.)
Comment 2 tagwerk19 2021-04-13 14:59:01 UTC
Always difficult to write "I cannot get it to do that" :-/

One thing to check...

There was a post in Bug 433204 flagging that baloo can hit a "max_user_watches" limit that can be "ridiculously low"...

It's worth checking with
    sysctl fs.inotify.max_user_watches
and watch out if you have more "folders" than the given "max_user_watches"

You can increase the limit with:
    https://bugs.kde.org/show_bug.cgi?id=433204#c12
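
A rough way to make that comparison is sketched below. The helper names count_dirs, watch_limit and check_watches are made up for illustration, and the sketch assumes (as the remark above suggests) that each folder needs one watch.

```shell
#!/bin/sh
# Count the directories below a starting point.
# Assumption: each directory to be watched costs one inotify watch.
count_dirs() {
    find "$1" -type d 2>/dev/null | wc -l | tr -d ' '
}

# Read the current limit; this is the same value that
# "sysctl fs.inotify.max_user_watches" reports on Linux.
watch_limit() {
    cat /proc/sys/fs/inotify/max_user_watches
}

# Compare the directory count below $1 with the limit.
check_watches() {
    dirs=$(count_dirs "$1")
    limit=$(watch_limit)
    if [ "$dirs" -gt "$limit" ]; then
        echo "$1: $dirs directories exceed the limit of $limit watches"
    else
        echo "$1: $dirs directories, limit is $limit watches"
    fi
}
```

Calling check_watches "$HOME" prints one line for the home directory. A commonly used way to raise the limit until the next reboot is "sudo sysctl fs.inotify.max_user_watches=<new value>"; that is general sysctl usage and not necessarily what the linked comment recommends.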
Comment 3 Jan Rathmann 2021-04-13 15:49:06 UTC
Yeah I already ran into the problem with Baloo hitting the "max_user_watches" limit in the past, and have increased that limit since then.
But that doesn't seem to be the cause for the problems mentioned in this report, at least on my system, since I can't find any suspicious messages in my syslog about Baloo hitting the "max_user_watches" limit.
Comment 4 tagwerk19 2021-04-13 16:07:49 UTC
OK, next thought....

If you are searching with baloo, the dolphin search dialog looks like

    https://bugsfiles.kde.org/attachment.cgi?id=137169

If baloo hasn't indexed that particular folder, the dolphin dialog looks like

    https://bugsfiles.kde.org/attachment.cgi?id=137170

I tried setting up something similar to you and indexed (just) the symlink $HOME/photos.

When I moved to "~/photos" and searched there, I got the "dolphin using baloo" dialog; when I moved to "/mnt/big-hdd/photos" I got the "without baloo" dialog.

The two differ and the "gotcha" is, if you are searching for content, you are reading through each file in turn. Something that is notably slower, I imagine even more so with a HDD, and you might easily miss the small "searching" animation at the bottom right of the window.
Comment 5 Sadi 2021-04-13 16:22:36 UTC
(In reply to tagwerk19 from comment #2)
> There was a post in Bug 433204 flagging that baloo can hit a
> "max_user_watches" limit that can be "ridiculously low"...
> It's worth checking with
>     sysctl fs.inotify.max_user_watches
> and watch out if you have more "folders" than the given "max_user_watches"
> You can increase the limit with:
>     https://bugs.kde.org/show_bug.cgi?id=433204#c12

Thanks for this tip.
I have this: fs.inotify.max_user_watches = 1048576
This means over a million folders, which I doubt needs to be raised in my case.
Nevertheless, I'll make sure that baloo is not somehow including my Timeshift backups under five snapshot folders in a partition which hosts my data folders as well (Documents, Pictures, etc.).
Also, I think baloo should notify the user somehow when it has hit such a limit and failed to index all files.
I don't understand why baloo refuses to index files in a symlinked folder - at least when their content is not included.
Comment 6 Jan Rathmann 2021-04-13 17:18:23 UTC
(In reply to tagwerk19 from comment #4)
> ... 
> The two differ and the "gotcha" is, if you are searching for content, you
> are reading through each file in turn. Something that is notably slower, I
> imagine even more so with a HDD, and you might easily miss the small
> "searching" animation at the bottom right of the window.

Ok, I verified that Dolphin is actually using Baloo to search in the symlinked dirs, so it seems that accidentally mixing up "search with Baloo" and "search without Baloo" in Dolphin isn't the problem here. As a side note: if I search in a symlinked dir that is explicitly _not_ included in Baloo's indexing, Dolphin uses the "without Baloo" search and that works as expected.
Comment 7 tagwerk19 2021-04-13 22:35:57 UTC
(In reply to Jan Rathmann from comment #6)
> ... As a side
> note: if I search in a symlinked dir that is explicitly _not_ included in
> Baloo's indexing, Dolphin uses the "without Baloo" search and that works as
> expected ...
Thanks, it is reassuring to have some "corroboratory evidence"!
Comment 8 tagwerk19 2021-04-13 22:42:24 UTC
(In reply to Sadi from comment #5)
> Also, I think baloo should notify the user somehow when it has hit such a
> limit and failed to index all files.
Yes, that's sensible. Particularly as, as far as I can tell, once baloo hits the "inotify limit", it doesn't get told of any changes.

> I don't understand why baloo refuses to index files in a symlinked folder -
> at least when their content is not included.
Some digging is probably required..

I know there was an issue with hidden folders (Bug 431588, sorted in 5.79). It would make sense to check whether the issue appears with "normal" symlinks to "normal" folders - and try out the different variations.

I'm looking with

    Neon Testing
    Plasma: 5.21.4
    Frameworks: 5.81.0
    Qt: 5.15.2 

and symlinks seem to work... or, to say things more carefully, I've not stumbled on a way they break.
Comment 9 tagwerk19 2021-04-14 06:58:47 UTC
If you have not tried "strace", it is magic.

It shows you the system calls a process makes. That means you should be able to see "baloo_file" finding and reading the indexed directories, asking for an inotify watch and being told when inotify sees a new or changed file. The logged data is "somewhat obscure" and some guesses can be required to work out what is happening, but it could say where the indexing process is breaking.

What I've not found is a way of watching the second stage in the indexing, when the file content is extracted (baloo_file makes a list of "everything needing indexing", which can be a very quick process, and then asks "baloo_file_extractor" to look at the content of each file)

When I've used it, I've killed the running copy of baloo_file and then run

    strace -o baloo-strace.log baloo_file

from the command line.
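
To pick the watch requests out of such a log, something like the following might help. It is only a sketch: count_watch_calls and count_failed_watches are made-up helper names, and the patterns assume the usual strace line layout, where a failed call ends in "= -1" followed by the error name. ENOSPC is the error inotify_add_watch returns when the per-user watch limit is reached.

```shell
#!/bin/sh
# Count all inotify_add_watch calls recorded in an strace log ($1).
count_watch_calls() {
    # grep -c exits non-zero when nothing matches; keep the "0" it prints.
    grep -c 'inotify_add_watch(' "$1" || true
}

# Count the inotify_add_watch calls that failed with ENOSPC, i.e. the
# kernel refused the watch because the per-user limit was reached.
count_failed_watches() {
    grep -c 'inotify_add_watch(.*= -1 ENOSPC' "$1" || true
}
```

For example, count_failed_watches baloo-strace.log prints 0 if no watch request was refused. To keep the log small, strace can be limited to that one call with "-e trace=inotify_add_watch".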

Another trick is "balooshow -x <file>", which looks up details of the indexed file in the database. You can try looking up the file both via its actual folder and via the symlinked path. I get details when I try both, and see that baloo "knows of" the symlinked path:

    $ balooshow -x Testdir/testfile.txt
    3ffab0000fc01 64513 262059 Testdir/testfile.txt [/home/test/Symlink/testfile.txt]

    $ balooshow -x Symlink/testfile.txt 
    3ffab0000fc01 64513 262059 Symlink/testfile.txt [/home/test/Symlink/testfile.txt]
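
As a side observation, the three numbers in these listings look related: the first column appears to be the third column (an inode number) written in hex, followed by the second column (a device number) as eight hex digits. The sketch below recomputes that value from stat output. compose_id and file_id are hypothetical helper names, the layout is inferred from the output above rather than taken from Baloo's documentation, and GNU stat is assumed.

```shell
#!/bin/sh
# Combine an inode ($1) and a device number ($2) the way the IDs in the
# listing appear to be built: inode in hex, then device as 8 hex digits.
compose_id() {
    printf '%x%08x\n' "$1" "$2"
}

# Print the composed ID for the file at path $1.
# GNU stat: %i is the inode number, %d the device number (decimal).
# The output is split on purpose into the two arguments of compose_id.
file_id() {
    compose_id $(stat -c '%i %d' "$1")
}
```

With the numbers from the listing, compose_id 262059 64513 prints 3ffab0000fc01. Because the kernel resolves a symlinked directory while walking the path, file_id returns the same value for Testdir/testfile.txt and Symlink/testfile.txt, which would explain why both lookups end up at the same database entry.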
Comment 10 Sadi 2021-04-14 08:27:45 UTC
Created attachment 137582 [details]
dolphin search failure in symlinked folders
Comment 11 Sadi 2021-04-14 08:33:49 UTC
(In reply to Jan Rathmann from comment #6)
> (In reply to tagwerk19 from comment #4)
> note: if I search in a symlinked dir that is explicitly _not_ included in
> Baloo's indexing, Dolphin uses the "without Baloo" search and that works as
> expected.

I didn't know that Dolphin can also do a "without baloo" search!
In my system (with default settings in this respect) I create a symlink of a folder where I get search results as expected, I try to do a search here, and Dolphin still uses baloosearch, returns no results. Search toolbar is exactly the same. I wonder if I'm also missing something, so that I can get Dolphin to do a search without baloo when baloo cannot do it.
Comment 12 Jan Rathmann 2021-04-14 10:56:10 UTC
(In reply to tagwerk19 from comment #8)
> I'm looking with
> 
>     Neon Testing
>     Plasma: 5.21.4
>     Frameworks: 5.81.0
>     Qt: 5.15.2 
> 
> and symlinks seem to work... or, to say things more carefully, I've not
> stumbled on a way they break.

I made a test installation of neon-testing-20210413-1821.iso in VirtualBox and can perfectly reproduce the problem there:

1. mkdir ~/testdir
2. touch ~/testdir/"my important document.txt" ~/testdir/"another document.txt"
3. ln -s ~/testdir ~/symlink
4. Open Dolphin, try search on testdir: works, try search on symlink: no results.

When I run 'balooshow -x' I get the same output for the file in real dir and for the file in the symlink dir:

----------------------

jan@jan-virtualbox:~$ balooshow -x testdir/another\ file.txt
8706b00000801 2049 553067 testdir/another file.txt [/home/jan/testdir/another file.txt]
        Mtime: 1618395977 2021-04-14T12:26:17
        Ctime: 1618395977 2021-04-14T12:26:17

Internal Info
Terms: Mapplication Mx Mzerosize
File Name Terms: Fanother Ffile Ftxt
XAttr Terms:

----------------------

jan@jan-virtualbox:~$ balooshow -x symlink/another\ file.txt
8706b00000801 2049 553067 symlink/another file.txt [/home/jan/testdir/another file.txt]
        Mtime: 1618395977 2021-04-14T12:26:17
        Ctime: 1618395977 2021-04-14T12:26:17

Internal Info
Terms: Mapplication Mx Mzerosize
File Name Terms: Fanother Ffile Ftxt
XAttr Terms:

----------------------

When I run 'baloosearch another', I get only 'testdir/another file.txt' as output, not the symlinked one.
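
The reproduction steps can be scripted roughly as follows. This is a sketch under assumptions: make_symlink_testcase and run_search are made-up helper names, the base directory is passed in instead of hard-coding the home folder, and Baloo may need some time to index the new files before a search finds them.

```shell
#!/bin/sh
# Create the test layout from the steps above below a base directory ($1).
make_symlink_testcase() {
    base=$1
    mkdir -p "$base/testdir"
    # The directory part stays outside any "~" quoting: inside double
    # quotes the shell would not expand a leading ~.
    touch "$base/testdir/my important document.txt" \
          "$base/testdir/another document.txt"
    # Create the link only once, so a second run does not nest a link
    # inside testdir.
    [ -L "$base/symlink" ] || ln -s "$base/testdir" "$base/symlink"
}

# Run the same command-line search as in this comment, if available.
run_search() {
    if command -v baloosearch >/dev/null 2>&1; then
        baloosearch "$1"
    else
        echo "baloosearch not found" >&2
    fi
}
```

For example, make_symlink_testcase "$HOME" followed (after indexing) by run_search another repeats the test on the command line.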
Comment 13 tagwerk19 2021-04-14 15:29:00 UTC
(In reply to Jan Rathmann from comment #12)
> I made a test installation of neon-testing-20210413-1821.iso in VirtualBox
> and can perfectly reproduce the problem there:
Interesting.

Yes, I do see that behaviour.

If I include the symlink explicitly:

    folders[$e]=$HOME/,$HOME/symlink/

in .config/baloofilerc then I get balooshow and baloosearch "preferring" symlink. I should go back to my test system and see if I missed that...

If you are using a new Neon Testing, it would be interesting if you also see Bug 435521...
Comment 14 Stefan Brüns 2021-04-14 15:45:36 UTC
(In reply to tagwerk19 from comment #13)
> (In reply to Jan Rathmann from comment #12)
> > I made a test installation of neon-testing-20210413-1821.iso in VirtualBox
> > and can perfectly reproduce the problem there:
> Interesting.
> 
> Yes, I do see that behaviour.
> 
> If I include the symlink explicitly:
> 
>     folders[$e]=$HOME/,$HOME/symlink/

Don't do that. Baloo currently only supports one name per file (which is one of the reasons it does not follow symlinks).
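
To see whether an existing configuration already contains such an entry, a check along these lines might help. It is only a sketch: list_symlinked_folders is a made-up helper, and it assumes the folders[$e]= line has the comma-separated form quoted above, with $HOME as the only variable that needs expanding.

```shell
#!/bin/sh
# Print every entry of the folders[$e]= line in config file $1 that is
# itself a symbolic link.
list_symlinked_folders() {
    config=$1
    # Take the value after "folders[$e]=" and split it on commas.
    sed -n 's/^folders\[\$e]=//p' "$config" | tr ',' '\n' |
    while IFS= read -r entry; do
        [ -n "$entry" ] || continue
        # Replace a literal leading $HOME with the real home directory.
        case $entry in
            \$HOME*) path=$HOME${entry#\$HOME} ;;
            *) path=$entry ;;
        esac
        # Drop the trailing slash: with it, the -L test would look at
        # the link's target instead of the link itself.
        path=${path%/}
        if [ -L "$path" ]; then
            printf '%s\n' "$path"
        fi
    done
}
```

For example, list_symlinked_folders ~/.config/baloofilerc prints nothing when none of the configured folders is a symlink.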
Comment 15 tagwerk19 2021-04-14 16:18:15 UTC
(In reply to Stefan Brüns from comment #14)
> ... Baloo currently only supports one name per file ...
I can see that makes sense, if you do a search you don't want 'x' hits back for a single file.

Having a "canonical name", as I think balooshow displays, seems a good solution...
Comment 16 tagwerk19 2021-04-14 16:40:59 UTC
(In reply to Sadi from comment #11)
> I didn't know that Dolphin can also do a "without baloo" search!
I think it was added to help people who have disabled baloo...

> ... I try to do a search here, 
> and Dolphin still uses baloosearch, returns no results. Search toolbar is
> exactly the same...
Dolphin, I seem to remember, looks to see if baloo is enabled and what folders it is supposed to be indexing.

From your screenshot, dolphin thinks it should be asking baloo in both cases. It's not getting back any matches for the second.

It may be a question of the HDD being thought of as removable. Baloo doesn't, for example, index plug-in USB drives. I mounted the 'photos' disc through /etc/fstab.
Comment 17 Nicolas Vaughan 2021-09-21 15:51:55 UTC
I can confirm the bug. No proposed solutions have worked.

Operating System: Manjaro Linux
KDE Plasma Version: 5.22.5
KDE Frameworks Version: 5.85.0
Qt Version: 5.15.2
Kernel Version: 5.14.2-1-MANJARO (64-bit)
Graphics Platform: X11
Comment 18 tagwerk19 2021-12-18 08:46:31 UTC
Created a summary bug for the Dolphin/Baloo "symlink troubles"; Bug 447119.

I think the originally reported issue here is covered by Case "1a" in the summary; the issue can manifest in many different ways. If I've missed something, it is probably best to reopen this issue.

*** This bug has been marked as a duplicate of bug 447119 ***