Bug 447119 - Summary: Dolphin/Baloo search with symlinks
Status: REPORTED
Alias: None
Product: dolphin
Classification: Applications
Component: search
Version: unspecified
Platform: Other Linux
Importance: HI normal
Target Milestone: ---
Assignee: Dolphin Bug Assignee
URL:
Keywords:
Duplicates: 333678 424871 435383 439438 442786 446715 447896 459572
Depends on:
Blocks:
 
Reported: 2021-12-17 10:37 UTC by tagwerk19
Modified: 2024-03-01 11:27 UTC
CC List: 19 users

See Also:
Latest Commit:
Version Fixed In:


Description tagwerk19 2021-12-17 10:37:23 UTC
SUMMARY:

    A review of Dolphin/Baloo search issues with symlinks (summarising the different
    issues to allow duplicates to be closed)

BASE ISSUE:

    Baloo, when indexing, does not "follow" symbolic links and index the files and
    folders referenced. If you have not explicitly included the target folders
    "to be indexed", then a baloo search will not find the files.

    Baloo assumes there's a one-to-one mapping between the filename and the index's
    internal ID and can trip up if this is not the case.

This issue manifests itself in several ways - and there are three/four variables in play, two with the indexing:

    Where you've created the symlink (in a folder indexed by baloo or not?)

    Where the symlink is pointing (is the real/target folder being indexed by
    baloo or not?)

and then with the way you are searching:

    If you are searching "From Here" in Dolphin, does Dolphin think that "Here" is
    indexed by baloo or not?

    If searching "Your Files" (or "Everywhere") in Dolphin rather than under a
    particular folder with "From Here".

This means rather many test cases but fortunately not so many different real behaviours 8-]

    Note that Dolphin reads the list of folders indexed by baloo and queries baloo
    when it thinks baloo knows. If Dolphin thinks that baloo has not indexed the
    "needed" folders, it will do its own "there and then" search. (The processes here
    are baloosearch and filenamesearch)

    This is a rabbit hole all of its own, see the summary:
        https://bugs.kde.org/show_bug.cgi?id=424871#c4

OBSERVED RESULTS:

Case 1...

    Dolphin asks baloo for search results, the folder holding the symlink and the
    target folder are indexed.

    As an example, baloo is indexing your home directory and you've created a
    symlink in ~/Desktop to ~/Documents

    In this case the command line baloosearch and Dolphin's Ctrl-F search will
    return hits - and the files will be given with their canonical names (the
    real/target folders).

    In the example, the hits will be files under ~/Documents. All is good.

... 1a

    One thing to watch for: if, in Dolphin, you follow the symlink to get to
    ~/Desktop/Documents and search "From Here", you will not get any hits.
    Baloo has indexed ~/Documents and Dolphin is querying for results under
    ~/Desktop/Documents. Worse, Dolphin does not help you distinguish between
    the two cases: both show searching "From Here (Documents)".

    Bug 333678, Bug 434610 (maybe), Bug 435383 and Bug 442786 are instances of this...

    Bug 442786 shows just how confusing this can be: if baloo is enabled you will
    not get any hits searching "From Here" whereas if baloo is disabled, Dolphin will
    do its own filenamesearch and you *will* get hits.
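
    The mismatch in Case 1a can be illustrated with a small sketch. It is a toy
    model only, not Baloo's or Dolphin's code: the "index" is assumed to be a
    plain set of canonical paths, and search_under() and demo() are hypothetical
    helpers invented for the illustration.

```python
import os
import tempfile


def search_under(index, folder, term):
    """Return indexed paths below `folder` whose file name contains `term`."""
    prefix = folder.rstrip(os.sep) + os.sep
    return sorted(
        p for p in index
        if p.startswith(prefix) and term in os.path.basename(p)
    )


def demo():
    with tempfile.TemporaryDirectory() as tmp:
        home = os.path.realpath(tmp)
        documents = os.path.join(home, "Documents")
        desktop = os.path.join(home, "Desktop")
        os.makedirs(documents)
        os.makedirs(desktop)

        target = os.path.join(documents, "report.txt")
        open(target, "w").close()

        # Symlink ~/Desktop/Documents -> ~/Documents
        link = os.path.join(desktop, "Documents")
        os.symlink(documents, link)

        # Assumption: the toy index only ever holds canonical paths.
        index = {target}

        # Searching "From Here" with the path as typed (through the symlink).
        via_link = search_under(index, link, "report")
        # Searching after resolving the folder to its canonical path first.
        via_canonical = search_under(index, os.path.realpath(link), "report")

        return via_link, [os.path.relpath(p, home) for p in via_canonical]
```

    In this model the query through the symlink comes back empty, while the
    same query with the resolved folder finds the file under Documents.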

Case 2...

    Dolphin asks baloo for search results, the folder holding the symlink is being
    indexed but the target is *not*.

    As an example, baloo is indexing your home directory, you've created a symlink
    in your home to a separate disk you've mounted as /media/morespace 

    In this case the target folders are not being indexed. Baloosearch and Dolphin's
    Ctrl-F search won't return anything

    This is confusing if you thought baloo followed the links and indexed the target
    directories and, as said, baloo doesn't do that. 
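
    As a rough illustration of Case 2, the check below asks whether the real
    location behind a path lies inside any included folder. It is a sketch
    under stated assumptions, not Baloo's actual logic: is_covered() and demo()
    are hypothetical names, and the include list is assumed to hold canonical
    paths already.

```python
import os
import tempfile


def is_covered(path, included_folders):
    """True if the canonical form of `path` lies in one of `included_folders`.

    Assumption: `included_folders` already contains canonical paths.
    """
    real = os.path.realpath(path)
    for folder in included_folders:
        root = folder.rstrip(os.sep)
        if real == root or real.startswith(root + os.sep):
            return True
    return False


def demo():
    with tempfile.TemporaryDirectory() as tmp:
        base = os.path.realpath(tmp)
        home = os.path.join(base, "home")
        morespace = os.path.join(base, "media", "morespace")
        os.makedirs(home)
        os.makedirs(morespace)

        # Symlink ~/morespace -> /media/morespace
        link = os.path.join(home, "morespace")
        os.symlink(morespace, link)

        # Only the home directory is included: the target is not covered.
        before = is_covered(link, [home])
        # After adding the target folder to the include list it is covered.
        after = is_covered(link, [home, morespace])
        return before, after
```

    The symlink itself sits inside the indexed home directory, yet what it
    points at is only covered once the target folder is included explicitly.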

... 2a

    The solution is to add "/media/morespace" to the list of included folders in
    System Settings > Search (or by adding it to the folders[$e] line in
    .config/baloofilerc).

    When this is done, searches will work and give the "canonical names" as above.
    However, maybe that's not quite what you're expecting: you may want the hits to
    show the symlink and not dereference it to show the target file/folder. (This
    expectation gets complicated if you have more than one symlink...)

    This solution also means that if you are in your Home Directory and search in
    Dolphin "From Here", you won't get hits from your "/media/morespace" folders.

    The alternative is to search "Your Files" ("Everywhere" of old). It's worth remembering
    that the simple command line "baloosearch searchterms" gives you results
    from "Everywhere".

    Bug 439438 and Bug 446715 are instances of this... 

... 2b

    Empirically, it also seems possible to tell baloo to index the symlink. That is, to
    index ~/morespace rather than the target /media/morespace. It seems that querying
    baloo then gives the hits "as if" they were under ~/morespace.

    However, in Bug 435383, it was said "Don't do that":

        https://bugs.kde.org/show_bug.cgi?id=435383#c14

Case 3...

    You've created a symlink in a folder that is not indexed by baloo.

    If you are in a folder not indexed by baloo, Dolphin will drop back
    to its own "there and then" search, as mentioned in:

        https://bugs.kde.org/show_bug.cgi?id=424871#c4

    The challenge is to work out if Dolphin is asking baloo for the search
    results or not. Dolphin gives you a slight clue, if the search box looks like this:

        https://bugsfiles.kde.org/attachment.cgi?id=137169 

    then Dolphin is asking baloo (and you see that you can specify extra search
    criteria) whereas if it looks like this:
      
        https://bugsfiles.kde.org/attachment.cgi?id=137170

    then Dolphin will do its own filenamesearch.

    As an example, by default Fedora does not index your home directory, just
    the ~/Documents, ~/Music, ~/Pictures, ~/Videos folders.

    If you've created a symlink on your ~/Desktop pointing to ~/Documents and:

        You are in ~/Documents and searching "From Here":

             You'll be querying baloo and it will find the hits under ~/Documents and
             you'll see them in the Dolphin search

        You have followed your symlink to ~/Desktop/Documents (which is not indexed)
        and are searching "From Here":

             You'll do a recursive Dolphin filenamesearch and see results "under"
             ~/Desktop/Documents

        You are in your Home folder (also not indexed) and are searching "From Here":

             You'll do a recursive filenamesearch through your entire home directory
             (including following symlinks) and you'll get duplicated results from both
             ~/Documents and ~/Desktop/Documents 
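
    The duplicated hits in the last example can be reproduced with a short
    sketch. It only models a recursive search that follows symlinks and is not
    the code of Dolphin's filenamesearch; filename_search(),
    dedupe_by_canonical_path() and demo() are hypothetical helpers.

```python
import os
import tempfile


def filename_search(top, term, follow_symlinks=True):
    """Recursively collect paths below `top` whose file name contains `term`."""
    hits = []
    for dirpath, _dirnames, filenames in os.walk(top, followlinks=follow_symlinks):
        for name in filenames:
            if term in name:
                hits.append(os.path.join(dirpath, name))
    return sorted(hits)


def dedupe_by_canonical_path(hits):
    """Collapse hits that resolve to the same real file."""
    return sorted({os.path.realpath(h) for h in hits})


def demo():
    with tempfile.TemporaryDirectory() as tmp:
        home = os.path.realpath(tmp)
        os.makedirs(os.path.join(home, "Documents"))
        os.makedirs(os.path.join(home, "Desktop"))
        open(os.path.join(home, "Documents", "report.txt"), "w").close()

        # Symlink ~/Desktop/Documents -> ~/Documents
        os.symlink(
            os.path.join(home, "Documents"),
            os.path.join(home, "Desktop", "Documents"),
        )

        raw = filename_search(home, "report")
        unique = dedupe_by_canonical_path(raw)
        return (
            [os.path.relpath(p, home) for p in raw],
            [os.path.relpath(p, home) for p in unique],
        )
```

    Searching from the home folder reports the file once under Documents and
    once under Desktop/Documents; reducing the hits to canonical paths leaves
    a single entry.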

    This is *difficult*. Bug 436737 is an example of the confusion. 

WISHED FOR RESULTS:

    Baloo should follow symlinks and index target folders (at least those mounted
    in /etc/fstab)

    Baloo/Dolphin searches should give the same result set, independent of whether
    the search "is from" the symlink or the target directory. The full filenames
    returned should probably reflect the "From Here"

    That is - searching from ~/morespace gives results under morespace, similarly
    if searching from your home directory. Searching from /media/morespace gives
    the results under there and similarly searching "Your Files" (or
    "Everywhere") returns results as per their real filename. There's an implication
    here that baloo is clever with symlinks, indexes the "real filenames" but can do
    searches based on the symlink.

    Dolphin filename searches, whether via baloosearch or falling back to filenamesearch,
    should give the same results.
Comment 1 tagwerk19 2021-12-18 08:46:31 UTC
*** Bug 435383 has been marked as a duplicate of this bug. ***
Comment 2 tagwerk19 2021-12-18 09:05:04 UTC
*** Bug 439438 has been marked as a duplicate of this bug. ***
Comment 3 tagwerk19 2021-12-18 09:13:32 UTC
*** Bug 442786 has been marked as a duplicate of this bug. ***
Comment 4 tagwerk19 2021-12-18 09:19:23 UTC
*** Bug 446715 has been marked as a duplicate of this bug. ***
Comment 5 tagwerk19 2021-12-18 09:34:46 UTC
*** Bug 424871 has been marked as a duplicate of this bug. ***
Comment 6 tagwerk19 2021-12-18 09:52:12 UTC
*** Bug 333678 has been marked as a duplicate of this bug. ***
Comment 7 tagwerk19 2022-01-05 13:08:39 UTC
*** Bug 447896 has been marked as a duplicate of this bug. ***
Comment 8 tagwerk19 2022-02-06 15:04:17 UTC
https://old.reddit.com/r/kde/comments/slfacb/search_result_bug_using_a_shortcut_will_mess_with/
 
>   Lets say I have a folder located in my pictures. And I have an shortcut to that
>   folder in my documents. So when I click the short cut, the directory is
>   "documents/custom-shortcut-link/" instead of "pictures/custom-folder-in-pictures"
>   
>   The work around for this at least is when you create a shortcut, is right click
>   an open space in dolphin. Create new -> Link to Location (url) instead of
>   Create new -> Link to File or Directory. This will still get you to your
>   location, but instead it puts you in "pictures/custom-folder-in-pictures"
Nice!

This helps where baloo has indexed the folders, you follow a symlink on the desktop and don't get any results when you search. "Case 1a" in the summary....

The "Link to Location (URL)" works like a change directory and not a symbolic link.

Baloo will not automatically index anything pointed to using "Link to Location", you would need to make sure the destination folder was included in Baloo's "included folders" list (as per Case 2a...)

In the case where Baloo has not indexed the folders, Dolphin does its own "there and then" recursive search and doesn't follow the "Link to Location" references (whereas it does follow symlinks, described in Case 3). There's a difference in behaviour here...
Comment 9 tagwerk19 2022-07-22 11:05:34 UTC
See also:
    The trouble with symbolic links
    https://lwn.net/Articles/899543/
Comment 10 Nathan Colinet 2022-10-04 06:56:00 UTC
*** Bug 459572 has been marked as a duplicate of this bug. ***
Comment 11 Stefan Brüns 2023-04-24 18:55:37 UTC
This is mostly a bug in dolphin, it should pass the canonical path to baloo.

When you use `baloosearch -d ...`, it will return results even when the specified directory path contains symlinks, as it resolves the specified directory to its canonical path.

In case you wonder why canonicalization is the caller's (in this case dolphin's) responsibility: Canonicalization is a potentially blocking file system operation. Dolphin can use KIO to resolve the path without blocking the UI, it may even already have the canonical path at hand.
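
A minimal sketch of the "caller canonicalizes" idea follows. It is not Dolphin code: baloosearch_command() and demo() are hypothetical helpers, the resolution is done synchronously with os.path.realpath (exactly the blocking call a GUI would want to avoid), and the command is only built, not executed. The only thing taken from this report is that baloosearch accepts a directory via "-d".

```python
import os
import tempfile


def baloosearch_command(directory, terms):
    """Build a baloosearch command line using the canonical form of `directory`."""
    canonical = os.path.realpath(directory)  # resolves any symlinks in the path
    return ["baloosearch", "-d", canonical, *terms]


def demo():
    with tempfile.TemporaryDirectory() as tmp:
        base = os.path.realpath(tmp)
        real_dir = os.path.join(base, "testdata")
        os.makedirs(real_dir)

        # A symlink that points back at the real directory.
        link = os.path.join(base, "symlink")
        os.symlink(real_dir, link)

        # The caller passes the symlinked path; the command carries the real one.
        return baloosearch_command(link, ["foo"]), real_dir
```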
Comment 12 Stefan Brüns 2023-04-24 19:03:02 UTC
> ls -l . ; readlink -f /home/stefan/Sources/testdata/symlink_parent/testdata/symlink ; \
>  baloosearch -v ; baloosearch -i -d /home/stefan/Sources/testdata/symlink_parent/testdata/symlink foo
> insgesamt 3960
> -rw-r--r-- 1 stefan users       6 18. Mär 15:44 foo2.txt
> -rw-r--r-- 1 stefan users       6 18. Mär 15:44 foo.txt
> -rw-r--r-- 1 stefan users       6 18. Mär 15:08 hello.txt
> drwxr-xr-x 1 stefan users      36  5. Apr 12:18 Readonly
> lrwxrwxrwx 1 stefan users       1 24. Apr 20:32 symlink -> .
> lrwxrwxrwx 1 stefan users       2 24. Apr 20:36 symlink_parent -> ..
> -rw-r--r-- 1 stefan users 1340684 18. Mär 15:19 test2.wav
> -rw-r--r-- 1 stefan users 1340684 18. Mär 15:22 test3.wav
> -rw-r--r-- 1 stefan users 1340684 18. Mär 15:19 test.wav
> -rw-r--r-- 1 stefan users      12 18. Mär 15:08 world.txt
> /home/stefan/Sources/testdata
> Baloo 5.105.0
> 12ccf1f0000002d /home/stefan/Sources/testdata/foo2.txt
> 12ccf010000002d /home/stefan/Sources/testdata/foo.txt
> Elapsed: 0,125005 msecs

Dolphin returns the correct result only when the path of the current directory is its canonical name.
Comment 13 Bug Janitor Service 2023-04-24 23:32:48 UTC
A possibly relevant merge request was started @ https://invent.kde.org/frameworks/baloo/-/merge_requests/126
Comment 14 Stefan Brüns 2023-04-24 23:33:23 UTC
Git commit c40c6b2594d9b383e11ead53cc56ceaf9d51f62c by Stefan Brüns.
Committed on 24/04/2023 at 19:33.
Pushed by bruns into branch 'master'.

baloosearch: Inform the user when the specified dir is not canonical

The path supplied for any queries must use the canonical form. baloosearch
already uses the canonical form, but does so silently. Inform the user
when the actual used path differs, to aid debugging.

M  +8    -1    src/tools/baloosearch/main.cpp

https://invent.kde.org/frameworks/baloo/commit/c40c6b2594d9b383e11ead53cc56ceaf9d51f62c
Comment 15 Stefan Brüns 2023-04-24 23:34:22 UTC
Git commit d313aa5d0b4122aee26a1a2f7dab6054d7eb5cd1 by Stefan Brüns.
Committed on 24/04/2023 at 19:35.
Pushed by bruns into branch 'kf5'.

baloosearch: Inform the user when the specified dir is not canonical

The path supplied for any queries must use the canonical form. baloosearch
already uses the canonical form, but does so silently. Inform the user
when the actual used path differs, to aid debugging.
(cherry picked from commit c40c6b2594d9b383e11ead53cc56ceaf9d51f62c)

M  +8    -1    src/tools/baloosearch/main.cpp

https://invent.kde.org/frameworks/baloo/commit/d313aa5d0b4122aee26a1a2f7dab6054d7eb5cd1
Comment 16 tagwerk19 2023-04-26 09:38:04 UTC
(In reply to Stefan Brüns from comment #12)
> Dolphin returns the correct result only when the path of the current directory is its canonical name.
Meaning that Dolphin should follow any symlinks before doing a search and (silently) query baloo with the canonical path? Although I'm not sure whether Dolphin should then show the results relative to the symlink or with the canonical path.

It makes sense ...

    ... although it implies that the index *only* ever contains canonical paths.

Would also need to be careful that baloosearch and filenamesearch do the same thing.

Edge cases?

    $ balooshow -x  path-including-symlink
    $ balooctl index path-including-symlink
    $ balooctl clear path-including-symlink

and then evil baloofilerc includes such as

    folders[$e]=$HOME/Desktop/Documents

where you've got a symlink on your Desktop to your Documents folder?

I think the "balooctl index" (and "balooctl clear") does things properly but I suspect that the evil include does not...
Comment 17 Stefan Brüns 2023-04-26 22:30:43 UTC
(In reply to tagwerk19 from comment #16)
> (In reply to Stefan Brüns from comment #12)
> > Dolphin returns the correct result only when the path of the current directory is its canonical name.
> Meaning that Dolphin should follow any symlinks before doing a search and
> (silently) query baloo with the canonical path? Although I'm not sure
> whether Dolphin should then show the results relative to the symlink or with
> the canonical path.

The "Path" column should display the canonical path. After all, that's where the file is actually located. (And in case the symlink started on a different filesystem, also the place where the space is consumed).

This is consistent with how directory sizes are calculated: symlinks are not followed, since otherwise you may count files several times.
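
The double counting can be shown with a small sketch. It is only an illustration of the principle, not how Dolphin or KIO compute sizes; directory_size() and demo() are hypothetical helpers.

```python
import os
import tempfile


def directory_size(top, follow_symlinks=False):
    """Sum the sizes of regular files below `top`."""
    total = 0
    for dirpath, _dirnames, filenames in os.walk(top, followlinks=follow_symlinks):
        for name in filenames:
            path = os.path.join(dirpath, name)
            if not os.path.islink(path):  # skip symlinks to files
                total += os.lstat(path).st_size
    return total


def demo():
    with tempfile.TemporaryDirectory() as tmp:
        home = os.path.realpath(tmp)
        os.makedirs(os.path.join(home, "Documents"))
        os.makedirs(os.path.join(home, "Desktop"))
        with open(os.path.join(home, "Documents", "hello.txt"), "wb") as f:
            f.write(b"hello\n")  # 6 bytes

        # Symlink ~/Desktop/Documents -> ~/Documents
        os.symlink(
            os.path.join(home, "Documents"),
            os.path.join(home, "Desktop", "Documents"),
        )

        not_followed = directory_size(home, follow_symlinks=False)
        followed = directory_size(home, follow_symlinks=True)
        return not_followed, followed
```

With the symlink followed, the single 6-byte file is reached by two routes and counted twice.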

> It makes sense ...
> 
>     ... although it implies that the index *only* ever contains canonical
> paths.
> 
> Would also need to be careful that baloosearch and filenamesearch do the
> same thing.
> 
> Edge cases?
> 
>     $ balooshow -x  path-including-symlink

balooshow already handles this correctly, the canonical path is shown in square brackets (if it differs from the specified path):

> balooshow -x ~/Sources/testdata/symlink_parent/testdata/hello.txt 
> 12ccd4a0000002d 45 19713354 /home/stefan/Sources/testdata/symlink_parent/testdata/hello.txt [/home/stefan/Sources/testdata/hello.txt]

>     $ balooctl index path-including-symlink
>     $ balooctl clear path-including-symlink
> 
> and then evil baloofilerc includes such as
> 
>     folders[$e]=$HOME/Desktop/Documents
> 
> where you've got a symlink on your Desktop to your Documents folder?
> 
> I think the "balooctl index" (and "balooctl clear") does things properly but
> I suspect that the evil include does not...

In case it is not, this should be trivial to fix.

Also, if you break it, you may keep both parts.
Comment 18 cybea 2023-09-06 17:56:17 UTC
It would be really great if this could be fixed (especially case 1a), since baloo search feels broken if there are files you know of but nothing is displayed. Doing a "dolphin filenamesearch" as a fallback could be considered as well.
Maybe someone can increase the importance of this bug?

(To me this looks like many bugs and having only one bug report could make tracking and prioritizing harder. But I appreciate having this overview.)
Comment 19 tagwerk19 2023-09-08 06:30:21 UTC
(In reply to cybea from comment #18)
> ... To me this looks like many bugs and having only one bug report could make
> tracking and prioritizing harder. But I appreciate having this overview ...
I can see that. What was happening, though, was that we had been drowning under a wave of overlapping reports.

If there's a new report, we have the choice of flagging it as a duplicate or leaving it and cross-referencing with a "see also".

There is, I think, the possibility of creating a  "tracking bug" that gets resolved when all the component issues are resolved. I've no experience of these and am not sure whether it would be appropriate here...