Bug 480001 - Add normal search as a fallback when searching in folder not indexed by baloo
Status: REPORTED
Alias: None
Product: dolphin
Classification: Applications
Component: search
Version: 23.08.4
Platform: Arch Linux Linux
Importance: NOR wishlist
Target Milestone: ---
Assignee: Dolphin Bug Assignee
Reported: 2024-01-18 16:37 UTC by Tom
Modified: 2024-01-19 10:32 UTC

Description Tom 2024-01-18 16:37:36 UTC
SUMMARY
If you search the internet for grief about Dolphin's search function, you will find a lot of hits for the same issue: https://www.google.com/search?hl=en&q=dolphin%20file%20search%20broken The indexed search with Baloo is not always reliable, leading to empty search results even in simple scenarios. I want to propose a change to remedy this at least in a few situations:
When a search includes folders or files not indexed by baloo (for various reasons: not yet indexed, currently being indexed, excluded from indexing, etc.), search these folders and files with the standard search. Merge the results from both searches and display them to the user. It might be prudent to display a notice or warning to the user if searching will take longer because of this.
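
As a rough illustration of the fallback idea (not how Dolphin is implemented), the sketch below asks the index first and only walks the filesystem when the index returns nothing. `search_with_fallback` is a made-up helper name. It assumes the `baloosearch` command-line tool accepts `-d <directory>` to limit the search and prints one absolute path per result line; on newer Plasma releases the tool may be installed under a different name such as `baloosearch6`. A real implementation would merge both result sets rather than pick one.

```shell
# search_with_fallback DIR QUERY
# Hypothetical helper for illustration only; not part of Dolphin or Baloo.
search_with_fallback() {
    hits=""
    # Ask the Baloo index first, if its command-line client is installed.
    # Assumption: result lines are absolute paths, so anything else
    # (such as a trailing timing line) is filtered out.
    if command -v baloosearch >/dev/null 2>&1; then
        hits=$(baloosearch -d "$1" "$2" 2>/dev/null | grep '^/' || true)
    fi
    if [ -n "$hits" ]; then
        printf '%s\n' "$hits"
    else
        # Nothing from the index: walk the filesystem instead,
        # matching the name case-insensitively. This is the "slow" path.
        find "$1" -mindepth 1 -iname "*$2*" 2>/dev/null
    fi
}
```

For the case described here, the call would be `search_with_fallback ~/Downloads rome`.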

The use case in which I encountered this problem:
1. I downloaded a new folder with a huge amount of pictures
2. I searched for the simple and obvious name of a picture ("rome")
3. The file I was expecting ("rome arrival.jpg") didn't show up
4. When I checked the System Settings->Search window, Baloo was shown as "86% indexed, idle". 
5. I disabled Indexed Search and retried the search in Dolphin. The picture was immediately found with the exact same search query.


P.S.
My apologies if I got something wrong. I'm a first time KDE user and still learning :)
Comment 1 tagwerk19 2024-01-18 18:37:35 UTC
(In reply to Tom from comment #0)
> ... When search includes folders/files not indexed by baloo (For various reason:
> Not yet indexed, currently being indexed, excluded from indexing, etc),
> search these folders and files with the standard search ...
Searching with Dolphin is a labyrinth :-/

The "filenamesearch" and "baloosearch" backends don't pass queries back and forth between them. (Here "filenamesearch" is the code that does the "there and then" search, reading the data from the filesystem and giving you the hits; "baloosearch" queries a database that's built in the background.)

Ideally "filenamesearch" ought to be able to ask what baloo has indexed and query baloo for it (and therefore get results for these folders more quickly, and better results as baloo extracts and indexes the text of the files).

It's interesting to consider your perspective that a baloo query should do a "slow" search through the non-indexed folders.

There's a summary bug, listing the differences in behaviour between the two sorts of search here: Bug 463830

> 1. I downloaded a new folder with a huge amount of pictures
> 2. I searched for the simple and obvious name of a picture ("rome")
> 3. The file I was expecting ("rome arrival.jpg") didn't show up
You don't say what distro you are using, it's possible it makes a difference (particularly when dealing with loads of files)

You might find a "balooctl check" nudges Baloo to pick up the remaining files.

You are doing a search for a filename, that information ought to be indexed *remarkably quickly*. It is the searching through the content, opening each file, extracting the text that it would be useful to index, updating the database, that takes the time. You can use "balooctl monitor" to watch the stream of files being content indexed.

You might find that telling Baloo not to index content (just the filename and xattr metadata) would be enough for you.
Comment 2 Tom 2024-01-18 21:02:01 UTC
I'm using Arch. 
I repeated the same experiment with balooctl monitor running in the background. I re-enabled the indexed search in System Settings. I'm not sure how baloo reacts to a folder being deleted and recreated with the same name/content, so I chose a different directory name. I downloaded ~1500 pictures into this directory, which took about 10 Minutes. balooctl monitor was not showing anything but "indexer is idle" at this point. Then I searched for a picture name again. This time it showed up right away. I'm not sure which search was used for this now, though. So I used baloosearch on the command line. It didn't show any results for this new folder. So I assume it was not indexed. I ran balooctl check, but monitor did not show any news from the indexer. The system settings show the index at 84% as before and the indexer as idle.
Comment 3 tagwerk19 2024-01-18 22:53:23 UTC
(In reply to Tom from comment #2)
> ... I downloaded ~1500 pictures into this directory, which took about 10 Minutes ...
It sounds as if Baloo is not seeing the new files. 1500 pictures in 10 minutes should be absolutely no problem.

This is Arch with an ext4 filesystem, BTRFS or something more exotic?

It might be that the folder you are downloading into is not under your $HOME, possibly a separate partition? disk? remote drive?

You can troubleshoot by running "balooshow -x one-of-your-pictures.jpg", this will show the file metadata and various "embedded" tags. If this gives a result and baloosearch does not find it, then there's something weird.
Comment 4 Tom 2024-01-18 23:28:14 UTC
This command returns "No index information found". balooctl monitor indicates the indexer is running. I noticed that my system settings freeze for a few seconds when I go to the search settings. Maybe that's related? It's still showing progress as 84% and the indexer as idle. (Monitor shows this as well now.)
My filesystem is ext4, and it's all on a rather fast Samsung nvme ssd. The folder is ~/Downloads/subfolder and /home/ is added to indexing in the System Settings. I have also specifically added ~/Downloads for this test.
Comment 5 tagwerk19 2024-01-19 10:32:01 UTC
That's most of the simple questions then :-)

Maybe see what

    systemctl status --user kde-baloo

says, check to see how big the baloo index is

    balooctl status

and then watch the processes with something like htop. You want to see whether baloo_file is being squeezed for CPU when it wants to use too much memory (and constraints set in the unit file are limiting it).

See if you get anything in the journal, you can turn on debugging by creating a qtlogging.ini file:

    mkdir -p ~/.config/QtProject
    vi ~/.config/QtProject/qtlogging.ini

and adding

    [rules]
    kf.baloo=true

You can also ensure that any log or error messages sent to stderr get copied to the journal by editing the kde-baloo unit file:

    systemctl edit --user kde-baloo

and adding the lines:

    [Service]
    StandardError=journal

and saving the override.
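
The same two files can be created without opening an editor. This is only a sketch: `write_if_absent` and `enable_baloo_debug` are hypothetical helper names, and the override path assumes that `systemctl edit --user` stores its drop-in at `systemd/user/kde-baloo.service.d/override.conf` under your config directory, which is the usual location. Existing files are left untouched, so you'd still need to edit those by hand.

```shell
# write_if_absent PATH: write stdin to PATH unless PATH already exists.
# Hypothetical helper; it never overwrites an existing file.
write_if_absent() {
    if [ -e "$1" ]; then
        echo "exists, edit by hand: $1" >&2
        return 1
    fi
    mkdir -p "$(dirname "$1")"
    cat > "$1"
}

# Assumption: the config directory is $XDG_CONFIG_HOME, or ~/.config if unset.
config_home="${XDG_CONFIG_HOME:-$HOME/.config}"

# enable_baloo_debug: create the logging rule and the unit override.
enable_baloo_debug() {
    # Turn on the kf.baloo logging category.
    write_if_absent "$config_home/QtProject/qtlogging.ini" <<'EOF'
[rules]
kf.baloo=true
EOF
    # Send anything baloo writes to stderr into the journal.
    write_if_absent "$config_home/systemd/user/kde-baloo.service.d/override.conf" <<'EOF'
[Service]
StandardError=journal
EOF
}
```

Call `enable_baloo_debug`, then run `systemctl --user daemon-reload` and `systemctl --user restart kde-baloo`. After that, `journalctl --user -u kde-baloo -f` should show the messages as they arrive.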