Bug 453291

Summary: baloosearch pattern not found when in middle of word
Product: [Frameworks and Libraries] frameworks-baloo
Reporter: felix
Component: general
Assignee: baloo-bugs-null
Status: RESOLVED DUPLICATE
Severity: normal
CC: tagwerk19
Priority: NOR
Version: 5.93.0
Target Milestone: ---
Platform: Arch Linux
OS: Linux
Latest Commit:
Version Fixed In:
Sentry Crash Report:

Description felix 2022-05-02 11:04:07 UTC
SUMMARY
Hey there,

I'm experiencing a problem with baloosearch, which I first noticed in Dolphin. When searching for a pattern, the pattern is only recognized when it occurs at the beginning of a word, but not in the middle or at the end. https://docs.kde.org/stable5/en/dolphin/dolphin/quick-tips.html states that no wildcards are necessary because they are already assumed. However, in my observed result baloosearch finds neither DefaultTestFile nor DefaulttestFile, both of which contain the pattern test. I would have expected both to be found. Am I doing something wrong? Or is something goofy going on here?

Best wishes

STEPS TO REPRODUCE
1. mkdir ~/test
2. cd ~/test
3. touch testFile TestFile DefaulttestFile DefaultTestFile
4. balooctl index *

5.1 baloosearch -d ~/test test
5.2 baloosearch -d ~/test filename:test

OBSERVED RESULT
Command output:

/home/user/test/testFile
/home/user/test/TestFile
/home/user/test


EXPECTED RESULT
Command output:

/home/user/test/DefaulttestFile
/home/user/test/DefaultTestFile
/home/user/test/testFile
/home/user/test/TestFile
/home/user/test

SOFTWARE/OS VERSIONS
Windows: ---
macOS: ---
Linux/KDE Plasma: 5.17.4-arch1-1/5.24.4-1
(available in About System)
KDE Plasma Version: 5.24.4-1
KDE Frameworks Version: 5.93.0
Qt Version: 5.15.3
Comment 1 tagwerk19 2022-05-03 06:21:26 UTC
(In reply to felix from comment #0)
> ... is something goofy going on here?
I think you'd probably say "yes" here...

As you've found with your baloosearch tests, baloo does not do wildcard searches. If you search for "amp" you will get results for "ample" but not "example" (so, you can think of the search being for "amp*" but not "*amp*").
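
The difference between the two kinds of match can be sketched in a few lines of Python. This is only a toy model of the behaviour described above, not Baloo's code; the function names are made up for the illustration.

```python
# Toy model of the two matching styles; hypothetical helpers, not Baloo's API.

def prefix_match(term, word):
    """Match like "amp*": the word has to start with the search term."""
    return word.lower().startswith(term.lower())


def substring_match(term, word):
    """Match like "*amp*": the search term may appear anywhere in the word."""
    return term.lower() in word.lower()


words = ["ample", "example"]
print([w for w in words if prefix_match("amp", w)])     # ['ample']
print([w for w in words if substring_match("amp", w)])  # ['ample', 'example']
```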

Dolphin uses the Baloo index when it thinks it can, so you'll get the same behaviour in Dolphin. If Dolphin sees that Baloo is disabled or not indexing the folder it's in, it does its *own* search.

The goofy part is that Dolphin's "own search" seems to pick up substrings; effectively an "*amp*" search.

Balancing this, Baloo does recognise word boundaries so if you had done your tests with "default-testfile" files, "baloosearch test" would have found them.
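
As a rough illustration of the word-boundary effect, here is a sketch using the filenames from this report. It assumes that names are split into terms at non-alphanumeric characters and lowercased, and that a query has to match the start of a term; the real tokenizer rules are not spelled out in this thread, so treat `terms` and `found` as hypothetical.

```python
import re


def terms(filename):
    # Assumption: any non-alphanumeric character ends a term, and terms are
    # lowercased. A capital letter inside a name does NOT start a new term.
    return [t.lower() for t in re.split(r"[^0-9A-Za-z]+", filename) if t]


def found(query, filename):
    # A file is a hit when some term starts with the (lowercased) query.
    return any(t.startswith(query.lower()) for t in terms(filename))


names = ["testFile", "TestFile", "DefaulttestFile", "DefaultTestFile",
         "default-testfile"]
for name in names:
    print(name, terms(name), found("test", name))
```

Under these assumptions "DefaultTestFile" is the single term "defaulttestfile", which does not start with "test", while "default-testfile" yields the terms "default" and "testfile" and is found.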

There's a bit of discussion in Bug 452628...
Comment 2 felix 2022-05-04 13:13:43 UTC
I can confirm the behaviour you describe in Dolphin when Baloo is disabled or no index database exists. Speed's decent, however I'd lose the ability to look for keywords in PDFs.

Also, your point about word boundaries is valid. Unfortunately, it does not seem practical to rename every file whose name joins two words together just so that a search for the second part finds it.

The bug report you mentioned does seem to be the same as mine. This raises the question of closing mine since there is already one. I would like to stress, however, the suggestion to implement a *word* wildcard style for baloo as well - for filenames and keywords in text based file formats.
Comment 3 tagwerk19 2022-05-05 05:50:22 UTC
(In reply to felix from comment #2)
> ... This raises the question of closing mine since there is already one. 
Yes, probably makes sense to close this as a duplicate.

> ... I would like to stress,
> however, the suggestion to implement a *word* wildcard style for baloo as
> well - for filenames and keywords in text based file formats.
I agree about how useful a wildcard search would be but, with the way that Baloo is built, it could be difficult. Baloo is designed to be *fast*, pulling results off disc as you type characters into your search - you get a steadily refined set of results as you type. It's not clear to me, with the current architecture, how you could do that with wildcards...
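
One way to picture the difficulty, as a sketch only: assume the index keeps its terms in sorted order (an assumption made for this illustration, not a statement about Baloo's storage). A prefix search can then jump to the first candidate with a binary search and read neighbouring entries, whereas a substring search has to look at every term. The term list and function names below are invented.

```python
from bisect import bisect_left


def prefix_range(sorted_terms, prefix):
    # Binary-search to the first term >= prefix, then read forward while the
    # terms still start with it. Only the matching slice is touched.
    lo = bisect_left(sorted_terms, prefix)
    hi = lo
    while hi < len(sorted_terms) and sorted_terms[hi].startswith(prefix):
        hi += 1
    return sorted_terms[lo:hi]


def substring_scan(sorted_terms, needle):
    # No ordering helps here: every term has to be inspected.
    return [t for t in sorted_terms if needle in t]


index = sorted(["amber", "ample", "amplifier", "apple",
                "example", "sample", "test", "testfile"])

# Each extra character narrows the previous result set.
for typed in ["a", "am", "amp"]:
    print(typed, prefix_range(index, typed))

print(substring_scan(index, "amp"))
```

With this toy index, typing "a", "am", "amp" narrows the hits from four terms to three to two, while the substring scan for "amp" also returns "example" and "sample" at the cost of reading the whole list.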

On the other hand, maybe more powerful magic is possible: plocate is amazingly fast and provides wildcard searches (https://plocate.sesse.net/).

*** This bug has been marked as a duplicate of bug 452628 ***