Bug 378726 - baloosearch doesn't find certain substrings
Summary: baloosearch doesn't find certain substrings
Status: CONFIRMED
Alias: None
Product: frameworks-baloo
Classification: Frameworks and Libraries
Component: general
Version: 5.32.0
Platform: Arch Linux Linux
Importance: NOR normal
Target Milestone: ---
Assignee: Pinak Ahuja
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2017-04-13 03:36 UTC by Munzir Taha
Modified: 2021-07-12 15:23 UTC

See Also:
Latest Commit:
Version Fixed In:


Description Munzir Taha 2017-04-13 03:36:24 UTC
Though the file is indexed, baloo doesn't return the expected result!

> cat baloo_bug_378690 
hello_baloo
> balooctl status baloo_bug_378690 
File: /home/arch/baloo_bug_378690
Basic Indexing: Done
Content Indexing: Done
> baloosearch hello_baloo
Elapsed: 0.130105 msecs
Comment 1 Nate Graham 2017-10-27 16:42:12 UTC
Works for me using more recent software (KDE Frameworks 5.39.0):

$ cat baloo_bug_378690
hello_baloo

$ balooctl status baloo_bug_378690
File: /home/nate/baloo_bug_378690
Basic Indexing: Done
Content Indexing: Done

$ baloosearch hello_baloo
/home/nate/baloo_bug_378690
Elapsed: 0.256937 msecs

Does this still happen with KDE Frameworks 5.39.0? If so, can you try turning baloo off and then back on again, and see if that wakes it up?
Comment 2 Munzir Taha 2017-10-27 19:49:43 UTC
There is an inconsistency in the results: nategraham_not works but nategraham_no doesn't.

~> pacman -Q baloo
baloo 5.39.0-1

~> cat baloo_378690 
nategraham_notnategraham

~> baloosearch nategraham_not
/home/munzir/baloo_378690
Elapsed: 136.177 msecs

~> baloosearch nategraham_no
Elapsed: 0.471143 msecs
Comment 3 Nate Graham 2017-10-27 20:53:52 UTC
Weird, that works for me, too:

$ baloosearch hello_baloo
/home/nate/baloo_bug_378690
Elapsed: 0.832241 msecs

$ baloosearch hello_balo
/home/nate/baloo_bug_378690
Elapsed: 0.256738 msecs
Comment 4 Munzir Taha 2017-10-28 11:46:00 UTC
Nate, you haven't tried my last string. The original example of hello_baloo, or even hello_balo, does indeed work now. However, the nategraham_notnategraham example doesn't. Please try it and confirm.
Comment 5 Nate Graham 2017-10-28 21:37:19 UTC
Yup, I can confirm:

$ cat baloo_378690
nategraham_notnategraham
$ baloosearch nategraham_not
/home/nate/baloo_378690
Elapsed: 9.63419 msecs
$ baloosearch nategraham_no
Elapsed: 0.405514 msecs

Weird that it doesn't work with this particular string, but does work with another string that's truncated at the end. The much lower execution time suggests that it's not even looking, or something.
Comment 6 Munzir Taha 2017-10-28 23:34:52 UTC
Could it be a developer who believes in the unlucky 13? ;)
Comment 7 Christoph Feck 2017-11-08 19:56:14 UTC
I think there is a minimum of three characters (per word?) that baloo requires for searches.
Comment 8 Munzir Taha 2017-11-09 05:39:15 UTC
@feck:
If baloo is to consider nategraham_notnategraham as two distinct words, then when I do baloosearch nategraham_no, it should at least match the nategraham part, which is longer than 3 characters.
Comment 9 Igor Poboiko 2019-06-30 12:28:07 UTC
When a search query is expanded, terms connected with an underscore (i.e. the query [nategraham_notnategraham]) are expanded into a phrase query (equivalent to the query ["nategraham notnategraham"]), meaning that the two words must match *exactly* and appear in this particular order.

Phrase searches by prefix (when the second term is not written completely, but just its prefix - like ["nategraham notnate"]) are unfortunately not yet implemented.

As a workaround, I suggest replacing the underscore with AND (i.e. just [nategraham AND notnate]). In that case your document will match.
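
For anyone scripting around this, the rewrite suggested above can be sketched as a small helper. This is only an illustration of the workaround: the function name `underscore_to_and` is made up for this example and is not part of Baloo, and it assumes the query contains nothing but underscore-joined terms.

```python
import re


def underscore_to_and(query: str) -> str:
    """Rewrite an underscore-joined query as an explicit AND query.

    Hypothetical helper for illustration only; it is not a Baloo API.
    """
    # Split on runs of underscores and drop empty pieces
    # (for example the one produced by a trailing "_").
    terms = [term for term in re.split(r"_+", query) if term]
    return " AND ".join(terms)


if __name__ == "__main__":
    print(underscore_to_and("nategraham_notnate"))
```

The printed string is the query form from the workaround, with each underscore replaced by AND.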
Comment 10 tagwerk19 2021-07-12 15:23:53 UTC
(In reply to Christoph Feck from comment #7)
> I think there is a minimum of three characters (per word?) that baloo
> requires for searches.
What is a little counterintuitive is that the three-character minimum applies to file content, so you need:
    baloosearch nategraham not
but if you search by filename, you get results with just two characters:
    baloosearch ba 378690
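
To check a query against that limit before running it, something like the following could be used. It is a rough sketch, not Baloo code: the helper name `short_terms` is hypothetical, the minimum of three characters is the figure mentioned in this thread for file content, and splitting terms on whitespace and underscores is an assumption about how the query is tokenized.

```python
import re

# Minimum term length reported in this thread for file-content searches.
MIN_CONTENT_TERM_LEN = 3


def short_terms(query: str, min_len: int = MIN_CONTENT_TERM_LEN) -> list[str]:
    """Return the query terms that are shorter than min_len.

    Hypothetical helper; terms are assumed to be separated by
    whitespace or underscores.
    """
    terms = [term for term in re.split(r"[\s_]+", query) if term]
    return [term for term in terms if len(term) < min_len]


if __name__ == "__main__":
    for query in ("nategraham_not", "nategraham_no"):
        print(query, short_terms(query))
```

An empty list means every term meets the minimum; a non-empty list names the terms that are too short for a content search under these assumptions.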