Bug 505968

Summary: Search with one or two characters is broken
Product: [Plasma] krunner
Reporter: Guo Yunhe <i>
Component: general
Assignee: Plasma Bugs List <plasma-bugs-null>
Status: RESOLVED DUPLICATE
Severity: normal
CC: alexander.lohnau, natalie_clarius, nate, tagwerk19
Priority: NOR
Version First Reported In: 6.4.0
Target Milestone: ---
Platform: Other
OS: Linux
Latest Commit:
Version Fixed In:
Sentry Crash Report:
Attachments: screenshot of Kickoff searching CV

Description Guo Yunhe 2025-06-22 13:35:36 UTC
SUMMARY
In many cases, we use short names for files, like CV for curriculum vitae. And for Chinese, Japanese and Korean users, the names of people and organizations are often only one or two characters long; for example, my first name is 云鹤, just two characters. But baloo search doesn't support less than 3 char.

STEPS TO REPRODUCE

1. Create a file CV.odt
2. Wait for baloo to index this file
3. Search CV
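
The steps above can be sketched as a small script that exercises only the Baloo command line tools. This is a minimal sketch, not a definitive test: the `repro_dir` location is an assumption (it must lie inside a folder that Baloo is configured to index), and `balooctl6 index` is assumed to be available to request indexing immediately instead of waiting.

```shell
# Hypothetical repro helper. repro_dir is an assumed location and must be
# inside a folder that Baloo indexes.
repro_dir="${REPRO_DIR:-$HOME/Documents/baloo-repro}"
mkdir -p "$repro_dir"
: > "$repro_dir/CV.odt"    # empty file; only the file name matters here

if command -v balooctl6 >/dev/null 2>&1 && command -v baloosearch6 >/dev/null 2>&1; then
    # Ask Baloo to index the file now instead of waiting for the indexer.
    balooctl6 index "$repro_dir/CV.odt"
    # Plain two-character query, then the same query restricted to file names.
    baloosearch6 CV
    baloosearch6 filename:CV
else
    echo "Baloo command line tools not found; skipping the search step" >&2
fi
```

If both searches list CV.odt here while KRunner or Kickoff show nothing for the same query, that would suggest the limitation sits in the front end, not in the index.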

OBSERVED RESULT
No result

EXPECTED RESULT
Show CV.odt

SOFTWARE/OS VERSIONS
Operating System: openSUSE Tumbleweed 20250618
KDE Plasma Version: 6.4.0
KDE Frameworks Version: 6.15.0
Qt Version: 6.9.1
Kernel Version: 6.15.2-1-default (64-bit)
Graphics Platform: Wayland
Processors: 12 × AMD Ryzen 5 5600X 6-Core Processor
Memory: 32 GiB of RAM (31.3 GiB usable)
Graphics Processor: AMD Radeon RX 6700
Manufacturer: Micro-Star International Co., Ltd.
Product Name: MS-7C94
System Version: 1.0

ADDITIONAL INFORMATION
Comment 1 tagwerk19 2025-06-22 18:55:25 UTC
(In reply to Guo Yunhe from comment #0)
> SUMMARY
> ... But baloo search doesn't support less than 3 char ...
> 
> STEPS TO REPRODUCE
> 
> 1. Create a file CV.odt
> 2. Wait for baloo to index this file
> 3. Search CV
Does this test case really fail for you? It works for me but there are certainly nuances...

https://bugs.kde.org/show_bug.cgi?id=463830#c2 tries to summarise, scroll down to point "4..."

I don't see an issue when searching for short filenames. I do see that you have to type at least three characters to find matches when doing a content search, except that a short search string, such as "cv", will still find an exact match.

I can see that things might be different with CJK characters. It would be interesting to know...
Comment 2 tagwerk19 2025-06-23 07:54:31 UTC
(In reply to tagwerk19 from comment #1)
> ... It would be interesting to know...


I'm testing this on a Neon Unstable system...

What I tried was:

    $ echo 云鹤 > 云鹤.txt
    $ cat 云鹤.txt
    云鹤
    $ balooshow6 -x 云鹤.txt
    140087ed0da2dd 3977093853 1310855 云鹤.txt [/home/test/Documents/云鹤.txt]
            Mtime: 1750624009 2025-06-22T22:26:49
            Ctime: 1750624009 2025-06-22T22:26:49
            Cached properties:
                    Line Count: 1

    Internal Info
    File Name Terms: Ftxt F云鹤
    XAttr Terms:
    Plain Text Terms: 浜戦工
    Property Terms: Mplain Mtext T5 T8 X20-1
    lineCount: 1

What's interesting is that the filename terms and the plain text terms, the "file content", differ.

A baloosearch finds the filename match but not the content match:

    $ baloosearch6 filename:云鹤
    /home/test/Documents/云鹤.txt
    Elapsed: 0.32654 msecs

    $ baloosearch6 content:云鹤
    Elapsed: 0.339796 msecs

A content search for the term shown by balooshow works:

    $ baloosearch6 content:浜戦工
    /home/test/Documents/云鹤.txt

I'm using the command line tools to make sure we are testing *just* the Baloo logic.

So something is wrong, 云鹤 is being indexed as 浜戦工 (when indexing the file content). Sorry, I cannot say whether this makes any sort of sense.
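
One way to check whether 浜戦工 could simply be the UTF-8 bytes of 云鹤 read in a legacy CJK encoding is to reinterpret those bytes with iconv and compare the output with the term balooshow6 reports. A minimal sketch, assuming an iconv that knows these encoding names; the `misdecode` helper and the list of candidate encodings are illustrative guesses, not anything taken from Baloo:

```shell
# misdecode CODEC TEXT: treat the UTF-8 bytes of TEXT as if they were
# encoded in CODEC and print the result as UTF-8.
# (Hypothetical helper for this experiment only.)
misdecode() {
    printf '%s' "$2" | iconv -f "$1" -t UTF-8
}

# Candidate encodings are assumptions; extend the list as needed.
for codec in GBK GB18030 BIG5 SHIFT_JIS EUC-KR; do
    printf '%s: ' "$codec"
    misdecode "$codec" '云鹤' 2>/dev/null || printf '(invalid byte sequence)'
    printf '\n'
done
```

If one of the lines prints the same string as the "Plain Text Terms" entry, that would point at the text extractor guessing the wrong encoding for the file content, while the file name is handled as UTF-8.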
Comment 3 Guo Yunhe 2025-06-23 11:02:49 UTC
Created attachment 182555 [details]
screenshot of Kickoff searching CV

(In reply to tagwerk19 from comment #1)
> (In reply to Guo Yunhe from comment #0)
> > SUMMARY
> > ... But baloo search doesn't support less than 3 char ...
> > 
> > STEPS TO REPRODUCE
> > 
> > 1. Create a file CV.odt
> > 2. Wait for baloo to index this file
> > 3. Search CV
> Does this test case really fail for you? It works for me but there are
> certainly nuances...
> 
> https://bugs.kde.org/show_bug.cgi?id=463830#c2 tries to summarise, scroll
> down to point "4..."
> 
> I don't see an issue when searching for short filenames, I see that you have
> to type three characters to find matches when doing a content search. Except
> a short search string, such as "cv", will find an exact match.
> 
> I can see that things might be different with CJK characters. It would be
> interesting to know...

It is true that running the command "baloosearch6 CV" finds CV.odt. But I was using KRunner (Alt+Space) and the Kickoff launcher (Win key), and they show nothing when I search for CV.

What works:
- baloosearch6 command
- Dolphin file search

What not:
- Kickoff search
- Krunner search
Comment 4 tagwerk19 2025-06-23 13:41:46 UTC
(In reply to Guo Yunhe from comment #3)
> What not:
> - Kickoff search
> - Krunner search
Not sure whether I'm doing the right thing here, but I'm changing the product to frameworks-krunner.
Nevertheless (in reply to tagwerk19 from comment #2)
> So something is wrong, 云鹤 is being indexed as 浜戦工 (when indexing the file
> content). Sorry, I cannot say whether this makes any sort of sense.
I'd be interested if this makes sense to you...
Comment 5 Guo Yunhe 2025-06-24 10:54:11 UTC
(In reply to tagwerk19 from comment #4)
> Nevertheless (in reply to tagwerk19 from comment #2)
> > So something is wrong, 云鹤 is being indexed as 浜戦工 (when indexing the file
> > content). Sorry, I cannot say whether this makes any sort of sense.
> I'd be interested if this makes sense to you...

I cannot confirm this because I disabled file content indexing to save my storage space...
Comment 6 Nate Graham 2025-06-24 19:22:39 UTC

*** This bug has been marked as a duplicate of bug 490972 ***