Sometimes files have long filenames of 25 or more characters without spaces. Baloo fails to find these files when they are searched for by their full name. The same search works when Baloo is disabled.

STEPS TO REPRODUCE
1. Create a file whose full name has 25+ characters, e.g. abcdefghijklmnopqrstuvwxyz
2. With Baloo enabled, search for "abcdefghijklmnopqrstuvwxyz".
   Terminal: baloosearch abcdefghijklmnopqrstuvwxyz
   Dolphin: Dolphin > Find > abcdefghijklmnopqrstuvwxyz
3. Search for a string shorter than 25 characters: "abcd", "abcdefg", "abcdefghijklmnopq".

OBSERVED RESULT
The full-name search in step 2 does not find the file. The shorter searches in step 3 do find it.

EXPECTED RESULT
Baloo finds the file when it is searched for by its full filename.

SOFTWARE/OS VERSIONS
Linux/KDE Plasma: Kubuntu 18.04, 19.04, KDE Neon 20190919-1119, Debian Buster KDE
(available in About System)
KDE Plasma Version: 5.12.8
KDE Frameworks Version: 5.44.0
Qt Version: 5.9.5

ADDITIONAL INFORMATION
If the filename is split with spaces, e.g. "abcdefgh ijklmnopqrs tuvwxyz", the file can be found by its full filename.
Created attachment 138962
Dolphin Search - with 24 character search string
Created attachment 138963
Dolphin Search - with 25 character search string
Well I never...

Make sure baloo is running...

    $ balooctl status
    $ echo "Hello Penguin" > abcdefghijklmnopqrstuvwxyz
    $ baloosearch abcdefghijklmnopqrstuvwxy
    /home/xxxx/abcdefghijklmnopqrstuvwxyz

So, baloo is fine...

Run Dolphin, press Ctrl-F and type

    abcdefghijklmnopqrstuvwxy

You see a file match. Add a 'z' and... the match disappears...

See the attachments, flagging as Confirmed...

This is with:
    Neon Unstable
    Plasma: 5.22.80
    Frameworks: 5.83.0
    Qt: 5.15.3
(In reply to tagwerk19 from comment #3)
> So, baloo is fine...

Whups, didn't finish the test:

    $ baloosearch abcdefghijklmnopqrstuvwxy
    /home/xxxx/abcdefghijklmnopqrstuvwxyz
    Elapsed: 0.258107 msecs

    $ baloosearch abcdefghijklmnopqrstuvwxyz
    Elapsed: 0.207821 msecs

So, baloo rather than dolphin:

    $ balooshow -x abcdefghijklmnopqrstuvwxyz
    143dd40000fc01 64513 1326548 abcdefghijklmnopqrstuvwxyz [/home/xxxx/abcdefghijklmnopqrstuvwxyz]
        Mtime: 1622665708 2021-06-02T22:28:28
        Ctime: 1622665708 2021-06-02T22:28:28
        Cached properties:
            Line Count: 1

    Internal Info
    Terms: Mplain Mtext T5 T8 X20-1 hello penguin
    File Name Terms: Fabcdefghijklmnopqrstuvwxy
    XAttr Terms:
    lineCount: 1
Baloo currently handles term truncation only for "equals" queries, not for "contains" queries (the default).

    $> baloosearch filename=abcdefghijklmnopqrstuvwxyz

returns the file, while the following does not:

    $> baloosearch filename:abcdefghijklmnopqrstuvwxyz

    $> baloosearch abcdefghijklmnopqrstuvwxyz

is internally expanded to:

    $> baloosearch content:abcdefghijklmnopqrstuvwxyz OR filename:abcdefghijklmnopqrstuvwxyz
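The difference between the two query types can be illustrated with a small toy model in bash. This is only a sketch and not Baloo's code: the 25-character limit is inferred from the "File Name Terms: Fabcdefghijklmnopqrstuvwxy" output earlier in this report, the function names (index_term, match_equals, match_contains) are hypothetical, and treating a "contains" query as a prefix match on the stored term is an assumption.

```shell
#!/usr/bin/env bash
# Toy model only, not Baloo's implementation.
# MAX_TERM_LEN=25 is an assumption inferred from the balooshow output above.
MAX_TERM_LEN=25

# Hypothetical helper: the term as it would be stored in the index.
index_term() {
    printf '%s' "${1:0:MAX_TERM_LEN}"
}

# "equals" lookup: the search term is truncated the same way before comparing.
match_equals() {
    local stored query
    stored=$(index_term "$1")
    query=$(index_term "$2")
    [ "$stored" = "$query" ]
}

# "contains" lookup (assumed to be a prefix match): the search term is used
# untruncated, so a query longer than the stored term can never match.
match_contains() {
    local stored
    stored=$(index_term "$1")
    [[ "$stored" == "$2"* ]]
}

name=abcdefghijklmnopqrstuvwxyz
match_equals   "$name" "$name" && echo "equals, full name: found"
match_contains "$name" "$name" || echo "contains, full name: not found"
match_contains "$name" "abcd"  && echo "contains, short prefix: found"
```

In this model the stored term is one character shorter than the 26-character query, which is why only the untruncated "contains" comparison fails.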
A possibly relevant merge request was started @ https://invent.kde.org/frameworks/baloo/-/merge_requests/158
Git commit b7c8ce1a999225f0362b8be274a9d5c786c3edda by Stefan Brüns.
Committed on 06/07/2023 at 19:21.
Pushed by bruns into branch 'master'.

[SearchStore] Always use TermGenerator instead of QueryParser

The QueryParser handles two fairly distinct tasks, parsing of quoting
characters, and splitting of phrases into terms.

The Phrase/Term splitting is similar to the TermGenerator, but slightly
different. Using a different implementation for searching and DB storage
can cause matching errors.

While the nested QueryParser quoting /can/ be used, it is fairly
redundant, and problematic:
- Quoting is already handled by the AdvancedQueryParser, which always sits
  in front of the SearchStore.
- The QueryParser is *only* used for "contains" queries (e.g.
  filename:foo.png) not "equal" queries ("filename=foo.png").
- Quoting of phrases for both variants is different,
  content:\"\'a b\'\" vs. content=\"a \"b".
- The QueryParser does not handle term truncation (see bug reference).

Use the TermGenerator in all cases, so term splitting and quoting is
uniform.

M +0 -1 autotests/integration/querytest.cpp
M +7 -3 src/lib/searchstore.cpp

https://invent.kde.org/frameworks/baloo/-/commit/b7c8ce1a999225f0362b8be274a9d5c786c3edda
Git commit c85de29f33224e27f273f66fef09837d24fdfd2c by Stefan Brüns.
Committed on 06/07/2023 at 22:39.
Pushed by bruns into branch 'kf5_test'.

[SearchStore] Always use TermGenerator instead of QueryParser

M +0 -1 autotests/integration/querytest.cpp
M +7 -3 src/lib/searchstore.cpp

https://invent.kde.org/frameworks/baloo/-/commit/c85de29f33224e27f273f66fef09837d24fdfd2c
Git commit af0b611bced29e6cc00f120e9ff69470bd657a7d by Stefan Brüns.
Committed on 13/11/2023 at 21:41.
Pushed by bruns into branch 'kf5'.

[SearchStore] Always use TermGenerator instead of QueryParser

(cherry picked from commit b7c8ce1a999225f0362b8be274a9d5c786c3edda)

M +0 -1 autotests/integration/querytest.cpp
M +7 -3 src/lib/searchstore.cpp

https://invent.kde.org/frameworks/baloo/-/commit/af0b611bced29e6cc00f120e9ff69470bd657a7d
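The idea behind the commits, running indexing and searching through one shared term generator, can be sketched in bash roughly as follows. This is a toy model under stated assumptions, not the actual TermGenerator: the names generate_terms and matches are hypothetical, and the lowercasing, the splitting on non-alphanumeric characters and the 25-character limit are assumptions chosen to resemble the terms shown in the balooshow output earlier in this report.

```shell
#!/usr/bin/env bash
# Toy model, not Baloo's implementation. MAX_TERM_LEN and the splitting
# rules below are assumptions made for illustration.
MAX_TERM_LEN=25

# Hypothetical shared term generator: lowercase the input, split it on
# anything that is not a letter or digit, and truncate each term.
# Prints one term per line.
generate_terms() {
    local word
    for word in $(printf '%s' "$1" | tr '[:upper:]' '[:lower:]' | tr -c '[:alnum:]' ' '); do
        printf '%s\n' "${word:0:MAX_TERM_LEN}"
    done
}

# A file matches when every query term is a prefix of some indexed term.
# Both sides go through generate_terms, so truncation is applied uniformly.
matches() {
    local q t found
    for q in $(generate_terms "$2"); do
        found=0
        for t in $(generate_terms "$1"); do
            [[ "$t" == "$q"* ]] && found=1
        done
        [ "$found" -eq 1 ] || return 1
    done
    return 0
}

name=abcdefghijklmnopqrstuvwxyz
matches "$name" "$name" && echo "full name: found"
matches "$name" "abcd"  && echo "short prefix: found"
```

Because the query term is truncated by the same function that produced the stored term, the full 26-character name matches in this model, which is the behaviour the report asks for.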