Bug 333038

Summary: Search requires exact spelling of special characters
Product: [Unmaintained] Baloo
Reporter: mau <b-misc>
Component: Baloo File Daemon
Assignee: Vishesh Handa <me>
Status: RESOLVED FIXED
Severity: wishlist
CC: hrvoje.senjan, mutlu_inek, vanboxem.ruben
Priority: NOR    
Version: unspecified   
Target Milestone: ---   
Platform: Kubuntu   
OS: Linux   

Description mau 2014-04-03 17:15:10 UTC
Consider a file named "école": you can find it by searching for "école" but not by searching for "ecole". This can be a problem if users use both kinds of spelling, ASCII and non-ASCII. Maybe search should be less strict about this?

Reproducible: Always
Comment 1 Vishesh Handa 2014-04-04 08:13:57 UTC
Confirmed. Fixing this is not that simple, so it won't be done in time for 4.13.
Comment 2 Vishesh Handa 2014-07-24 16:16:53 UTC
Git commit 59318e9694c0847bcaa5e71a4fbadde877e7a33e by Vishesh Handa.
Committed on 23/07/2014 at 11:34.
Pushed by vhanda into branch 'frameworks'.

TermGenerator: Remove all diacritics from terms

We're effectively losing some information, but it's probably for the
best as the user typically will search for words without the accents. We
can also expand the query parser to ignore diacritics as well.

M  +0    -1    src/xapian/autotests/termgeneratortest.cpp
M  +12   -1    src/xapian/termgenerator.cpp

http://commits.kde.org/baloo/59318e9694c0847bcaa5e71a4fbadde877e7a33e
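
For readers who want a feel for what removing diacritics from terms involves, here is a minimal sketch in Python. It is not the code from the commit above (that change is in C++, in src/xapian/termgenerator.cpp); it only illustrates one common technique: decompose the text with Unicode NFKD normalization, then drop the combining marks. The function name strip_diacritics is hypothetical.

```python
import unicodedata


def strip_diacritics(text: str) -> str:
    """Return text with combining marks removed (hypothetical helper).

    Assumption: "diacritics" means Unicode combining marks, i.e. characters
    whose general category starts with "M" (Mn, Mc, Me).
    """
    # NFKD splits a precomposed letter such as U+00E9 into a base letter
    # followed by a combining accent (U+0065 U+0301).
    decomposed = unicodedata.normalize("NFKD", text)
    # Keep everything except the combining marks.
    return "".join(
        ch for ch in decomposed
        if not unicodedata.category(ch).startswith("M")
    )


if __name__ == "__main__":
    print(strip_diacritics("école"))
```

If the same folding is applied both to the terms that get indexed and to the search string, "école" and "ecole" end up as the same term. The cost is the one the commit message mentions: the index can no longer tell the two spellings apart.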
Comment 3 Christoph Feck 2014-07-24 17:23:06 UTC
Would the fix affect bug 328763? And what about other runners? In other words, would it make sense to remove diacritics in krunner (or whatever search tool is in Next)?
Comment 4 vanboxem.ruben 2014-09-05 09:51:43 UTC
I don't know exactly how the fix was implemented, but it should also cover capital letters. Perhaps the "loss of precision" could be resolved by being strict about spelling only when a search term is quoted?

I have a file named Blàbla.pdf

Searching for blabla should find this file.
Searching for "blabla" will not find this file.
Searching for "blàbla" may find this file, depending on what requirements the quoting will relax.

These should really be options for the desktop search section in systemsettings (which is extremely empty currently).
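
The quoted versus unquoted behaviour proposed in this comment could look roughly like the Python sketch below. Nothing here is Baloo's actual query parser; the names fold and matches and the quoted_ignores_case switch are hypothetical, and matching is reduced to a plain substring test on the file name. The switch stands in for the open question of which requirements quoting should relax.

```python
import unicodedata


def fold(text: str) -> str:
    """Strip combining marks and fold case (hypothetical helper)."""
    decomposed = unicodedata.normalize("NFKD", text)
    stripped = "".join(
        ch for ch in decomposed
        if not unicodedata.category(ch).startswith("M")
    )
    return stripped.casefold()


def matches(query: str, filename: str, quoted_ignores_case: bool = True) -> bool:
    """Relaxed match for unquoted queries, strict spelling for quoted ones."""
    is_quoted = len(query) >= 2 and query.startswith('"') and query.endswith('"')
    if not is_quoted:
        # Unquoted: ignore both diacritics and letter case.
        return fold(query) in fold(filename)
    # Quoted: diacritics must match. NFC makes precomposed and decomposed
    # spellings of the same accented letter compare equal.
    literal = unicodedata.normalize("NFC", query[1:-1])
    name = unicodedata.normalize("NFC", filename)
    if quoted_ignores_case:
        return literal.casefold() in name.casefold()
    return literal in name


if __name__ == "__main__":
    for q in ['blabla', '"blabla"', '"blàbla"']:
        print(q, matches(q, "Blàbla.pdf"))
```

With the default setting, the unquoted blabla and the quoted "blàbla" both find Blàbla.pdf, while the quoted "blabla" does not. Passing quoted_ignores_case=False makes the quoted "blàbla" miss as well, because of the capital B.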