Bug 465723 - Search should be diacritic insensitive while results should prefer them when available.
Status: CONFIRMED
Alias: None
Product: krunner
Classification: Plasma
Component: general
Version: 5.27.0
Platform: Neon Linux
Importance: NOR normal
Target Milestone: ---
Assignee: Plasma Bugs List
 
Reported: 2023-02-14 17:35 UTC by Eridani Rodríguez
Modified: 2023-02-15 14:01 UTC

Description Eridani Rodríguez 2023-02-14 17:35:16 UTC
SUMMARY
1. Search engines (from Google to the Discover app search) usually do not care whether a word is typed with or without its diacritics, and return the same results either way. This is not the case for KRunner, which returns different results.

As an example, in Spanish, the correct way to spell “Japan” in written text is “Japón”, but typing it correctly involves pressing a [modifier key] + [accent key] + [O] key to obtain “ó”. That is quite complex for a quick search, so people usually just press the “O” key and type “japon”, not even caring about capitalization, and leave the rest to the search engine.

2. However, in some cases diacritics change the meaning of a word by 180º, so if people have typed them, the results should prefer matches with them, while still showing the results that do not have them because it is common to mistype them in some cases.

As an example, in Spanish “inglés” (English) can't be more distinct from “ingles” (groins), the difference residing only in the accent on top of the letter “e”.


STEPS TO REPRODUCE
1. Launch KRunner using the Spanish locale*** and search for:
A) “hora japon”
B) “hora japón”

OBSERVED RESULT
Case (A) returns nothing, while case (B) properly returns Japan's date and time.

EXPECTED RESULT
Both searches should return the matches from case (A).
Additionally, search (B) should also return its own matches and show them on top.
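
The expected behaviour can be illustrated with a minimal sketch. It is written in Python rather than KRunner's own C++ and is not taken from KRunner's code: `fold` and `search` are hypothetical names, and the candidate strings are invented for the example. It assumes that "prefer" means sorting candidates that contain the query exactly as typed (diacritics included) ahead of candidates that only match once diacritics are ignored.

```python
import unicodedata


def fold(text):
    """Hypothetical helper: ignore case and strip combining marks."""
    decomposed = unicodedata.normalize("NFD", text.casefold())
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def search(query, candidates):
    """Match ignoring diacritics, but list diacritic-exact matches first."""
    folded_query = fold(query)
    exact_query = unicodedata.normalize("NFC", query.casefold())

    # Step 1: diacritic-insensitive matching.
    matches = [c for c in candidates if folded_query in fold(c)]

    # Step 2: stable sort, so candidates containing the query exactly as
    # typed come first and the remaining matches keep their order.
    def is_inexact(candidate):
        return exact_query not in unicodedata.normalize("NFC", candidate.casefold())

    return sorted(matches, key=is_inexact)
```

With this sketch, `search("hora japon", ["Hora Japón"])` and `search("hora japón", ["Hora Japón"])` both find the entry, while `search("inglés", ["Curso de ingles", "Curso de inglés"])` lists the accented entry first and still shows the unaccented one.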

SOFTWARE/OS VERSIONS
Operating System: KDE neon 5.27
KDE Plasma Version: 5.27.0
KDE Frameworks Version: 5.102.0
Qt Version: 5.15.8
Kernel Version: 5.15.0-60-generic (64-bit)
Graphics Platform: X11


ADDITIONAL INFORMATION
*** I use Plasma in Spanish, so that is the case I can speak to, but there is already a related bug open for Arabic, so I imagine this applies to other languages as well.
https://bugs.kde.org/show_bug.cgi?id=465333

Is this even a KRunner bug? Many components are affected by a similar issue; see:
https://bugs.kde.org/show_bug.cgi?id=250345
https://bugs.kde.org/show_bug.cgi?id=274933
https://bugs.kde.org/show_bug.cgi?id=429448
Comment 1 David Edmundson 2023-02-15 13:56:05 UTC
We won't be able to do this in one place and fix all cases where search terms are compared; we'll need to change each individual search source.

The relevant code is `QString::normalized`; we'll need to call it on both our query term and every option we're comparing against.

We'll target the service and datetime runners, then you'll need to let us know if it's affecting other places.
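
As a rough illustration of the approach in this comment, here is a minimal Python sketch that uses `unicodedata.normalize` as a stand-in for Qt's normalization call. It is not the runners' actual code; `strip_diacritics` and `matches` are hypothetical names. Note that decomposing normalization on its own keeps the accents as separate combining characters, so the sketch also drops those marks before comparing.

```python
import unicodedata


def strip_diacritics(text):
    """Decompose the text, then drop non-spacing marks (category Mn)."""
    decomposed = unicodedata.normalize("NFKD", text)
    return "".join(ch for ch in decomposed if unicodedata.category(ch) != "Mn")


def matches(query, option):
    """Apply the same folding to the query and to the option being compared."""
    folded_query = strip_diacritics(query).casefold()
    folded_option = strip_diacritics(option).casefold()
    return folded_query in folded_option
```

One consequence of this design is that every combining mark is removed, so "ñ" is folded to "n" as well. Whether that is acceptable for a given language is a decision each search source would have to make.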