324617 – KDE text search (ex: filtering lists, krunner ) to enhance diacritics matching


Status:	CONFIRMED

Alias:	None

Product:	krunner
Classification:	Plasma
Component:	filesearch
Version:	5.19.2
Platform:	Arch Linux Linux

Importance:	NOR wishlist
Target Milestone:	---
Assignee:	baloo-bugs-null

URL:
Keywords:	usability

Duplicates (4):	316077 328763 414689 426017
Depends on:
Blocks:

Reported:	2013-09-07 13:06 UTC by Radek Koníček
Modified:	2021-06-07 11:58 UTC
CC List:	9 users

See Also:
Latest Commit:
Version Fixed In:
Sentry Crash Report:

Description Radek Koníček 2013-09-07 13:06:30 UTC

I propose that search dialogs such as krunner, and filter fields across KDE in general, support enhanced diacritics matching, so that input typed without diacritics matches all diacritic variants of the text.

Reproducible: Always

Steps to Reproduce:
1. Use KDE in a language whose alphabet contains diacritics (English does not), such as a Slavic language.
2. Open any text filter dialog, such as System Settings or krunner.
3. Look for an entry containing the string "systém" or "system".
4. Type "system".
Actual Results:  
Only the string "system" matches.

Expected Results:  
"system", "systém", and other variants (such as ě or ë) all match.

This feature exists in Windows 7.

Comment 1 Lukas Kucharczyk 2020-06-09 11:04:34 UTC

This issue still exists, and it is confusing because no other platform I've used differentiates between something like "system" and "systém". Is there any reason why it shouldn't be as easy to solve as stripping the diacritics from words by converting them to ASCII?
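
The stripping idea raised in this comment can be sketched in Python as follows. This is only an illustration, not KDE or krunner code: `strip_diacritics`, `fold`, and `filter_entries` are hypothetical names, and the approach (Unicode NFD decomposition followed by dropping combining marks) is one assumption about what "stripping" would mean. Letters with no canonical decomposition, such as "ł", pass through unchanged, so this is not a full conversion to ASCII.

```python
import unicodedata


def strip_diacritics(text: str) -> str:
    """Remove combining marks after canonical decomposition (NFD)."""
    decomposed = unicodedata.normalize("NFD", text)
    # unicodedata.combining() returns a non-zero class for combining marks.
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def fold(text: str) -> str:
    """Hypothetical search key: no diacritics, case-insensitive."""
    return strip_diacritics(text).casefold()


def filter_entries(query: str, entries: list[str]) -> list[str]:
    """Keep the entries whose folded form contains the folded query."""
    needle = fold(query)
    return [entry for entry in entries if needle in fold(entry)]


if __name__ == "__main__":
    # Example entries are made up for illustration.
    print(filter_entries("system", ["Systém", "System", "Písma"]))
    print(filter_entries("pisma", ["Systém", "System", "Písma"]))
```

Folding both the query and each entry means the comparison is symmetric: a query typed with diacritics also matches entries stored without them.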

Comment 2 Lukas Kucharczyk 2020-07-04 07:53:33 UTC

Sorry for my previous comment, it sounded too confrontational.

Comment 3 Lukas Kucharczyk 2020-07-04 08:09:55 UTC

Tested it just now and both "system" and "systém" work. However, searching for "pisma" doesn't work ("Písma" means "fonts" in Czech), while searching for "font" does find "Písma". So I think diacritics are accounted for in some cases, while in other cases the keywords are simply not found.

Comment 4 Alexander Lohnau 2020-07-10 20:25:32 UTC

*** Bug 316077 has been marked as a duplicate of this bug. ***

Comment 5 Alexander Lohnau 2020-07-11 15:26:58 UTC

*** Bug 414689 has been marked as a duplicate of this bug. ***

Comment 6 Alexander Lohnau 2020-07-12 06:27:02 UTC

*** Bug 328763 has been marked as a duplicate of this bug. ***

Comment 7 Alexander Lohnau 2020-07-12 06:28:15 UTC

A similar patch has been made to baloo quite some time ago:
https://invent.kde.org/frameworks/baloo/commit/59318e9694c0847bcaa5e71a4fbadde877e7a33e

Comment 8 Alexander Lohnau 2020-08-31 15:57:22 UTC

*** Bug 426017 has been marked as a duplicate of this bug. ***

Comment 9 Alexander Lohnau 2020-10-31 09:16:08 UTC

I am not sure how this should be implemented. Should the diacritics be removed, as in the baloo patch, or should we make sure that we check both the stripped and the normal variant for matches?

Maybe a user whose language actually uses diacritics can comment :)

Comment 10 veggero 2020-10-31 09:52:39 UTC

Hi! I think it is necessary to also search the non-stripped version. This is because the meaning can change, leading to different results, especially in file searches. As an example, Italian "e" translates to "and", while "è" translates to "is".

Comment 11 Lukas Kucharczyk 2020-10-31 10:15:55 UTC

I agree with the above comment. Both need to be searched.
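
One way to read "both need to be searched" is a two-tier match: entries that contain the query exactly as typed come first, followed by entries that match only after diacritics are stripped. The Python sketch below assumes that reading. `ranked_matches` and `strip_diacritics` are hypothetical names, the file names are made up, and nothing here reflects how krunner or baloo actually rank results.

```python
import unicodedata


def strip_diacritics(text: str) -> str:
    """Remove combining marks after canonical decomposition (NFD)."""
    decomposed = unicodedata.normalize("NFD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def ranked_matches(query: str, entries: list[str]) -> list[str]:
    """Return exact (diacritic-sensitive) matches first, then stripped-only ones."""
    # Normalize to NFC so composed and decomposed input compare equal.
    q_exact = unicodedata.normalize("NFC", query).casefold()
    q_stripped = strip_diacritics(q_exact)

    exact, stripped_only = [], []
    for entry in entries:
        e_exact = unicodedata.normalize("NFC", entry).casefold()
        if q_exact in e_exact:
            exact.append(entry)
        elif q_stripped in strip_diacritics(e_exact):
            stripped_only.append(entry)
    return exact + stripped_only


if __name__ == "__main__":
    files = ["caffè.txt", "caffe.txt", "notes.txt"]
    print(ranked_matches("caffe", files))
    print(ranked_matches("caffè", files))
```

With this ordering a user who types the diacritic sees the precise match first, while a user who omits it still finds both files.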