Bug 250345

Summary: Option to ignore diacritic in quick search
Product: [Applications] kmail2 Reporter: Michal Hlavac <miso>
Component: search Assignee: kdepim bugs <kdepim-bugs>
Status: CONFIRMED ---    
Severity: wishlist CC: alexandre.bonneau, arthur, kdenis, kollix, Martin, montel
Priority: NOR    
Version: unspecified   
Target Milestone: ---   
Platform: Unlisted Binaries   
OS: Linux   
Latest Commit: Version Fixed In:
Sentry Crash Report:
Attachments: Testprogram for QChar::decomposition()

Description Michal Hlavac 2010-09-06 10:54:50 UTC
Version:           unspecified (using KDE 4.5.0) 
OS:                Linux

KMail doesn't ignore diacritics in quick search.

Reproducible: Always
Comment 1 Laurent Montel 2010-09-22 19:08:10 UTC
What does "diacritic" mean?
Regards
Comment 2 Michal Hlavac 2010-09-22 19:12:38 UTC
See http://en.wikipedia.org/wiki/Diacritic
Comment 3 Michal Hlavac 2010-09-22 19:18:04 UTC
There are a lot of diacritical characters in many languages (e.g. Slovak). If somebody wrote an email using these characters, I would need to know that the mail was written with a subject such as čťžýá. It's easier to search only for ctzya.
For example, Thunderbird does this.
Comment 4 Michal Hlavac 2011-08-23 08:52:53 UTC
This issue is also relevant for kmail2.
Comment 5 Laurent Montel 2011-09-08 08:03:09 UTC
How can we type these diacritical characters?
Comment 6 Michal Hlavac 2011-09-08 08:22:27 UTC
E.g. you can install the Slovak keyboard layout and type these letters by pressing the number keys.
Comment 7 Alexandre Bonneau 2012-01-09 21:15:35 UTC
That is valid for any accented characters, like éèàùï in French.

To type these, use `setxkbmap us_intl`; then you can compose accented characters by typing ' then e, for instance.
Comment 8 Martin Koller 2012-07-15 17:23:26 UTC
Just to make this clear: you want KMail to filter mails for "čťžýá" even if you type "ctzya"?
If so, any idea how a Unicode character can be transformed into the equivalent character without the diacritical sign?
Comment 10 Alexandre Bonneau 2012-07-16 14:54:14 UTC
That's exactly what we would want indeed, Martin.
Ideally, as pointed out in the articles posted in comment #9, searching for "elements" should return results like "Éléments", "ELEMENTS", "elements", etc.

I do think quick filters like this, which don't give the user the possibility to fine-tune the search options, should default to case-insensitive and accent-insensitive matching.
Comment 11 Laurent Montel 2012-07-16 17:43:12 UTC
To do this we would need a table with every type of diacritic for every language...
So where can we find one?
And for each line in the message list we would have to search for diacritics and convert them...
I don't think that's very optimal...
Comment 12 Martin Koller 2012-07-16 18:44:18 UTC
I think I found a nice Qt solution: QChar::decomposition()
For every char of "öäüÖÄÜčťžýáéèàùï" it returns a QString with length==2, and the first char of that string is the char without the diacritical sign.
See the attached test program.
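
For readers without Qt at hand, the decomposition idea above can be sketched with Python's standard unicodedata module. This is only an illustration, assuming that canonical (NFD) decomposition behaves for these characters the way the comment describes for QChar::decomposition(); it is not the attached test program, and the helper name base_char is hypothetical.

```python
import unicodedata

# Force composed (NFC) form so that each letter below is a single code point.
sample = unicodedata.normalize("NFC", "öäüÖÄÜčťžýáéèàùï")


def base_char(ch: str) -> str:
    """Return the first code point of the canonical decomposition of ch.

    For a precomposed accented letter this is the plain base letter;
    for an undecorated letter it is the letter itself.
    """
    return unicodedata.normalize("NFD", ch)[0]


# Each sample character decomposes into exactly two code points:
# the base letter followed by one combining mark.
lengths = [len(unicodedata.normalize("NFD", ch)) for ch in sample]

stripped = "".join(base_char(ch) for ch in sample)
print(lengths)
print(stripped)  # oauOAUctzyaeeaui
```

No per-language table is needed for these characters, because the mapping comes from the Unicode character database. Letters that have no canonical decomposition (for example ø or ł) would pass through unchanged in this sketch.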
Comment 13 Martin Koller 2012-07-16 18:45:30 UTC
Created attachment 72564
Testprogram for QChar::decomposition()
Comment 14 Martin Koller 2012-07-16 20:12:09 UTC
Hmmm... the problem is that the search is done via Nepomuk, and Nepomuk would somehow need to match the given pattern against the mail's text, or against the mail's text with the characters exchanged (diacritic -> non-diacritic).
No idea if this is possible at all.
Comment 15 Laurent Montel 2012-07-17 06:46:03 UTC
You have not included all diacritics. For example, in French there are ë ê ô Ö Ô etc.,
and that's just French.
So for each subject we would have a big loop.
And it's true that we can't use it in Nepomuk.

So I am against using it.

And we will find people who don't want a search that ignores diacritics.

Sorry, I will close it.
I don't want to slow down KMail.
Comment 16 Michal Hlavac 2012-07-17 08:39:27 UTC
Of course, you will always find people who don't want a search that ignores diacritics. But the question is, how many? This kind of search is the default in Thunderbird and I think it should be in KMail too.

The current state of search is useless for me, because I don't remember who wrote e-mails with diacritics and who didn't.
Comment 17 Alexandre Bonneau 2012-07-17 12:45:47 UTC
Ignoring the problem won't really make it go away.

Surely, as some other open-source email clients have done, this kind of quick search/filter is mandatory.
There must be a way to do that quickly, perhaps by modifying Nepomuk so that it indexes diacritic content both ways?

As for slowing down KMail, I'm sure users would prefer instant display of emails when selecting them (which is absolutely not the case with kmail 4.8.2) over a short wait when using quick search on mails (which you don't do every minute, right?).

Last but not least, this could be implemented at least for the quick filter on top of the folder list. The number of directories is much lower than the number of mails in a huge maildir folder; it shouldn't be noticeable.

Please reopen that bug.
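
The suggestion in this comment, indexing content so that accented and unaccented spellings meet, could look roughly like the following. It is a minimal sketch in Python, not Nepomuk code: the SubjectIndex class, its methods and the fold helper are all hypothetical names, and the index is just an in-memory dictionary. The point it illustrates is that folding happens once when a message is indexed, so a query costs one fold of the search term plus a lookup rather than a loop over every subject.

```python
import unicodedata
from collections import defaultdict


def fold(text: str) -> str:
    """Drop combining marks after NFD decomposition, then case-fold."""
    decomposed = unicodedata.normalize("NFD", text)
    without_marks = "".join(c for c in decomposed if not unicodedata.combining(c))
    return without_marks.casefold()


class SubjectIndex:
    """Hypothetical word index keyed by the folded form of each word."""

    def __init__(self) -> None:
        self._by_word = defaultdict(set)

    def add(self, msg_id: int, subject: str) -> None:
        # Folding is paid once here, at indexing time.
        for word in subject.split():
            self._by_word[fold(word)].add(msg_id)

    def lookup(self, word: str) -> set:
        # The query is folded the same way, so both spellings hit one key.
        return set(self._by_word.get(fold(word), set()))


index = SubjectIndex()
index.add(1, "Éléments de base")
index.add(2, "ELEMENTS")
index.add(3, "Something else")

print(index.lookup("elements"))
print(index.lookup("Éléments"))
```

Both lookups return the ids of messages 1 and 2. A real indexer would also need to decide whether to keep the original spelling alongside the folded one, for users who want exact matches.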
Comment 18 Michal Hlavac 2012-08-06 05:46:43 UTC
I have to reopen this issue. Maybe there are some questions for the Nepomuk developers, but closing the issue without trying is not a solution for me.
Comment 19 Martin Steigerwald 2015-09-09 21:11:41 UTC
Hello Michal. Thank you for your report. Although I bet so, I will ask: can you reproduce this with Baloo-based KMail from KDEPIM 4.14.10, or even Akonadi Search (renamed Baloo) based KMail from KDEPIM 15.08? Feel free to reopen if you can. Also, I set this to wishlist since it is basically a feature request.

Thank you and greetings from KDE Randa Meetings,
Martin
Comment 20 Alexandre Bonneau 2015-09-10 11:11:53 UTC
Not sure about 4.14.10, but 4.14.2 still does not support this.
Comment 21 Martin Steigerwald 2015-09-10 12:11:41 UTC
Thanks, Alexandre. Reopening, as I think that's close enough.
Comment 22 Michal Hlavac 2015-09-10 12:25:00 UTC
Yes, this also occurs in Baloo. I am trying to contribute to Baloo and add some support for specific languages, but I think it will take a long time.
Comment 23 Denis Kurz 2017-01-08 18:30:35 UTC
KMail 5.4.0

I cannot reproduce this in my setup. I have a bunch of messages in my Inbox that contain "Bestätigung". Filtering for "Bestätigung" lists only these messages (as it should), and filtering for "Besttigung" does not list any messages at all (as it should).

Can anyone reproduce this bug in a more recent, Frameworks-based version of KMail? Please note that versions like 4.14.x (starting with "4.") have been unsupported for quite some time now.
Comment 24 Alexandre Bonneau 2017-01-08 20:44:10 UTC
Using v5.2.3, this bug is still present.

Having a message subject containing "Élévation", when you type 'elev' in the filter bar, this message is not shown.

I couldn't find how to make the quick filter on top of the folder list appear again, though.
Comment 25 Michal Hlavac 2017-01-08 21:06:22 UTC
> I cannot reproduce this in my setup. I have a bunch of messages in my Inbox that contain "Bestätigung". Filtering for "Bestätigung" lists only these messages (as it should) and filtering for "Besttigung" does not list any messages at all (as it should).

This issue is about ignoring accent in search. It means both your cases should list messages that contain "Bestätigung" and also "Besttigung".
So you have already reproduced this issue.
Comment 26 Michal Hlavac 2017-01-08 21:14:33 UTC
I understand that this issue is a little subjective, but e.g. the Slovak language contains a lot of diacritical marks, and I really don't remember whether a mail was written with diacritical marks or without.
Comment 27 Denis Kurz 2017-01-09 17:14:29 UTC
(In reply to Michal Hlavac from comment #25)
> This issue is about ignoring accent in search.

I re-read the request, and you are of course right. I'm sorry, I didn't even realize that this is a feature request.

"Bestatigung" does not turn up results and I couldn't find corresponding options in the preferences, so this request is still valid as of 5.4.0. I also appended "Option to" to the title of this request.
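
To make the requested behaviour concrete, here is a minimal Python sketch of a quick filter with an option to ignore diacritics. It is not KMail code: quick_filter, strip_marks and the ignore_diacritics parameter are hypothetical names, and matching is a plain case-insensitive substring test on message subjects taken from the examples in this report.

```python
import unicodedata


def strip_marks(text: str) -> str:
    """Remove combining marks after canonical (NFD) decomposition."""
    decomposed = unicodedata.normalize("NFD", text)
    return "".join(c for c in decomposed if not unicodedata.combining(c))


def quick_filter(subjects, needle, ignore_diacritics=True):
    """Return the subjects containing needle, ignoring case and,
    optionally, diacritics."""

    def key(s: str) -> str:
        if ignore_diacritics:
            s = strip_marks(s)
        return s.casefold()

    wanted = key(needle)
    return [s for s in subjects if wanted in key(s)]


subjects = ["Élévation", "Bestätigung", "Hello"]

print(quick_filter(subjects, "elev"))          # ['Élévation']
print(quick_filter(subjects, "Bestatigung"))   # ['Bestätigung']
print(quick_filter(subjects, "Besttigung"))    # []
print(quick_filter(subjects, "elev", ignore_diacritics=False))  # []
```

With the option off, the filter behaves as reported for 5.2.3 and 5.4.0: "elev" does not find "Élévation". Note that "Besttigung" (with the letter dropped rather than unaccented) matches nothing in either mode.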