Summary: | Option to ignore diacritic in quick search | ||
---|---|---|---|
Product: | [Applications] kmail2 | Reporter: | Michal Hlavac <miso> |
Component: | search | Assignee: | kdepim bugs <kdepim-bugs> |
Status: | CONFIRMED --- | ||
Severity: | wishlist | CC: | alexandre.bonneau, arthur, kdenis, kollix, Martin, montel |
Priority: | NOR | ||
Version: | unspecified | ||
Target Milestone: | --- | ||
Platform: | Unlisted Binaries | ||
OS: | Linux | ||
Latest Commit: | Version Fixed In: | ||
Sentry Crash Report: | |||
Attachments: | Testprogram for QChar::decomposition() |
Description
Michal Hlavac
2010-09-06 10:54:50 UTC
What does "diacritic" mean? Regards

There are a lot of diacritical characters in many languages (e.g. Slovak). If somebody wrote an email using these characters, I need to know that the mail was written with a subject such as čťžýá. It's easier to search only for ctzya. For example, Thunderbird does this.

This issue is also relevant for kmail2.

How can we write these diacritical characters?

E.g. you can install the Slovak keyboard layout and write these letters by pressing the number keys.

That is valid for any accented characters like éèàùï in French. To type these, use `setxkbmap us_intl`; then you can compose accented characters by typing ' then e, for instance.

Just to make this clear: you want kmail to filter mails for "čťžýá" even if you type "ctzya"? If so, any idea how a Unicode character can be transformed into the equivalent character without the diacritical sign?

Look here: http://stackoverflow.com/questions/140422/how-do-i-translate-8bit-characters-into-7bit-characters-i-e-to-u http://stackoverflow.com/questions/144761/how-to-remove-accents-and-tilde-in-a-c-stdstring

That's exactly what we would want indeed, Martin. Ideally, as pointed out in the articles posted in comment #9, searching for "elements" should return results like "Éléments", "ELEMENTS", "elements", etc. I do think quick filters like this, which don't give the user the possibility to fine-tune the search options, should default to case-insensitive and accent-insensitive.

To do this we must have a table with every type of diacritic for every language... So where can we find it? And for each line in the message list we must search for diacritics and convert them... I don't think it's very optimal...

I think I found a nice Qt solution: QChar::decomposition(). For all the chars of "öäüÖÄÜčťžýáéèàùï" it returns a QString with length==2, and the first char of this is the char without the diacritical sign. See attached test program.

Created attachment 72564
Testprogram for QChar::decomposition()
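
The decomposition idea above can be sketched outside Qt as well. The following is a minimal illustration, not the attached test program: it assumes Python's standard `unicodedata` module as a stand-in for `QChar::decomposition()`, and the helper name `strip_diacritics` is hypothetical. It decomposes each character canonically (NFD) and drops the combining marks, so no hand-written per-language table is needed.

```python
import unicodedata


def strip_diacritics(text: str) -> str:
    """Remove combining marks after canonical decomposition (NFD).

    Hypothetical helper for illustration; KMail itself is C++/Qt.
    """
    # NFD splits e.g. "č" into "c" followed by a combining caron.
    decomposed = unicodedata.normalize("NFD", text)
    # Keep only characters that are not combining marks.
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


print(strip_diacritics("čťžýá"))        # ctzya
print(strip_diacritics("öäüÖÄÜéèàùï"))  # oauOAUeeaui
```

Letters that have no canonical decomposition in Unicode, such as "ø" or "ł", pass through unchanged with this approach, so it would probably not cover every language on its own.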
Hmmm.. the problem is that the search is done via Nepomuk, and Nepomuk would need to match the given pattern against the mail's text, or against the mail's text with exchanged (diacritic -> non-diacritic) characters, in some way. No idea if this is possible at all.

You did not include all diacritics. For example, in French there are ë ê ô Ö Ô etc., and that's just French. So for each subject we will have a big loop. And it's right that we can't use it in Nepomuk. So I am against using it. And we will find people who don't want a search that ignores diacritics. Sorry, I will close it. I don't want to slow down kmail.

Of course, you will always find people who don't want a search that ignores diacritics. But the question is how many? This kind of search is the default in Thunderbird and I think it should be in kmail too. The current state of search is useless for me, because I don't remember who wrote an e-mail with diacritics and who didn't.

Ignoring the problem won't make it go away. Surely, as some other open-source email clients have shown, this kind of quick search/filter is mandatory. There must be a way to do that quickly, perhaps by modifying Nepomuk so that it indexes diacritic content both ways? As for slowing kmail, I'm sure users would prefer to have instant display of emails when selecting them (which is absolutely not the case with kmail 4.8.2) than to have to wait a little bit when using quick search on mails (which you don't do every minute, right?). Last but not least, this could be implemented at least for the quick filter on top of the folder list. The number of directories is much lower than the number of mails in a huge maildir folder; it shouldn't be noticeable. Please reopen that bug.

I have to reopen this issue. Maybe there are some questions for the Nepomuk developers, but closing the issue without trying is not a solution for me.

Hello Michal. Thank you for your report. Although I bet so, I will ask: can you reproduce this with Baloo-based KMail from KDEPIM 4.14.10, or even Akonadi Search (renamed Baloo) based KMail from KDEPIM 15.08? Feel free to reopen if you can. Also, I set this to wishlist since it is basically a feature request. Thank you and greetings from KDE Randa Meetings, Martin

Not sure about 4.14.10, but 4.14.2 still does not support this.

Thanks, Alexandre. Reopening as I think that's close enough.

Yes, this also occurs in Baloo. I am trying to contribute to Baloo and add some support for specific languages, but I think it will take a long time.

KMail 5.4.0: I cannot reproduce this in my setup. I have a bunch of messages in my Inbox that contain "Bestätigung". Filtering for "Bestätigung" lists only these messages (as it should), and filtering for "Besttigung" does not list any messages at all (as it should). Can anyone reproduce this bug in a more recent, Frameworks-based version of KMail? Please note that versions like 4.14.x (starting with "4.") have been unsupported for quite some time now.

Using v5.2.3, this bug is still present. With a message subject containing "Élévation", when you type 'elev' in the filter bar, this message is not shown. I couldn't find how to make the quick filter on top of the folder list appear again though.

> I cannot reproduce this in my setup. I have a bunch of messages in my Inbox that contain "Bestätigung". Filtering for "Bestätigung" lists only these messages (as it should) and filtering for "Besttigung" does not list any messages at all (as it should).
This issue is about ignoring accent in search. It means both your cases should list messages that contain "Bestätigung" and also "Besttigung".
So you have already reproduced this issue.
I understand that this issue is a little subjective, but e.g. the Slovak language contains a lot of diacritical marks and I really don't remember whether a mail was written with diacritical marks or without.

(In reply to Michal Hlavac from comment #25)
> This issue is about ignoring accent in search.

I re-read the request, and you are of course right. I'm sorry, I didn't even realize that this is a feature request. "Bestatigung" does not turn up results and I couldn't find corresponding options in the preferences, so this request is still valid as of 5.4.0. I also appended "Option to" to the title of this request.
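
The behaviour requested in this thread could look roughly like the sketch below. It is only an illustration under stated assumptions: the names `fold` and `quick_filter` and the `ignore_diacritics` switch are hypothetical and not part of KMail, Nepomuk, or Baloo, and the non-folding branch is assumed to be case-insensitive. Query and subject are folded the same way (NFD decomposition, combining marks removed, case folded) before a plain substring match.

```python
import unicodedata


def fold(text: str) -> str:
    """Fold text for matching: drop combining marks, then case-fold."""
    decomposed = unicodedata.normalize("NFD", text)
    stripped = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    return stripped.casefold()


def quick_filter(subjects, query, ignore_diacritics=True):
    """Return the subjects containing the query as a substring.

    With ignore_diacritics=True both sides are folded, so "elev"
    matches "Élévation". Hypothetical option name, for illustration.
    """
    if ignore_diacritics:
        needle = fold(query)
        return [s for s in subjects if needle in fold(s)]
    needle = query.casefold()
    return [s for s in subjects if needle in s.casefold()]


subjects = ["Bestätigung Ihrer Bestellung", "Élévation du terrain", "Meeting notes"]
print(quick_filter(subjects, "Bestatigung"))  # ['Bestätigung Ihrer Bestellung']
print(quick_filter(subjects, "elev"))         # ['Élévation du terrain']
print(quick_filter(subjects, "Bestatigung", ignore_diacritics=False))  # []
```

Keeping the switch as a parameter mirrors the "Option to" wording in the title: users who want exact accent matching can turn the folding off.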