Bug 118032 - enhance search to match chars with accents and similar stuff
Summary: enhance search to match chars with accents and similar stuff
Status: RESOLVED FIXED
Alias: None
Product: amarok
Classification: Applications
Component: general
Version: unspecified
Platform: Fedora RPMs Linux
Importance: NOR wishlist
Target Milestone: ---
Assignee: Amarok Developers
URL:
Keywords:
Duplicates: 118034 142502 143758
Depends on:
Blocks:
 
Reported: 2005-12-09 21:38 UTC by Dovydas
Modified: 2009-08-03 11:37 UTC
CC List: 2 users

See Also:
Latest Commit:
Version Fixed In:


Description Dovydas 2005-12-09 21:38:59 UTC
Version:            (using KDE 3.4.3)
Installed from:    Fedora RPMs

I have a lot of songs with tags in several languages, and not all of them are perfectly tagged. Tags for the same artist are written in different ways: some have the artist name "Stephane Pompougnac", others have the same name written in French, "Stéphane Pompougnac". If I start typing "Ste" in the search field, I get results for "Stephane Pompougnac" only. If I start typing "Sté", I get results for "Stéphane Pompougnac" only.

I want a search for "Ste" to match both "Stephane Pompougnac" and "Stéphane Pompougnac".

Another example: I have songs whose artist tag is spelled in English, "Boris Grebenshchikov", and others with the same artist name in Cyrillic, "Борис Гребенщиков". When I start typing "Boris", I want both tags to match, "Boris Grebenshchikov" and "Борис Гребенщиков". If I type "Bjork", I want both "Bjork" and "Björk" to match.

I want amaroK to match other "e" variants such as ę, ė, е (Cyrillic), э, and é as well. For "o" I want it to match ô, ö, о (Cyrillic), ø, and so on. I think this would be very useful for many of us.
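
To illustrate the accent part of this request, here is a minimal Python sketch (not Amarok's code; the names `fold` and `matches` are made up for illustration). It assumes that Unicode decomposition plus removal of combining marks is an acceptable way to treat é, ę, ė, ô, and ö as their base letters.

```python
import unicodedata


def fold(text: str) -> str:
    """Return a comparison key: decomposed, combining marks removed, case-folded."""
    decomposed = unicodedata.normalize("NFKD", text)
    stripped = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    return stripped.casefold()


def matches(query: str, tag: str) -> bool:
    """True if the folded query occurs anywhere in the folded tag."""
    return fold(query) in fold(tag)
```

With this sketch, `matches("Ste", ...)` holds for both "Stephane Pompougnac" and "Stéphane Pompougnac", and "Bjork" matches "Björk". Characters without a Unicode decomposition, such as ø or the Cyrillic е, are left unchanged and would need an explicit mapping table.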
Comment 1 Mark Kretschmann 2005-12-09 22:17:47 UTC

*** This bug has been marked as a duplicate of 116334 ***
Comment 2 Alexandre Oliveira 2005-12-09 22:23:16 UTC
*** Bug 118034 has been marked as a duplicate of this bug. ***
Comment 3 Dovydas 2005-12-10 13:16:44 UTC
This bug is slightly different from bug 116334. My wish is that certain characters match certain other characters: say, e matches é and ę. But you cannot omit characters when you type. If you type "Bjork", I want it to match just "Bjork" and "Björk", not "Bejork", "bjrk", or "bjorl". Fuzzy search, which tolerates mistyped or missing characters, is a different thing.
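
The difference can be sketched in Python as follows. This is only an illustration: `difflib` stands in for a generic fuzzy search and is not what Amarok uses, and the function names and the 0.75 threshold are arbitrary assumptions.

```python
import difflib
import unicodedata


def fold(text: str) -> str:
    """Strip combining marks and case so that 'ö' compares equal to 'o'."""
    decomposed = unicodedata.normalize("NFKD", text)
    return "".join(
        ch for ch in decomposed if not unicodedata.combining(ch)
    ).casefold()


def exact_folded_match(query: str, tag: str) -> bool:
    """Character-equivalence search: every typed character must be present."""
    return fold(query) in fold(tag)


def fuzzy_match(query: str, tag: str, threshold: float = 0.75) -> bool:
    """Similarity-based stand-in for fuzzy search (threshold is an assumption)."""
    ratio = difflib.SequenceMatcher(None, fold(query), fold(tag)).ratio()
    return ratio >= threshold
```

Under these assumptions, `exact_folded_match("Bjork", "Björk")` succeeds while "Bejork", "bjrk", and "bjorl" are rejected, whereas `fuzzy_match` accepts all of them.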
Comment 4 Isaiah Damron 2005-12-11 21:40:26 UTC
I agree, this is different from bug 116334.  Fuzzy searching would probably solve this problem assuming you don't have too many accented characters in the name, but if fuzzy searching is decided against, then this wish could still be implemented.
Comment 5 Dovydas 2005-12-11 22:15:35 UTC
The fuzzy search from bug 116334 will not help with Latin-Cyrillic substitution. For the artist names "Grebenschikov" and "Гребенщиков", every character counts as "mistyped", and it is hard to believe that 11 or 13 mistyped characters would still match. With the _enhanced_search_ proposed here, both names match; we just need to assume that character pairs such as G-Г, R-Р, B-Б, and N-Н match each other.

In general, we just need to define which character sequences match, like
L(Latin)-Ļ(Latvian)-Ł(Polish)-Л(Cyrillic)-Λ(Greek)
Α(Greek)-A(Latin)-А(Cyrillic)-Ā(Latvian)-ÄÅÆ(Swedish)
and so on. I hope this is much easier to implement than fuzzy search.
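
A rough Python sketch of such a table, using only the character groups listed in this comment. The names `CLASSES`, `TABLE`, and `class_key` are hypothetical, and choosing the first member of each group as its representative is an arbitrary assumption.

```python
# Equivalence classes from the comment above. Escapes are used so that
# look-alike letters from different scripts cannot be confused.
CLASSES = [
    "L\u013b\u0141\u041b\u039b",              # L, Latvian, Polish, Cyrillic, Greek
    "A\u0391\u0410\u0100\u00c4\u00c5\u00c6",  # A, Greek, Cyrillic, Latvian, Swedish
    "G\u0413",                                # G and Cyrillic Ghe
    "R\u0420",                                # R and Cyrillic Er
    "B\u0411",                                # B and Cyrillic Be
    "N\u041d",                                # N and Cyrillic En
]

# Map the lowercase form of every member to the lowercase first member.
TABLE = {member.lower(): cls[0].lower() for cls in CLASSES for member in cls}


def class_key(text: str) -> str:
    """Replace each character by the representative of its class, if any."""
    return "".join(TABLE.get(ch, ch) for ch in text.lower())
```

Two strings then match when their keys are equal, or when one key contains the other. A one-to-one table like this cannot handle щ, which corresponds to several Latin letters ("shch" or "sch"), so matching "Гребенщиков" in full would need a multi-character transliteration step on top.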
Comment 6 Dovydas 2006-01-14 05:52:48 UTC
It seems that fuzzy search is already implemented, and international search is still decided against.
Comment 7 arvind 2006-09-10 00:18:34 UTC
have there been any updates on this? international search still does not seem to work in 1.4.2, and it can be rather frustrating to be able to search for 'bjork' but have to scroll down a huge collection to find 'björk'
Comment 8 richlv 2007-02-14 13:51:41 UTC
which database backend are you using?
with mysql, diacritic symbols are considered the same as 'plain' ones (like s & š).

as for doing that for cyrillic, i doubt that would happen.
Comment 9 Dovydas 2007-02-14 14:43:50 UTC
I use sqlite so s-š are different. I didn't know this depends on database.

At present, for Cyrillic even the uppercase and lowercase forms of the same letter are considered different letters. It would be very nice to have that fixed.
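
For the SQLite case, one possible workaround is to register a folding function with the database and compare folded values. The sketch below uses Python's `sqlite3` module purely for illustration; it is not Amarok's code, and the `tags` table, the `artist` column, and the helper names are assumptions.

```python
import sqlite3
import unicodedata


def fold(text: str) -> str:
    """Strip combining marks and fold case, including non-ASCII letters."""
    decomposed = unicodedata.normalize("NFKD", text)
    return "".join(
        ch for ch in decomposed if not unicodedata.combining(ch)
    ).casefold()


def make_demo_db() -> sqlite3.Connection:
    """Build an in-memory database with a hypothetical 'tags' table."""
    conn = sqlite3.connect(":memory:")
    conn.create_function("fold", 1, fold)  # expose fold() to SQL
    conn.execute("CREATE TABLE tags (artist TEXT)")
    artists = [
        "Stephane Pompougnac",
        "Stéphane Pompougnac",
        "Борис Гребенщиков",
        "Björk",
    ]
    conn.executemany("INSERT INTO tags VALUES (?)", [(a,) for a in artists])
    return conn


def search(conn: sqlite3.Connection, query: str) -> list:
    """Return artists whose folded name contains the folded query."""
    rows = conn.execute(
        "SELECT artist FROM tags WHERE instr(fold(artist), ?) > 0",
        (fold(query),),
    )
    return [row[0] for row in rows]
```

Here `search(conn, "Ste")` returns both spellings of Pompougnac, and an uppercase Cyrillic query finds the mixed-case Cyrillic tag. Calling a user function on every row prevents index use, so a real implementation would more likely store a pre-folded column.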
Comment 10 Mark Kretschmann 2007-02-14 14:56:54 UTC
Uppercase/lowercase is already fixed in 1.4.5.
Comment 11 shattered 2007-02-20 20:04:04 UTC
last.fm 'knows' proper artists' names (in their native language) -- if your J-Pop tracks are tagged in English, they will appear in Kanji on your last.fm playlist.  Maybe this information is available externally?
Comment 12 Kevin Funk 2007-04-02 23:14:33 UTC
*** Bug 143758 has been marked as a duplicate of this bug. ***
Comment 13 Seb Ruiz 2007-07-25 00:53:02 UTC
*** Bug 142502 has been marked as a duplicate of this bug. ***
Comment 14 Jeffrey 2008-01-10 02:20:47 UTC
*** This bug has been confirmed by popular vote. ***
Comment 15 Myriam Schweingruber 2009-08-03 11:37:33 UTC
This is available in Amarok 2.