118032 – enhance search to match chars with accents and similar stuff

Status:	RESOLVED FIXED

Alias:	None

Product:	amarok
Classification:	Applications
Component:	general
Version:	unspecified
Platform:	Fedora RPMs Linux

Importance:	NOR wishlist
Target Milestone:	---
Assignee:	Amarok Developers

Duplicates (3):	118034 142502 143758

Reported:	2005-12-09 21:38 UTC by Dovydas
Modified:	2009-08-03 11:37 UTC
CC List:	2 users


Description Dovydas 2005-12-09 21:38:59 UTC

Version:            (using KDE 3.4.3)
Installed from:    Fedora RPMs

I have a lot of songs with tags in several languages, and not all of them are perfectly tagged: the same artist's name is written in different ways. For example, some tags have the artist name "Stephane Pompougnac" and others have the same name written in French, "Stéphane Pompougnac". If I start typing "Ste" in the search field, I get only the "Stephane Pompougnac" results. If I start typing "Sté", I get only the "Stéphane Pompougnac" results.

I want a search for "Ste" to match both "Stephane Pompougnac" and "Stéphane Pompougnac".

Another example: I have songs whose artist tag is spelled in English, "Boris Grebenshchikov", and others with the same name in Cyrillic, "Борис Гребенщиков". I want to start typing "Boris" and have both tags match. Likewise, if I type "Bjork", I want both "Bjork" and "Björk" to be matched.

I want amaroK to match the other "e" variants as well, such as ę, ė, е (Cyrillic), э and é. For "o" I want it to match ô, ö, о (Cyrillic), ø and so on. I think this would be very useful for many of us.
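
For illustration only, the accent-insensitive part of this wish can be sketched in Python. This is not amaroK's code; the function names fold_accents and matches are made up for the example. It relies on Unicode NFKD decomposition, so it only covers letters that split into a base letter plus a combining mark (é, ę, ė, ö, ô). Letters such as ø and the Cyrillic е do not decompose that way and would need an explicit table.

```python
import unicodedata

def fold_accents(text: str) -> str:
    """Decompose the text, drop combining marks, then case-fold it."""
    decomposed = unicodedata.normalize("NFKD", text)
    stripped = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    return stripped.casefold()

def matches(query: str, tag: str) -> bool:
    """True if the folded query occurs anywhere in the folded tag."""
    return fold_accents(query) in fold_accents(tag)

if __name__ == "__main__":
    for tag in ("Stephane Pompougnac", "Stéphane Pompougnac", "Björk", "Bejork"):
        print(tag, matches("Ste", tag), matches("Bjork", tag))
```

Because both the query and the tag are folded the same way, typing "Sté" would also find "Stephane Pompougnac".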

Comment 1 Mark Kretschmann 2005-12-09 22:17:47 UTC


*** This bug has been marked as a duplicate of 116334 ***

Comment 2 Alexandre Oliveira 2005-12-09 22:23:16 UTC

*** Bug 118034 has been marked as a duplicate of this bug. ***

Comment 3 Dovydas 2005-12-10 13:16:44 UTC

This bug is slightly different from bug 116334. My wish is that certain characters match certain other characters. Let's say e matches é and ę, but you cannot leave out characters when you type. If you type "Bjork", I want to match just "Bjork" and "Björk", but not "Bejork", "bjrk" or "bjorl". Fuzzy search, which tolerates mistyped or missing characters, is a different thing.

Comment 4 Isaiah Damron 2005-12-11 21:40:26 UTC

I agree, this is different from bug 116334. Fuzzy searching would probably solve this problem as long as the name does not contain too many accented characters, but if fuzzy searching is decided against, this wish could still be implemented.

Comment 5 Dovydas 2005-12-11 22:15:35 UTC

Fuzzy search (bug 116334) will not help with Latin-Cyrillic substitution. For the artist names "Grebenschikov" and "Гребенщиков", every character counts as “mistyped”, and it is hard to believe that a name with 11 or 13 mistyped characters would match. With _enhanced_search_ both names match; we just need to assume that the character pairs G-Г, R-Р, B-Б, N-Н match each other.

In general we just need to define which characters match each other, like
L(Latin)-Ļ(Latvian)-Ł(Polish)-Л(Cyrillic)-Λ(Greek)
Α(Greek)-A(Latin)-А(Cyrillic)-Ā(Latvian)-ÄÅÆ(Swedish)
and so on. I hope this is much easier to implement than fuzzy search.
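
A minimal Python sketch of such an equivalence table, for illustration only. The names EQUIVALENTS, fold_equivalents and enhanced_match are hypothetical, and the groups are a small sample: the first two come from the lists above, the pairs G-Г, R-Р, B-Б, N-Н from this comment, and the last three pairs are assumed just to make the example work. Lookalike letters are written as escapes so the scripts cannot be confused. A one-to-one table like this cannot express щ versus "shch"; that would need multi-character sequences.

```python
# Each string is one group of characters treated as equal.
# Every member folds to the first character of its group.
EQUIVALENTS = [
    "l\u013c\u0142\u043b\u03bb",              # l, Latvian, Polish, Cyrillic el, Greek lambda
    "a\u0430\u03b1\u0101\u00e4\u00e5\u00e6",  # a, Cyrillic a, Greek alpha, a-macron, Swedish
    "g\u0433", "r\u0440", "b\u0431", "n\u043d",  # Latin-Cyrillic pairs from the comment
    "o\u043e", "i\u0438", "s\u0441",             # extra pairs assumed for this example
]

# Lookup table: member character -> representative of its group.
FOLD = {member: group[0] for group in EQUIVALENTS for member in group}

def fold_equivalents(text: str) -> str:
    """Case-fold the text, then replace each character by its representative."""
    return "".join(FOLD.get(ch, ch) for ch in text.casefold())

def enhanced_match(query: str, tag: str) -> bool:
    """Substring match on folded text; no characters may be skipped or mistyped."""
    return fold_equivalents(query) in fold_equivalents(tag)

if __name__ == "__main__":
    cyrillic_boris = "\u0411\u043e\u0440\u0438\u0441"  # "Boris" in Cyrillic
    print(enhanced_match("Boris", cyrillic_boris))
    print(enhanced_match("Bjork", "bjrk"))
```

Unlike fuzzy search, this stays an exact match after folding, so "bjrk" or "bjorl" still would not match "Bjork".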

Comment 6 Dovydas 2006-01-14 05:52:48 UTC

It seems that fuzzy search is already implemented, while international search is still decided against.

Comment 7 arvind 2006-09-10 00:18:34 UTC

Have there been any updates on this? International search still does not seem to work in 1.4.2, and it can be rather frustrating to be able to search for 'bjork' but have to scroll through a huge collection to find 'björk'.

Comment 8 richlv 2007-02-14 13:51:41 UTC

Which database backend are you using?
With MySQL, characters with diacritics are considered the same as the 'plain' ones (like s & š).

As for doing that for Cyrillic, I doubt that would happen.

Comment 9 Dovydas 2007-02-14 14:43:50 UTC

I use SQLite, so s and š are different. I didn't know this depends on the database.

At present, for Cyrillic even the uppercase and lowercase forms of the same letter are considered different letters. It would be very nice to have that fixed.
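
As a rough illustration of how an SQLite backend could behave more like the MySQL one, the sketch below registers a folding function with SQLite from Python's sqlite3 module. It is an assumption-laden example, not how amaroK does it: the table name artist, the SQL function name fold and the helper search_artists are all invented here. SQLite's built-in case-insensitive comparison only covers ASCII letters, which is why Cyrillic upper and lower case are treated as different; folding in application code sidesteps that.

```python
import sqlite3
import unicodedata

def fold(text):
    """Strip combining marks and case-fold; None is passed through for SQL NULL."""
    if text is None:
        return None
    decomposed = unicodedata.normalize("NFKD", text)
    return "".join(c for c in decomposed if not unicodedata.combining(c)).casefold()

def search_artists(conn, query):
    """Return artist names whose folded form contains the folded query."""
    rows = conn.execute(
        "SELECT name FROM artist WHERE instr(fold(name), fold(?)) > 0 ORDER BY name",
        (query,),
    )
    return [row[0] for row in rows]

# In-memory database with a hypothetical one-column artist table.
conn = sqlite3.connect(":memory:")
conn.create_function("fold", 1, fold)  # expose fold() to SQL
conn.execute("CREATE TABLE artist (name TEXT)")
conn.executemany(
    "INSERT INTO artist VALUES (?)",
    [("Stephane Pompougnac",), ("Stéphane Pompougnac",), ("Борис Гребенщиков",)],
)

if __name__ == "__main__":
    print(search_artists(conn, "ste"))
    print(search_artists(conn, "борис"))
```

Calling a Python function for every row prevents SQLite from using an index, so a real collection would more likely store a pre-folded column next to the original name.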

Comment 10 Mark Kretschmann 2007-02-14 14:56:54 UTC

Uppercase/lowercase is already fixed in 1.4.5.

Comment 11 shattered 2007-02-20 20:04:04 UTC

last.fm 'knows' artists' proper names (in their native language): if your J-Pop tracks are tagged in English, they will appear in Kanji on your last.fm playlist. Maybe this information is available externally?

Comment 12 Kevin Funk 2007-04-02 23:14:33 UTC

*** Bug 143758 has been marked as a duplicate of this bug. ***

Comment 13 Seb Ruiz 2007-07-25 00:53:02 UTC

*** Bug 142502 has been marked as a duplicate of this bug. ***

Comment 14 Jeffrey 2008-01-10 02:20:47 UTC

*** This bug has been confirmed by popular vote. ***

Comment 15 Myriam Schweingruber 2009-08-03 11:37:33 UTC

This is available in Amarok 2.