Bug 116334 - Fuzzy search
Summary: Fuzzy search
Status: RESOLVED UNMAINTAINED
Alias: None
Product: amarok
Classification: Applications
Component: general
Version: 1.3.1
Platform: Ubuntu Linux
Importance: NOR wishlist
Target Milestone: ---
Assignee: Amarok Developers
URL:
Keywords:
Duplicates: 130337 154601
Depends on:
Blocks:
 
Reported: 2005-11-14 11:38 UTC by Volker
Modified: 2009-08-03 04:02 UTC

See Also:
Latest Commit:
Version Fixed In:


Description Volker 2005-11-14 11:38:55 UTC
Version:           1.3.1 (using KDE 3.4.3)
Installed from:    Ubuntu Packages
OS:                Linux

Amarok is already a great application, but I miss an outstanding feature included in 'YAMMI', another music manager application for Linux/Qt/KDE. YAMMI has a fuzzy search that lets you enter a few characters in the search field and finds the song from only half the name of the band or title. If you misspell something, you still get a list of similar items, which is very useful if your MP3s are not well organized and don't contain correct tag information. I would really appreciate seeing this feature in Amarok. As YAMMI (http://yammi.sourceforge.net) is also open source software, you should be able to take some parts of this search and include them in Amarok, or at least get an idea of what has to be implemented for a fuzzy search. In any case, thanks a lot for your efforts!
Comment 1 Mark Kretschmann 2005-12-09 22:17:48 UTC
*** Bug 118032 has been marked as a duplicate of this bug. ***
Comment 2 Oliver Nölle 2006-03-06 20:11:11 UTC
*** This bug has been confirmed by popular vote. ***
Comment 3 Seb Ruiz 2006-07-06 06:37:31 UTC
*** Bug 130337 has been marked as a duplicate of this bug. ***
Comment 4 Robert 2007-02-17 18:31:01 UTC
Oliver: maybe you could give a short explanation how you implemented this in Yammi?

I strongly second this feature request. This was always one of the coolest features in Yammi.
Comment 5 Robert 2007-02-17 18:35:24 UTC
Take a look at this screenshot ( http://yammi.sourceforge.net/pics/search.png ) to see how nicely this works in Yammi.
Comment 6 Robert Kaiser 2007-03-29 23:54:59 UTC
From glancing at filenames in Yammi source, I'd guess that http://yammi.cvs.sourceforge.net/yammi/kyammi/src/fuzzsrch.cpp?revision=1.1.1.1&view=markup is where the source can be found.

Unfortunately, maintenance of Yammi has officially been ended now, but the code is GPL as stated in the COPYING doc in the kyammi/ dir of its source tree, so it's legally no problem to reuse that code.

I hope it will be easy enough code-wise but from the Yammi end-of-life announcement, it looks like Oliver would be happy to help anyone who wants to port his code to amaroK (he ponders doing it himself if he has time, but given how much time he had for Yammi in the last two years, I wouldn't count on that).

I loved Yammi, and with that feature, I'd feel that I should really try amaroK ;-)
Comment 7 Robert 2007-03-30 00:24:49 UTC
A fundamental difference between yammi and amarok is that yammi keeps information about the objects in its media library in its own in-memory data structure, while amarok keeps it in an SQL database. That means that amarok is probably limited to SQL when it comes to querying song information. And in SQL, doing a fuzzy search is impossible, as far as I can tell.

Am I right?

By the way, the documentation in yammi's fuzzsrch.h ( http://yammi.cvs.sourceforge.net/yammi/kyammi/src/fuzzsrch.h?revision=1.1.1.1&view=markup ) is in German. I could translate it if an amarok developer needs it in English.
Comment 8 Oliver Nölle 2007-03-30 09:04:21 UTC
Memory/SQL: you are right, so my approach to bringing fuzzy search to Amarok was to get ALL searchable strings out of the database via an appropriate SQL query, and then work on these strings in memory (where you could easily reuse the algorithm from yammi). It does work, but the performance might suffer a bit (I think the search took 2-3 seconds on a 3000-song collection, but it wasn't optimized at all).

So I think it would be still possible, but on top of sql and not using any fuzzy sql statement...and performance might be an issue...
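
The approach described in this comment (one plain query to fetch every searchable string, then fuzzy matching in memory) might look roughly like the sketch below. Everything specific in it is an assumption for illustration: the `tracks(id, artist, title)` schema is hypothetical and not Amarok's real one, and the scoring uses Python's `difflib.SequenceMatcher` as a stand-in, not Yammi's actual algorithm.

```python
import sqlite3
from difflib import SequenceMatcher


def load_searchable_strings(conn):
    """Fetch every searchable string with one ordinary SELECT (no fuzzy SQL)."""
    # Hypothetical schema: tracks(id, artist, title).
    return conn.execute("SELECT id, artist, title FROM tracks").fetchall()


def word_score(needle, text):
    """Best similarity (0.0 to 1.0) between the needle and any word in text."""
    words = text.lower().split()
    return max(
        (SequenceMatcher(None, needle, w).ratio() for w in words),
        default=0.0,
    )


def fuzzy_search(conn, query, cutoff=0.7):
    """Rank all tracks in memory; return (score, id, artist, title) tuples."""
    needle = query.lower()
    hits = []
    for track_id, artist, title in load_searchable_strings(conn):
        score = max(word_score(needle, artist), word_score(needle, title))
        if score >= cutoff:
            hits.append((score, track_id, artist, title))
    hits.sort(key=lambda h: h[0], reverse=True)
    return hits


def demo_connection():
    """Small in-memory database standing in for a real collection."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE tracks (id INTEGER PRIMARY KEY, artist TEXT, title TEXT)"
    )
    conn.executemany(
        "INSERT INTO tracks (artist, title) VALUES (?, ?)",
        [
            ("The Beatles", "Yesterday"),
            ("Portishead", "Roads"),
            ("Radiohead", "Creep"),
        ],
    )
    return conn
```

With the demo data, `fuzzy_search(demo_connection(), "beatls")` finds the Beatles track despite the misspelling. Note that every query scans the whole result set, which is exactly the cost the performance concern in this comment refers to.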
Comment 9 Mark Kretschmann 2007-03-30 09:09:56 UTC
Yes, performance would be the main problem. Some Amarok users have *very* big collections (e.g. 50,000 tracks), and then you're looking at a search time of > 10s, which isn't really acceptable any more.
Comment 10 Robert 2007-03-30 11:04:59 UTC
Anyway, this feature is just too nice to simply ignore it :) Would there be any other way to get this into amarok? Could amarok e.g. read the whole library from the sql database into memory when it is started, and keep it there in order to make fuzzy search possible?
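
One way to picture the idea of loading the library once at startup and keeping it in memory is a trigram index, sketched below. This is an illustration under stated assumptions, not how Amarok or Yammi works: the `InMemoryIndex` class, its `(track_id, text)` input format, and the trigram-overlap score are all hypothetical choices. The point is only that an index built once lets a query touch just the entries sharing at least one trigram with it, instead of rescanning the whole collection.

```python
from collections import defaultdict


def trigrams(text):
    """Set of 3-character substrings of the padded, lower-cased text."""
    padded = "  " + text.lower() + " "
    return {padded[i:i + 3] for i in range(len(padded) - 2)}


class InMemoryIndex:
    """Hypothetical index built once at startup and kept in memory."""

    def __init__(self, entries):
        # entries: iterable of (track_id, searchable_text) pairs,
        # e.g. the rows read from the SQL database at startup.
        self.texts = dict(entries)
        self.postings = defaultdict(set)  # trigram -> ids containing it
        for track_id, text in self.texts.items():
            for gram in trigrams(text):
                self.postings[gram].add(track_id)

    def search(self, query, limit=200):
        """Return up to `limit` (score, text) pairs, best first."""
        query_grams = trigrams(query)
        shared = defaultdict(int)
        # Only entries sharing at least one trigram are ever touched.
        for gram in query_grams:
            for track_id in self.postings.get(gram, ()):
                shared[track_id] += 1
        # Score: fraction of the query's trigrams present in the entry.
        scored = [(n / len(query_grams), tid) for tid, n in shared.items()]
        scored.sort(key=lambda s: (-s[0], s[1]))
        return [(score, self.texts[tid]) for score, tid in scored[:limit]]
```

The trade-off is the one discussed in this thread: memory use and startup time grow with the collection, in exchange for queries that no longer depend on a full scan.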
Comment 11 Volker 2007-03-30 11:15:12 UTC
I do agree - even without proper sql support there should be a possibility to use this feature. Maybe set an option with a big annotation "fuzzy search not recommended for slow hardware/big collections etc."?
Comment 12 Marcel 2007-04-01 20:33:21 UTC
No fuzzy search in Amarok 2.0? :(
Comment 13 Robert 2007-07-06 17:10:30 UTC
By the way, something very important (to me) is affected by this:

Today I wanted to use amarok's search to find an artist named "Drøn". Of course, I don't have the "ø" on my keyboard. So what did I do? I tried to search for "dron", which didn't give me any results. I think this might be a problem for many people who listen to music from countries whose alphabets include non-ASCII characters (Icelandic, e.g. "Múm"). Or what about those poor people (e.g. American...) who don't even know that characters exist other than the ASCII ones on their keyboard?
Granted, this problem could be solved in other ways as well (like mapping ú -> u, ø to o), but it would also be solved by having fuzzy search.
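
The character-mapping alternative mentioned here (ú -> u, ø -> o) can be sketched as follows. It is a minimal illustration, not Amarok's code: `fold`, `matches`, and the `EXTRA_FOLDS` table are hypothetical names, and the table is deliberately partial. Unicode decomposition handles letters such as "ú", but "ø" has no decomposition, so it needs an explicit entry.

```python
import unicodedata

# Letters without a Unicode decomposition need explicit entries.
# Partial table, for illustration only.
EXTRA_FOLDS = str.maketrans({
    "ø": "o", "Ø": "O",
    "ł": "l", "Ł": "L",
    "đ": "d", "Đ": "D",
    "æ": "ae", "Æ": "AE",
    "ß": "ss",
})


def fold(text):
    """Map accented letters to plain ones and normalise case."""
    # NFKD splits e.g. "ú" into "u" plus a combining acute accent.
    decomposed = unicodedata.normalize("NFKD", text.translate(EXTRA_FOLDS))
    # Drop the combining marks, keeping only the base characters.
    stripped = "".join(c for c in decomposed if not unicodedata.combining(c))
    return stripped.casefold()


def matches(query, text):
    """Substring match that ignores accents and case on both sides."""
    return fold(query) in fold(text)
```

With this, `matches("dron", "Drøn")` and `matches("mum", "Múm")` both succeed, while the stored tags stay untouched because folding is applied only at comparison time.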
Comment 14 Robert 2007-07-06 18:41:13 UTC
In response to comment #9 (search scalability):

I just did some "benchmarks" using the latest version of yammi and amarok 1.4.6 (with SQLite engine).

I have about 25000 files in my collection (about 150GB). My laptop has a 1.3GHz Pentium M (1st generation, 4 years old, single core) and a 5400 rpm drive.

Application startup time (measured from running application until the user interface has completely loaded and is usable):
yammi: 15s
amarok: 10s

Search time (measured from the point where something is pasted into the search input until the search results become available):
yammi: ca. 1s
amarok: ca. 3.5s

So it seems that yammi's search method is really not that bad... Of course it is a question whether yammi and amarok are really comparable in this simple manner, mostly because amarok seems to do a lot of snazzy stuff with the user interface. Then again, Oliver said that yammi's search is not really optimized at all, so it could even be sped up. I also don't know whether 25000 files is enough to say something about scalability. Is there somebody who has a lot more files than that?
Comment 15 Robert 2007-07-06 18:49:23 UTC
By the way, for the benchmark (comment #14) I chose search words that gave me not too many results, as yammi returns at most the 200 best matches, whereas amarok returns all matches. My search words returned about 15-60 results in amarok (and always nearly or exactly 200 in yammi).
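
Capping the output at the 200 best matches, as yammi is described as doing here, does not require sorting every scored candidate. A small sketch, assuming the candidates arrive as `(score, item)` pairs (a hypothetical format, as is the name `best_matches`):

```python
import heapq


def best_matches(scored, limit=200):
    """Return the `limit` highest-scoring (score, item) pairs, best first."""
    # nlargest keeps only `limit` candidates at a time, so a huge
    # candidate list is never sorted in full.
    return heapq.nlargest(limit, scored, key=lambda pair: pair[0])
```

This difference in result-set size is also why the two search times above are not directly comparable.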
Comment 16 Dovydas 2007-07-06 20:46:43 UTC
Robert wrote:
> Granted, this problem could be solved in other ways as well (like mapping ú -> u, ø to o), but it would also be solved by having fuzzy search. 

Fuzzy search would solve the problem of one accented character, but it would not solve the problem if there are too many accented characters, or if the script is completely different (like Cyrillic, Greek, Arabic or any other non-Latin script). There is another enhancement request for this issue: http://bugs.kde.org/show_bug.cgi?id=118032
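
The limitation can be made concrete with edit distance, assuming a fuzzy search that tolerates only a small number of character edits (an assumption for illustration; the thread does not say how Yammi scores matches). The function below is the standard Levenshtein distance.

```python
def levenshtein(a, b):
    """Number of single-character edits needed to turn a into b."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        cur = [i]
        for j, cb in enumerate(b, start=1):
            cur.append(min(
                prev[j] + 1,               # delete ca
                cur[j - 1] + 1,            # insert cb
                prev[j - 1] + (ca != cb),  # substitute, free if equal
            ))
        prev = cur
    return prev[-1]


one_accent = levenshtein("dron", "drøn")        # one character differs
other_script = levenshtein("moskva", "москва")  # every character differs
```

Here `one_accent` is 1, well within a typical tolerance, while `other_script` is 6, the full length of the word, so no edit-distance threshold short of "match anything" would connect the two spellings.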
Comment 17 Seb Ruiz 2007-12-25 22:03:28 UTC
*** Bug 154601 has been marked as a duplicate of this bug. ***
Comment 18 Pablo Montepagano 2008-06-12 18:17:39 UTC
I think that this bug has been partially solved. At least in my box now when I type "giran", I get those tracks which have "giran", AND "girán". Thanks for the upgrade! (I'm using 1.4.9.1)
Comment 19 Matt Rogers 2009-08-03 04:02:19 UTC
Thank you for taking the time to report this bug for Amarok. Amarok 1.4 is now unmaintained and will no longer see any improvements. Because of this, and the massive amount of changes Amarok has undergone throughout the 2.x series of releases, we are closing bugs that no longer apply to the 2.x series due to changes in functionality, the underlying architecture, or a conflict with the vision of Amarok 2.

We appreciate the time you took to provide feedback about Amarok 1.4 and look forward to any feedback you may provide about Amarok 2. Thanks.