Version: 2.2.0 (using KDE 4.3.4)
OS: Linux
Installed from: Ubuntu Packages

This is a follow-up to bug 200596. Since tag charset autodetection has been shown to be unreliable, I propose a hybrid solution that would satisfy most non-Latin users.

Suppose that in the "Edit Track Details" dialog, on the "Tags" panel, there is a menu button next to each editable text box. Clicking that button reveals a "recode from charset:" submenu. The submenu lists all kinds of encodings (like Firefox's "View"->"Character Encoding" submenu). Choosing an encoding performs a recode operation on the corresponding tag value. After that, the user can close the dialog using "Save & Close" or "Cancel", depending on whether he is satisfied with the result.

This way, you get basic tag charset handling functionality (which is currently nonexistent in Amarok and most open source media players, BTW). Note that you can gradually add more intelligence on top of that:

1) By reusing the charset detector code (I suppose), you can supply a list of the most probable encodings based on the original byte string and promote these encodings to the top level of the submenu, while all the other encodings would be buried deeper in a Firefox-style "More Encodings" sub-submenu.
2) I suppose you can plug in some neural network or a simple dynamic rules engine that would learn the user's previous encoding choices and promote the most probable encodings in the menu structure on subsequent invocations.

Not being a Qt/KDE developer, I cannot assess how hard that would be, but it sure sounds doable to me.
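As a rough illustration of points 1) and 2), here is a minimal Python sketch, not Amarok code. It replaces the charset detector with a much cruder test (does the byte string decode without errors?) and replaces the neural network or rules engine with a plain counter of past choices. The names `CANDIDATES`, `decodes_cleanly`, `build_menu` and `remember_choice` are all hypothetical, and the encoding shortlist is an arbitrary example.

```python
from collections import Counter

# Hypothetical shortlist; a real menu would list many more encodings.
CANDIDATES = ["utf-8", "cp1251", "koi8-r", "cp866", "iso8859-5", "cp1250"]


def decodes_cleanly(raw: bytes, encoding: str) -> bool:
    """Crude plausibility test: the bytes are valid in this encoding."""
    try:
        raw.decode(encoding)
        return True
    except UnicodeDecodeError:
        return False


def build_menu(raw: bytes, history: Counter, top_n: int = 3):
    """Split candidates into (top-level entries, "More Encodings" entries)."""
    viable = [enc for enc in CANDIDATES if decodes_cleanly(raw, enc)]
    # Most frequently chosen first; sorted() is stable, so ties keep list order.
    ranked = sorted(viable, key=lambda enc: -history[enc])
    return ranked[:top_n], ranked[top_n:]


def remember_choice(history: Counter, encoding: str) -> None:
    """Record what the user picked so it ranks higher next time."""
    history[encoding] += 1


history = Counter()
raw_title = "Привет".encode("cp1251")  # stand-in for the raw tag bytes
remember_choice(history, "koi8-r")
remember_choice(history, "cp1251")
remember_choice(history, "cp1251")
top, more = build_menu(raw_title, history, top_n=2)
```

Because single-byte encodings accept almost any byte string, the "decodes cleanly" test mostly just filters out UTF-8 here; a real detector would have to score the candidates, which is exactly the part that has proven unreliable on its own.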
I am not sure this is still useful, since there have been a lot of changes since 2.2.0, which seems to be the version you are using. I strongly suggest you upgrade to Amarok 2.2.2 and check again.
I'm running Amarok 2.2.2 and I have problems with Cyrillic tags in MP3s - e.g. try these: http://olo.org.pl/files/Acropolis_Demo/
Well, the only thing that seems to be correctly encoded is the file name; the tags seem to use a different encoding. I can't read the name tags or the lyrics with eyed3, kid3, or easytag, and my whole system is UTF-8. Please check that you are using the same encoding everywhere, preferably UTF-8 or UTF-16.
That is the problem this enhancement is intended to solve: not having to use a dedicated tool (like EasyTag) to recode the tags. In this case, the tags are encoded using Windows-1251. Knowing that, I'd like to be able to perform the operation from within Amarok.
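To make the requested operation concrete, here is a minimal Python sketch of a single "recode from charset" step. It assumes the player has decoded the raw tag bytes as Latin-1 (an assumption for illustration, not a statement about what Amarok actually does); `recode_tag` is a hypothetical name and the sample title is made up.

```python
def recode_tag(displayed: str, actual_charset: str,
               assumed_charset: str = "latin-1") -> str:
    """Undo a wrong decode: recover the raw bytes, then decode them properly."""
    raw = displayed.encode(assumed_charset)  # back to the original tag bytes
    return raw.decode(actual_charset)


# Simulate a Cyrillic title stored as Windows-1251 but read as Latin-1.
original = "Привет"
garbled = original.encode("cp1251").decode("latin-1")  # what the user sees
fixed = recode_tag(garbled, "cp1251")                  # what the menu entry would produce
```

Nothing would be written to the file until the user confirms with "Save & Close", so a wrong guess costs nothing.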
Also, after recoding the tags to UTF-8 using EasyTag, they display fine for most MP3s, with the exception of the first one - for some reason Amarok 2.2.2 displays garbage in the title despite it being correct UTF-8 (e.g. the QuodLibet player displays the title correctly). Here's that MP3 with tag recoded to UTF-8: http://olo.org.pl/files/Acropolis_Demo/utf-8/
I can't reproduce this. I retagged the 4 tracks you linked to earlier to UTF-8 myself, using easytag; my whole system is in UTF-8. After an update and an Amarok restart, the tags show the characters correctly. Using Amarok 2.2.3-git (the development build of a few minutes ago), Kubuntu 9.10, KDE SC 4.4 RC1.

As for your proposition: this should go to either a separate wish or to the mailing list amarok@kde.org, but keep in mind that Amarok is first of all a music player. It's highly unlikely it will become a mass tagger; there are enough tools for that available already.

Please check your LOCALE settings; those need to be *all* set to UTF-8.
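One quick way to run that locale check is a small Python sketch like the one below. It only looks at three common environment variables and does a simple substring match; `LOCALE_VARS` and `non_utf8_locale_vars` are hypothetical names, and unset variables are skipped rather than reported.

```python
import os

# Variables inspected by this sketch; a system may set other LC_* variables too.
LOCALE_VARS = ("LC_ALL", "LC_CTYPE", "LANG")


def non_utf8_locale_vars(env) -> dict:
    """Return the locale variables that are set but do not name a UTF-8 locale."""
    bad = {}
    for name in LOCALE_VARS:
        value = env.get(name)
        if not value:
            continue  # unset or empty: nothing to report
        lowered = value.lower()
        if "utf-8" not in lowered and "utf8" not in lowered:
            bad[name] = value
    return bad


# An empty dict means every variable that is set uses a UTF-8 locale.
print(non_utf8_locale_vars(os.environ))
```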
(In reply to comment #5)
> Also, after recoding the tags to UTF-8 using EasyTag, they display fine for
> most MP3s, with the exception of the first one - for some reason Amarok 2.2.2
> displays garbage in the title despite it being correct UTF-8 (e.g. the
> QuodLibet player displays the title correctly).

Make sure the charset detector is turned off. Settings -> Collection. It's off by default in 2.2.2 but if you toggled it on it could definitely cause that problem (which is why it's now off by default).
All locale env vars are set to UTF-8 variants AFAIR (cannot check right now, it's a home machine). I've also specifically verified that the charset detector had been turned off before testing. Also, the problem is with the one specific file - others are displayed correctly. That's why I've uploaded it to http://olo.org.pl/files/Acropolis_Demo/utf-8/ .
> As for your proposition: this should go to either a separate wish or to the
> mailing list amarok@kde.org,

This bug IS a separate wish. It even has Severity: wishlist. I don't understand why it has been marked as RESOLVED/FIXED...

The encoding problems mentioned in comment #2 are purely a digression, and it seems that I should have kept them to myself, since they have caused the discussion to drift away from the actual subject.
Created attachment 40009
Pic

But you're using the comments in #2 as proof of why this feature is needed, except that the problem is something local to your machine. For me, both the original file posted and the utf-8 versions have exactly the same result: what's in the picture attached.