Bug 127643

Summary: Amarok 1.4.0 has no Cyrillic support
Product: [Applications] amarok Reporter: plamen <plspace>
Component: general Assignee: Amarok Developers <amarok-bugs-dist>
Status: CLOSED NOT A BUG    
Severity: normal CC: hsanson, kde-bugs, mkubicki, myriam, roman.cheplyaka, web.hijacker
Priority: NOR    
Version: 1.4.0   
Target Milestone: ---   
Platform: Slackware   
OS: Linux   

Description plamen 2006-05-19 09:20:56 UTC
Version:           1.4.0 (using KDE 3.5.0)
Installed from:    Slackware Packages
OS:                Linux

In version 1.3.x of Amarok there was an option for using the CP-1251 encoding for tags and so on. I can't find it in the new 1.4.0 version, so when I play songs with Cyrillic ID3 tags the text is unreadable.
How can I tell Amarok to use the CP-1251 encoding?
Comment 1 Mark Kretschmann 2006-05-19 13:15:08 UTC
Tag recoding was removed. Instead, you're supposed to convert your tags to ID3-V2, which supports UTF-8. There are tools for automating this, e.g. EasyTag.
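
For readers wondering what "unreadable" means here, a minimal Python sketch (not Amarok's code) of the usual failure: tag bytes written as CP-1251 but decoded as ISO-8859-1 come out as accented Latin letters. The sample title and both function names are made up for the illustration.

```python
def misread_as_latin1(text: str, real_encoding: str = "cp1251") -> str:
    """Show what a reader sees when it decodes real_encoding bytes as ISO-8859-1."""
    return text.encode(real_encoding).decode("latin-1")


def recover(garbled: str, real_encoding: str = "cp1251") -> str:
    """Undo the misreading: get the raw bytes back, then decode them correctly."""
    return garbled.encode("latin-1").decode(real_encoding)


title = "Кино"  # hypothetical tag value, not taken from this report
garbled = misread_as_latin1(title)
print(garbled)           # Êèíî
print(recover(garbled))  # Кино
```

The round trip only works because ISO-8859-1 maps every byte to a character, so no information is lost by the wrong decode.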
Comment 2 Alexandre Oliveira 2006-05-21 16:40:12 UTC
*** Bug 127744 has been marked as a duplicate of this bug. ***
Comment 3 Alexandre Oliveira 2006-05-21 16:40:22 UTC
*** Bug 127767 has been marked as a duplicate of this bug. ***
Comment 4 Vasileios P. Lourdas 2006-05-21 16:45:28 UTC
So we shouldn't expect this feature to come back in Amarok 1.4, right?
Comment 5 Roman Cheplyaka 2006-05-21 17:00:10 UTC
It's really inconvenient. Lots of people prefer to keep their music collections on CD, and you are suggesting they rewrite all those dozens of discs just to stay with amaroK.

Can you explain the reasons that led you to remove this feature?
Comment 6 Ian Monroe 2006-05-21 18:14:56 UTC
It would do weird things like recode id3v2 into id3v1, which would truncate fields, users would turn it on without knowing what it was and get garbage tags, etc.
Comment 7 Shane King 2006-05-22 09:07:52 UTC
*** Bug 127802 has been marked as a duplicate of this bug. ***
Comment 8 Horacio Sanson 2006-05-22 09:57:23 UTC
Are there any other tips on how to convert my id3 tags to work with amarok? easytag does not seem to understand my tags at all (I see only garbage) and I cannot see any convert options or the like.

All my tags are in UTF-8 and id3info displays them correctly... is there an easy way to make them work with amarok 1.4.0 or am I doomed to stay on the 1.3.x branch?
Comment 9 Roman Cheplyaka 2006-05-22 14:56:07 UTC
This is offtopic, but since this probably will interest almost everyone who is interested in this "bug", I think we can discuss it here.
As for easytag, you can specify the encoding of your id3v1 tags in the preferences. But I can't figure out how to make it convert between id3v1 and id3v2.

I've found some CLI programs (named "id3v2" and "id3convert") which can convert between v1 and v2, but none of them seem to respect the tag's encoding.

Perhaps I will write my own tool to do such conversion, but I have no time at the moment.
Comment 10 Florian Dittmer 2006-05-23 17:51:16 UTC
I agree with comment #5. Why remove such a handy feature? I really love amarok, but I guess I cannot upgrade to the new version this way...
Comment 11 Victor Semizarov 2006-05-24 09:29:22 UTC
CD collections are only half the problem...
A lot of localized car MP3 players distributed here in Russia
do not understand UTF-8, and I'm a proud user of such a product
from JVC. Removing CP-1251 support from AmaroK breaks compatibility
with similar stuff and forces me to convert files each time I burn
a CD for use in car from files kept in the Amarok collection.
Your change is incorrect by nature.
Comment 12 Horacio Sanson 2006-05-24 09:39:32 UTC
What I don't get is this: the whole point of using id3v2 is that it defaults to UTF-8. As I said in my bug report, all my tags are in UTF-8 and they are still not displayed correctly. Does this mean Amarok reads tags as UTF-8 only if they are id3v2 tags?
Comment 13 Michal Kubicki 2006-05-24 10:25:59 UTC
"Tag recoding was removed. Instead, you're supposed to convert your tags to ID3-V2, which supports UTF-8. There are tools for automating this, e.g. EasyTag."

I hope it's a joke. That was the most useful option in Amarok.
Comment 14 Shane King 2006-05-24 10:40:48 UTC
Yes, only id3v2 tags are read as UTF-8 (well, technically, they're read as what encoding they're specified as, one of which is UTF-8). id3v1 tags are supposed to be in ISO-8859-1 encoding, even though a lot of tagging software ignores this and just jams in whatever encoding it feels like. That's pretty crappy since the tag format doesn't allow you to specify which encoding you're actually using, so any reader has to guess.

So really, amaroK is just following the spec, which says id3v1 tags should be ISO-8859-1. IMO any tag writer that writes non ISO-8859-1 data to id3v1 tags is the program with the bug, not amaroK.
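
A minimal Python sketch of why a reader has to guess: the 128-byte ID3v1 block has fixed-width text fields and no encoding field at all. This is an illustration, not Amarok's or TagLib's code; `parse_id3v1`, `pad`, and the sample tag are hypothetical.

```python
import struct


def parse_id3v1(block: bytes, encoding: str = "latin-1") -> dict:
    """Parse a 128-byte ID3v1 block. The tag carries no encoding field,
    so the caller has to say (or guess) how the text bytes were written."""
    if len(block) != 128 or block[:3] != b"TAG":
        raise ValueError("not an ID3v1 tag")
    # Fixed layout after "TAG": title, artist, album (30 bytes each),
    # year (4), comment (30), genre index (1).
    title, artist, album, year, comment, genre = struct.unpack(
        "30s30s30s4s30sB", block[3:]
    )

    def text(raw: bytes) -> str:
        # Fields are usually NUL-padded; some writers pad with spaces instead.
        return raw.split(b"\x00", 1)[0].decode(encoding, errors="replace").rstrip()

    return {
        "title": text(title),
        "artist": text(artist),
        "album": text(album),
        "year": text(year),
        "comment": text(comment),
        "genre": genre,
    }


def pad(value: str, size: int, encoding: str) -> bytes:
    """Encode a field and NUL-pad it to its fixed width."""
    return value.encode(encoding).ljust(size, b"\x00")


# Hypothetical tag whose writer ignored the spec and stored CP-1251 bytes.
sample = (
    b"TAG"
    + pad("Кино", 30, "cp1251")  # title
    + pad("", 30, "cp1251")      # artist
    + pad("", 30, "cp1251")      # album
    + b"2006"                    # year
    + pad("", 30, "cp1251")      # comment
    + bytes([255])               # genre index; 255 is commonly used for "none"
)

print(parse_id3v1(sample)["title"])            # decoded per spec (ISO-8859-1)
print(parse_id3v1(sample, "cp1251")["title"])  # decoded with the right guess
```

Both calls succeed; only the second one gives readable text, and nothing in the tag itself tells the reader which call was the right one.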
Comment 15 Horacio Sanson 2006-05-24 11:46:02 UTC
Sorry for being so persistent, but I really think there is a bug here. I used the id3v2 utility to convert my tags to id3v2 and amarok still displays garbage. I configured a font capable of displaying Japanese text and even stripped out the id3v1 tag to make sure only the id3v2 tag was being read.

Here is the procedure I used, to no avail:

## Check the mp3 tag information
$ id3v2 -l test.mp3
id3v1 tag info for test.mp3:
Title  : Jupiter                         Artist: 平原綾香
Album  : Jupiter                         Year: 2003, Genre: Jpop (146)
Comment:                                 Track: 1

## Convert the id3 tag to version 2
$ id3v2 -C test.mp3
Converting id3v1 tag to id3v2 in test.mp3... converted

## Check the tag information to see the v2 tag info
$ id3v2 -l test.mp3
id3v1 tag info for test.mp3:
Title  : Jupiter                         Artist: 平原綾香
Album  : Jupiter                         Year: 2003, Genre: Jpop (146)
Comment:                                 Track: 1
id3v2 tag info for test.mp3:
TIT2 (Title/songname/content description): Jupiter
TPE1 (Lead performer(s)/Soloist(s)): 平原綾香
TALB (Album/Movie/Show title): Jupiter
TYER (Year): 2003
TRCK (Track number/Position in set): 1
TCON (Content type): Jpop (146)

## Now strip the id3v1 tag information
$ id3v2 -s test.mp3
Stripping id3 tag in "test.mp3"...id3v1 stripped.

## Finally one last check
$ id3v2 -l test.mp3
id3v2 tag info for test.mp3:
TIT2 (Title/songname/content description): Jupiter
TPE1 (Lead performer(s)/Soloist(s)): 平原綾香
TALB (Album/Movie/Show title): Jupiter
TYER (Year): 2003
TRCK (Track number/Position in set): 1
TCON (Content type): Jpop (146)


I rebuilt my collection, loaded the file, and I see this "å¹³å綾é¦" in the artist field everywhere in Amarok.

Both id3info and id3v2 display the artist name correctly in Kanji (Japanese characters). I am pretty sure my encodings are UTF-8, since I run a 100% UTF-8 system and the tags display fine in amarok 1.3.8 with the UTF-8 encoding setting.
Comment 16 Roman Cheplyaka 2006-05-24 14:21:29 UTC
Ryujin: though your system is "100% UTF-8", the "id3v2" program probably expected to see ISO-8859-1-encoded tags and recoded them to UTF-8. This is the problem I wrote about in comment #9.
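
A minimal Python sketch of the double encoding suspected above. Whether the id3v2 tool really does this is a guess in this thread, not something confirmed here; the function names are hypothetical.

```python
def double_encode(text: str) -> bytes:
    """Simulate a tool that takes UTF-8 bytes for ISO-8859-1 text
    and converts that 'text' to UTF-8 a second time."""
    utf8_bytes = text.encode("utf-8")
    return utf8_bytes.decode("latin-1").encode("utf-8")


def undo_double_encoding(data: bytes) -> str:
    """Reverse the extra pass: decode once, map the characters back to
    bytes via ISO-8859-1, then decode the original UTF-8."""
    return data.decode("utf-8").encode("latin-1").decode("utf-8")


artist = "平原綾香"
stored = double_encode(artist)
print(len(artist.encode("utf-8")), len(stored))  # 12 24
print(undo_double_encoding(stored))              # 平原綾香
```

A terminal that prints raw tag bytes can still look correct for the untouched id3v1 tag, while a reader that decodes the recoded frame properly shows the garbage characters.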
Comment 17 NaGoS 2006-05-24 19:19:24 UTC
"..So really, amaroK is just following the spec, which says id3v1 tags should be ISO-8859-1..."
The spec is good, but Windows(TM) users and Windows(TM) programs do not understand that. The best feature of Amarok 1.3.x is ID3 recoding. Without tag recoding we can't play 99% of our songs (all mp3 CDs have cp1251 tag encoding :( )
There are three solutions:
-give back tag recoding in 1.4.x
-use 1.3.x version
-use other players
Comment 18 NaGoS 2006-05-28 17:47:50 UTC
The right solution is here:
http://rusxmms.sourceforge.net/
install
librcc
librcd
and patch id3lib (libtag) and you have all players with normal Russian words.
Comment 19 Joe Friedrichsen 2006-06-18 02:50:05 UTC
Version:           1.4.0a-1+b1 (using KDE 3.5.2)
Installed from:    debian/etch
OS:                GNU/Linux

My Japanese tags show up as garbage as well, and I used both id3v2 and EasyTag to try to convert them to UTF-8. Nothing changed. Instead, I used amaroK to edit the tag information, but things still seem to be wrong.

amaroK correctly read the file name (seen in the bottom status bar), but the tag was garbage. I copied the file name information into the tag fields and now amaroK could correctly read the tag information. I went back to id3v2 to see what was written, and the id3v2 tag was removed, and junk was written to the changed tag fields:

## Check tag contents initially
:) friedrij@savoy:Night Food$ id3v2 -l "Ego Wrappin - 5月のクローバー.mp3"
id3v1 tag info for Ego Wrappin - 5月のクローバー.mp3:
Title  : 5月のクローバー          Artist: Ego Wrappin'
Album  : Night Food                      Year: 2002, Genre: Jazz (8)
Comment:                                 Track: 4
id3v2 tag info for Ego Wrappin - 5月のクローバー.mp3:
TIT2 (Title/songname/content description): 5月のクローバー
TPE1 (Lead performer(s)/Soloist(s)): Ego Wrappin'
TALB (Album/Movie/Show title): Night Food
TYER (Year): 2002
TRCK (Track number/Position in set): 04
TCON (Content type): Jazz (8)
TLEN (Length): 302000

## Check tag information in amaroK
Title:  5æã®ã¯ã­ã¼ãã¼

## Change tag information in amaroK
Title:  5月のクローバー

## amaroK now correctly displays the tag information in the window and OSD

## Re-check tag contents
:) friedrij@savoy:Night Food$ id3v2 -l "Ego Wrappin - 5月のクローバー.mp3"
id3v1 tag info for Ego Wrappin - 5月のクローバー.mp3:
Title  : n��                        Artist: Ego Wrappin'
Album  : Night Food                      Year: 2002, Genre: Jazz (8)
Comment:                                 Track: 4

So, it looks like amaroK is breaking the 'standard' it claims to defend.

Joe
Comment 20 Roman Cheplyaka 2006-06-18 08:01:20 UTC
Joe, the same here (with Cyrillic). When amarok writes id3v1 tags, it writes them in some strange encoding that no other player or tag editor seems to understand. And if I write UTF-8-encoded id3v2 tags (via id3v2 or easytag), amarok treats them as ISO-8859-1-encoded.
Comment 21 Kostas Mousi 2006-09-26 09:04:53 UTC
This is NOT a solution. I prefer tag recoding. I am going to downgrade or switch players if tag recoding does not come back in 1.4.x.
Comment 22 Horacio Sanson 2006-09-26 11:21:56 UTC
This seems to be true for all KDE applications. I have tried the kid3 application for tagging and even the meta info editor in konqueror, and none of them show the tags correctly. As far as I understand, all KDE applications use the id3lib library to manage mp3 tags, so maybe the problem is there.
Comment 23 Leonid Morgun 2007-02-01 14:40:30 UTC
I think it's a wrong decision to remove this feature.
Comment 24 Sheng Yang 2007-02-03 04:45:36 UTC
It's a terrible decision for people who use non-Latin encodings, especially when we are looking forward to amaroK 2.0...
Maybe we could add a filter that runs after scanning the collection and does the recoding. I think better integration between amaroK and some recoding software would be fine.
Comment 25 Idan Miara 2007-10-20 08:16:12 UTC
This bug has been open for 17 months...
It prevents many non-Latin users (Hebrew, Russian...) from using amaroK 1.4.x.
Many of us downgraded to 1.3.9 for this reason alone.
I don't see a good reason why someone removed this feature in the first place, so PLEASE fix it! At least for ver 2.0.

thanks.
Comment 26 ua_igor@meta.ua 2008-08-11 10:38:13 UTC
>------- Additional Comment #18 From NaGoS 2006-05-28 17:47 ------- 
>The right solution is here:
>http://rusxmms.sourceforge.net/ 
>install 
>librcc 
>librcd 
>and patch id3lib (libtag) and you have all players with normal Russian words.
This recipe works. Cyrillic is working fine in Amarok. Thanks!
Comment 27 Jeff Mitchell 2008-08-11 21:34:20 UTC
#26, libtag and id3lib are not the same.  libtag is provided by TagLib.  libid3 is provided by id3lib, which is broken and unmaintained -- see below.

I haven't seen this bug before, but responding to e.g. #c11, #c19 and all,

There are many variations of the ID3v2 spec.  Not all variations support multi-byte character encoding, including UTF-8.

The tag library that Amarok uses, which is used by many, many other audio players (including on Windows) writes ID3v2.4 tags for many reasons, but one of the most important is that it supports UTF-8.

Many other players do *not* read ID3v2.4, usually depending on their tag library.  Anything using id3lib, for instance -- which I think includes easytag -- does *not* support anything after ID3v2.3.  The reason why is simple -- they are unmaintained.  The ID3v2.4 spec has been out since November of 2000 - 8 years.

It is not unreasonable of Amarok to decide that it wants to support a format that has been the current standard for 8 years.

So Amarok supports ID3v1 in the specified, correct encoding -- of which there is only one -- and supports ID3v2.4, which supports many encodings.  It will read ID3v2.3, but will not write it.  Other programs that are not 2.4-aware will neither read nor write 2.4, which is why you think Amarok is writing junk.  It's really that the tagging program/library you're using never modernized.
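
To make the version difference concrete, a minimal Python sketch of how an ID3v2 text frame declares its encoding. As I understand the published specs, every text frame body starts with an encoding byte: 0 (ISO-8859-1) and 1 (UTF-16 with BOM) exist in both 2.3 and 2.4, while 2 (UTF-16BE) and 3 (UTF-8) were added in 2.4. The function name is hypothetical and this is not TagLib's code.

```python
CODEC_BY_FLAG = {
    0: "latin-1",    # ISO-8859-1, ID3v2.3 and 2.4
    1: "utf-16",     # UTF-16 with byte order mark, ID3v2.3 and 2.4
    2: "utf-16-be",  # UTF-16BE without BOM, ID3v2.4 only
    3: "utf-8",      # UTF-8, ID3v2.4 only
}


def decode_text_frame(payload: bytes, major_version: int) -> str:
    """Decode the body of a text frame (e.g. TPE1) from an ID3v2.<major_version> tag.
    payload is the frame body: one encoding byte followed by the text."""
    if not payload:
        raise ValueError("empty frame body")
    flag = payload[0]
    if flag not in CODEC_BY_FLAG:
        raise ValueError(f"unknown encoding byte {flag}")
    if flag >= 2 and major_version < 4:
        # A 2.3-only reader has no definition for these values.
        raise ValueError(f"encoding byte {flag} is not defined before ID3v2.4")
    # Drop optional NUL terminators after decoding.
    return payload[1:].decode(CODEC_BY_FLAG[flag]).rstrip("\x00")


utf8_frame = b"\x03" + "平原綾香".encode("utf-8")
print(decode_text_frame(utf8_frame, 4))  # 平原綾香
```

Passing the same `utf8_frame` with `major_version=3` raises `ValueError`, which is roughly the situation of a 2.3-only program handed a 2.4 tag.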

Adhering to the standards is a decision the Amarok developers have made and continue to support.  Now, where does that leave you guys:

1) If you're buying MP3 CDs, insist that they are *properly* tagged.  You, as a customer, would not be making an unreasonable request.  Or stop supporting merchants that don't adhere to standards.

2) If you're buying MP3-playing hardware, insist that they support proper tags.  Or don't buy it.  If you're a "proud owner" of a broken, non-standards-compliant player, maybe you should be less proud, and instead petition the company to give you a better product.

3) If you're adjusting tags yourself, you have to make a choice:
a) you can edit the tags elsewhere, and as long as they adhere to the encoding standards of whatever version of ID3 you are using, Amarok should read them properly
b) you can edit the tags elsewhere, and use encodings that are not part of the ID3 standards, and understand that when Amarok doesn't read them, it's because you're doing things with the ID3 tags you're not supposed to be doing
c) you can edit the tags in Amarok, and know that if they aren't showing up properly in other players, that the problem is that the player is eight years out of date in terms of tagging ability and get on their case

Companies won't have any reason to improve their products unless you make it clear you won't buy them until they do.  Which is why c) is a perfectly acceptable answer.

Also, if other players can't read the ID3v2 tag, they should still read the ID3v1 tag (which Amarok also writes) -- but this will be in the ISO-8859-1 encoding that is specified by the standard.

Standards exist for a reason -- to make things interoperable.  The issues you are encountering are exactly why standards exist.

FWIW, the best standalone tagger out there is eyeD3.  It handles v1, v2.3, and v2.4 with aplomb.
Comment 28 Ian Monroe 2008-08-11 21:41:31 UTC
Though have no fear: Amarok 2 will have decent support for your poorly tagged music. We imported some code from Mozilla to figure out what encoding it is.
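
A much cruder illustration of the idea than the Mozilla detector mentioned above, and explicitly not that algorithm: try strict UTF-8 first, and fall back to a legacy code page chosen by the user only if that fails. The function name and the `cp1251` default are assumptions for the sketch.

```python
from typing import Tuple


def guess_decode(raw: bytes, fallback: str = "cp1251") -> Tuple[str, str]:
    """Return (text, codec_used). Legacy 8-bit text containing non-ASCII
    bytes is rarely valid UTF-8, so a strict UTF-8 decode is a cheap first test."""
    try:
        return raw.decode("utf-8"), "utf-8"
    except UnicodeDecodeError:
        # Assumption: the user knows which legacy code page their files use.
        return raw.decode(fallback, errors="replace"), fallback


print(guess_decode("Кино".encode("cp1251")))     # ('Кино', 'cp1251')
print(guess_decode("平原綾香".encode("utf-8")))  # ('平原綾香', 'utf-8')
```

A real detector also has to choose between several candidate legacy encodings, which this sketch leaves to the caller.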
Comment 29 Roman Cheplyaka 2008-09-22 05:59:06 UTC
*** This bug has been confirmed by popular vote. ***
Comment 30 Myriam Schweingruber 2008-09-22 11:33:04 UTC
Please do NOT comment on this bug anymore, as Amarok 1.4.0 and KDE 3.5.0 are completely outdated! The current stable versions are Amarok 1.4.10 and KDE 3.5.10, so please update your versions.

Anyhow, the developers currently focus on Amarok 2 to be released soon, so feel free to reopen this bug should it persist in Amarok 2.