Bug 70375 - chinese ID3 display problem
Summary: chinese ID3 display problem
Status: RESOLVED INTENTIONAL
Alias: None
Product: juk
Classification: Applications
Component: general
Version: unspecified
Platform: Mandrake RPMs Linux
Importance: NOR wishlist
Target Milestone: ---
Assignee: Scott Wheeler
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2003-12-14 11:13 UTC by Nicolaus Yahzee
Modified: 2004-03-17 17:51 UTC


Description Nicolaus Yahzee 2003-12-14 11:13:27 UTC
Version:           3.1.94 (using KDE KDE 3.1.94)
Installed from:    Mandrake RPMs
OS:          Linux

I know that JuK seems to be going to support UTF-8 in the future, and this will help to display many international ID3 tags,
but I wish JuK could also support other locale encodings, such as the Simplified Chinese encodings GB2312, GBK or GB18030.
At present JuK cannot display a single Chinese character.

I am also the Chinese i18n translator of JuK,
so I really wish it could display Chinese characters well.
I believe it could then become the choice of many Chinese KDE users.

Regards!
Comment 1 Scott Wheeler 2003-12-14 15:00:18 UTC
Sorry, but the ID3 format doesn't support those encodings.  It doesn't make much sense for JuK to support them if the file format doesn't.

ID3v1 is supposed to be ISO-8859-1 only, though many broken tag editors ignore this.  I have a small tool here http://developer.kde.org/~wheeler/files/tag-unbreaker.cpp that converts these tags from a given locale to UTF8.

ID3v2 supports UTF16, UTF8 and ISO-8859-1.  JuK defaults to using UTF8.
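
The locale-to-UTF-8 conversion mentioned above can be illustrated with a short Python sketch. This is not the code of tag-unbreaker.cpp; the function name `repair_tag_text` is hypothetical, and the sketch assumes the real encoding of the tag is already known (GBK in this example).

```python
def repair_tag_text(displayed: str, actual_encoding: str) -> str:
    """Undo an ISO-8859-1 misreading of bytes that were really written
    in actual_encoding (hypothetical helper, for illustration only)."""
    # ISO-8859-1 maps every byte to exactly one character, so encoding
    # the garbled string back recovers the original bytes unchanged.
    raw = displayed.encode("iso8859-1")
    return raw.decode(actual_encoding)


# A tag editor wrote the title in GBK although ID3v1 expects ISO-8859-1.
raw_title = "中文".encode("gbk")
garbled = raw_title.decode("iso8859-1")   # what a spec-following player shows
fixed = repair_tag_text(garbled, "gbk")   # the intended text
utf8_bytes = fixed.encode("utf-8")        # suitable for a UTF-8 ID3v2 tag
```

The repair only works when the encoding is supplied from outside; nothing in the bytes themselves says "GBK".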
Comment 2 Nicolaus Yahzee 2003-12-14 16:24:21 UTC
Thanks for answering,
but one thing I can't understand is this:
if ID3v1 supports ISO-8859-1 only, how can we non-Latin language users have so many MP3 files with tags in non-ISO-8859-1 encodings?
And how can XMMS, Winamp and many other Windows apps display those characters correctly?

I am asking some Chinese users about this too; maybe they can provide more information about the problem.
I found that the Rhythmbox bug tracker has a similar bug report too.

http://bugzilla.gnome.org/show_bug.cgi?id=97856
Comment 3 Scott Wheeler 2003-12-14 16:44:12 UTC
Subject: Re:  chinese ID3 display problem

On Sunday 14 December 2003 16:24, Nicolaus Yahzee wrote:

> Thanks for answering,
> but one thing I can't understand is this:
> if ID3v1 supports ISO-8859-1 only, how can we non-Latin language users have
> so many MP3 files with tags in non-ISO-8859-1 encodings?
> And how can XMMS, Winamp and many other Windows apps display those
> characters correctly?

Well, again, ID3v1 is only *supposed* to contain ISO-8859-1, but a lot of 
players just ignore this and write UTF-8 or whatever encoding the current 
locale uses.

The trouble is that if you copy these files to another machine, it's 
unclear what encoding they were written in.

I even tried to do some guessing a while back, but without a text sample 
larger than an ID3v1 tag it's very difficult to guess the encoding 
accurately.

> I am asking some Chinese users about this too; maybe they can provide more 
> information about the problem.

I understand the problem -- it's just that there's no easy way to fix it since 
JuK is actually doing the right thing.

The basic issue is that there's nowhere in an ID3v1 tag to specify what 
encoding it's using.  The original authors always used ISO-8859-1 and this 
became the closest thing to a "standard" that there is.  Some people then 
tried to "localize" it by using different encodings, but then it's completely 
useless as "meta-data" since it won't work if you move it to another machine.
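
To make the missing encoding field concrete, here is a minimal Python sketch of the fixed 128-byte ID3v1 layout. It is an illustration, not JuK's code: the names `ID3V1` and `parse_id3v1` are hypothetical, and the `encoding` parameter is exactly the piece of information the tag itself cannot carry.

```python
import struct

# Plain ID3v1 layout: "TAG", title, artist, album, year, comment, genre.
# 3 + 30 + 30 + 30 + 4 + 30 + 1 = 128 bytes, with no field for an encoding.
ID3V1 = struct.Struct("<3s30s30s30s4s30sB")


def parse_id3v1(block: bytes, encoding: str = "iso8859-1") -> dict:
    """Decode a 128-byte ID3v1 block; the caller has to supply the encoding."""
    if len(block) != ID3V1.size:
        raise ValueError("an ID3v1 tag is exactly 128 bytes")
    magic, title, artist, album, year, comment, genre = ID3V1.unpack(block)
    if magic != b"TAG":
        raise ValueError("missing TAG marker")

    def text(field: bytes) -> str:
        # Fields are padded with NUL bytes (or spaces) up to their fixed size.
        return field.split(b"\x00", 1)[0].decode(encoding).rstrip()

    return {"title": text(title), "artist": text(artist), "album": text(album),
            "year": text(year), "comment": text(comment), "genre": genre}


# A tag whose title was written as GBK bytes by a non-conforming editor.
sample = ID3V1.pack(b"TAG", "中文".encode("gbk"), b"artist", b"album",
                    b"2003", b"", 12)
as_spec = parse_id3v1(sample)            # title decoded as ISO-8859-1
as_gbk = parse_id3v1(sample, "gbk")      # title decoded as GBK
```

Both calls succeed on the same bytes; only outside knowledge tells a reader which title is the intended one.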

ID3v2 came along and supports multiple encodings and specifically it supports 
Unicode.

The basic problem with this whole issue is that if I "fix" this for you it 
will break for others.  (e.g. someone in Eastern Europe downloads an MP3 and 
all of its characters get decoded as ISO-8859-2 when they should be 
ISO-8859-1...)
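
The ambiguity can be shown with a few lines of Python. This is only a sketch under assumed inputs: the helper name `candidate_readings` and the sample word are hypothetical, and the list of encodings is an arbitrary selection.

```python
def candidate_readings(raw: bytes, encodings: list) -> dict:
    """Return every decoding of raw that does not raise an error."""
    readings = {}
    for name in encodings:
        try:
            readings[name] = raw.decode(name)
        except UnicodeDecodeError:
            pass  # this encoding cannot represent the byte sequence
    return readings


# A correctly written ISO-8859-1 tag value...
raw = "crème".encode("iso8859-1")

# ...is also a perfectly valid byte sequence in several other 8-bit encodings.
readings = candidate_readings(raw, ["iso8859-1", "iso8859-2", "cp1251", "koi8-r"])
# readings["iso8859-1"] is "crème", readings["iso8859-2"] is "crčme"
```

Because every candidate decodes without error, a player cannot tell from a few bytes which reading was meant, which is why a different default would simply move the breakage to other users.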

> I found that the Rhythmbox bug tracker has a similar bug report too.
> 
> http://bugzilla.gnome.org/show_bug.cgi?id=97856

Yes, and as you notice he suggests converting the tags to ID3v2 as well.  :-)  
I've talked to Colin about this in the past...

Comment 4 Alexei Dets 2004-02-13 22:52:58 UTC
The same problem exists here with Russian: most Russian MP3 files are tagged in the windows-1251 encoding, but Unix (in particular, Linux) usually defaults to koi8-r or utf-8. This means that the tags are displayed incorrectly.

Can you add a _manual_ tag encoding selector (like in text editors, Konqueror etc.)? Ideally it would be possible to set this not only globally but also to remember it on a per-folder/per-file basis, though even a global selector would be GREAT!!!

The current situation is just awful :-(((
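
One way the manual selector requested above might behave is sketched below in Python. Nothing here is an existing JuK feature or API: the class `EncodingOverrides` and its methods are hypothetical, and the lookup order (file, then enclosing folders, then a global default) is an assumption about what "per-folder/per-file" would mean.

```python
from pathlib import PurePosixPath


class EncodingOverrides:
    """Hypothetical store of user-chosen tag encodings."""

    def __init__(self, default: str = "iso8859-1"):
        self.default = default   # global setting
        self.per_path = {}       # file or folder path -> encoding name

    def set(self, path: str, encoding: str) -> None:
        self.per_path[PurePosixPath(path)] = encoding

    def lookup(self, file_path: str) -> str:
        path = PurePosixPath(file_path)
        # The most specific choice wins: the file itself, then each parent folder.
        for candidate in (path, *path.parents):
            if candidate in self.per_path:
                return self.per_path[candidate]
        return self.default

    def decode(self, file_path: str, raw: bytes) -> str:
        # errors="replace" keeps the list readable even after a wrong choice.
        return raw.decode(self.lookup(file_path), errors="replace")


overrides = EncodingOverrides()
overrides.set("/music/russian", "cp1251")
overrides.set("/music/russian/old/song.mp3", "koi8-r")
title = overrides.decode("/music/russian/a.mp3", "Кино".encode("cp1251"))
```

Keeping the choice outside the file leaves the tag bytes untouched, so the override stays a display setting and the metadata is still ambiguous on any other machine.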
Comment 5 Gary 2004-03-17 17:51:28 UTC
While JuK is definitely a great program and is dealing with this ID3 issue the right way, can we still have a way to display international characters as a temporary workaround? It would make things much easier for people migrating from Windows. And since JuK is now mainstream with KDE 3.2+, it would be useful to a large number of people for a minimal code change. Thanks.