Bug 81400

Summary: Cannot display file info written in local characters rather than UTF-8
Product: [Applications] kfile-plugins
Reporter: Funda Wang <fundawang>
Component: mp3
Assignee: Multimedia Developers <kde-multimedia>
Status: CONFIRMED ---
Severity: normal
CC: lucida, michal, sng
Priority: NOR    
Version: unspecified   
Target Milestone: ---   
Platform: Compiled Sources   
OS: Linux   
Latest Commit: Version Fixed In:
Attachments: lucida's patch on this issue
Patch to convert from TString to QString and QString to TString
Snapshot showing a tag

Description Funda Wang 2004-05-12 06:09:53 UTC
Version:            (using KDE Devel)
Installed from:    Compiled sources

kfile_mp3 always extracts and treats the meta info as UTF-8, which is the native encoding of KDE. But almost all files with a built-in ID3v1 tag store it in the local character encoding rather than UTF-8. Unfortunately, kfile_mp3 does not handle this.
Comment 1 Funda Wang 2004-05-12 06:11:22 UTC
Created attachment 5963 [details]
lucida's patch on this issue

It works for Chinese. If it wouldn't affect other languages, please accept it.
Comment 2 leo zhu 2004-05-12 08:41:01 UTC
Hi, I didn't submit this patch because it's kind of a "dirty hack". In fact, the problem is not in kfile-plugins but in taglib. You may want to check this:

http://bugs.kde.org/show_bug.cgi?id=78200

The best way is to add an option and let users choose the encoding for their ID3 tags (as the new kio-ftp does), but until then, I think it's better to leave things untouched.

Comment 3 Scott Wheeler 2004-05-12 14:50:28 UTC
Well, these patches definitely won't work in a lot of cases.  Specifically ID3v2 does properly support Unicode and this would mangle the output coming back.

Also using the current locale won't work in a lot of cases because then you won't be able to display files that you get from someone outside of your locale, or things will break if you switch from one locale on your system to another (say you upgrade your distro and it's using UTF-8 -- all of a sudden you won't be able to read your tags anymore).

The basic problem is that ID3v1 is a bad format -- it doesn't support text encodings other than ISO-8859-1, but that hasn't stopped people from throwing arbitrary data in there.

The only acceptable hack in my opinion is to make this something that's selectable by the user.

Please also see:

http://bugs.kde.org/show_bug.cgi?id=78428
http://bugs.kde.org/show_bug.cgi?id=77710
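
A minimal sketch of what a user-selectable encoding could look like, written in Python rather than against TagLib or kfile_mp3. The function name `read_id3v1` and its `encoding` parameter are hypothetical. The only thing taken as given is the ID3v1 layout: a 128-byte trailer starting with "TAG". ISO-8859-1 stays the default because it is the only encoding the format defines.

```python
import struct

# ID3v1 is a fixed 128-byte trailer: "TAG", title, artist and album
# (30 bytes each), year (4 bytes), comment (30 bytes), genre (1 byte).
ID3V1 = struct.Struct("3s30s30s30s4s30sB")


def read_id3v1(tag: bytes, encoding: str = "iso-8859-1") -> dict:
    """Decode a 128-byte ID3v1 tag with a caller-chosen text encoding.

    `encoding` stands in for a hypothetical user setting.
    """
    if len(tag) != ID3V1.size or not tag.startswith(b"TAG"):
        raise ValueError("not an ID3v1 tag")
    _, title, artist, album, year, comment, genre = ID3V1.unpack(tag)

    def text(raw: bytes) -> str:
        # Fields are padded with NUL bytes or spaces; cut at the first NUL,
        # decode with the selected encoding, then strip trailing spaces.
        raw = raw.split(b"\x00", 1)[0]
        return raw.decode(encoding, errors="replace").rstrip()

    return {
        "title": text(title),
        "artist": text(artist),
        "album": text(album),
        "year": text(year),
        "comment": text(comment),
        "genre": genre,  # numeric genre index, not text
    }
```

A caller who knows the file was tagged on a GBK system would pass `read_id3v1(tag, "gbk")`; everyone else keeps the spec-compliant default.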
Comment 4 Waldo Bastian 2004-05-25 22:30:19 UTC
You can check if something is valid utf-8 and fall back to locale if not.
See KStringHandler::isUtf8(const char *buf)
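
The check-then-fall-back idea can be sketched as follows. This is Python, not the KDE API: a strict UTF-8 decode plays the role that `KStringHandler::isUtf8` would play in the plugin, and `decode_tag_text` and its `fallback` parameter are hypothetical names.

```python
import locale


def decode_tag_text(raw: bytes, fallback: str = "") -> str:
    """Decode as UTF-8 if the bytes are valid UTF-8, else use a fallback.

    If no fallback is given, the current locale's preferred encoding is used.
    """
    try:
        return raw.decode("utf-8")  # strict mode raises on invalid UTF-8
    except UnicodeDecodeError:
        encoding = fallback or locale.getpreferredencoding(False)
        return raw.decode(encoding, errors="replace")
```

Note that pure-ASCII text passes the UTF-8 check trivially, and that the fallback is only as good as the guess about which encoding the tagger used.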

Comment 5 Scott Wheeler 2004-05-26 04:18:42 UTC
> You can check if something is valid utf-8 and fall back to locale if not.

Well, but actually both of those are incorrect.  :-)  The original report is actually incorrect -- the plugin doesn't currently assume utf-8 as most programs write either ISO-8859-1 or the local encoding.  This gets messy because the current locale may not be the locale at the time or machine that the file was tagged.  (And since, well, a lot of mp3 files are downloaded this just complicates things.)
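
To illustrate why guessing is messy: the same tag bytes often decode without error under several legacy encodings, so a successful decode proves nothing about which one the tagger used. A small Python sketch (the helper name `candidates` and the encoding list are made up for the example):

```python
def candidates(raw: bytes, encodings: list) -> list:
    """Return every (encoding, text) pair under which raw decodes cleanly."""
    result = []
    for name in encodings:
        try:
            result.append((name, raw.decode(name)))
        except UnicodeDecodeError:
            pass  # these bytes are not valid in this encoding
    return result


# Bytes as a tagger running in a GBK locale would have written them.
raw = "音乐".encode("gbk")
for name, text in candidates(raw, ["utf-8", "iso-8859-1", "gbk", "koi8-r"]):
    print(name, repr(text))
```

Here only the UTF-8 decode is rejected; the single-byte encodings accept the bytes and yield different, wrong text, and nothing in the file says which reading is the intended one.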
Comment 6 Scott Wheeler 2004-06-01 19:35:02 UTC
*** Bug 82640 has been marked as a duplicate of this bug. ***
Comment 7 mrudolf 2004-06-22 13:49:40 UTC
Input charset problems may be difficult to solve (one has to guess whether the file was encoded in the local encoding or UTF-8).

But worse is that even editing tags is broken (entered local characters seem to be autoconverted to latin1). This surely could be fixed, as it is possible to check for current charset.
Comment 8 Scott Wheeler 2004-06-23 12:28:12 UTC
*** Bug 83821 has been marked as a duplicate of this bug. ***
Comment 9 Salatas John 2004-11-20 10:16:50 UTC
Created attachment 8342 [details]
Patch to convert from TString to QString and QString to TString
Comment 10 Salatas John 2004-11-20 10:19:11 UTC
Created attachment 8343 [details]
Snapshot showing a tag
Comment 11 Salatas John 2004-11-20 10:25:23 UTC
Comment on attachment 8342 [details]
Patch to convert from TString to QString and QString to TString

With this patch it is possible to display a Unicode tag correctly and, when saving a
tag, to convert it to UTF-8. It works for me. See the second attachment I made for a
screenshot.
Comment 12 Scott Wheeler 2004-11-20 13:16:36 UTC
Right, but that won't work for tags that are actually valid.
Comment 13 Funda Wang 2004-11-20 13:57:48 UTC
Then, a kcm module would be nice.
Comment 14 Funda Wang 2004-12-25 14:53:08 UTC
How is this going?
Comment 15 Thiago Macieira 2004-12-25 15:01:32 UTC
Don't get your hopes up. You're asking the developer to break the compliant behaviour so that broken files can be read.
Comment 16 Funda Wang 2005-07-03 04:02:20 UTC
What is your attitude towards an environment variable?
Comment 17 Funda Wang 2005-07-03 04:04:25 UTC
GStreamer has introduced[1] an environment variable, GST_ID3_TAG_ENCODING, used to parse ID3v1 tags. It would be a good start if kdemultimedia could use that encoding to parse ID3v1 tags.

[1] http://bugzilla.gnome.org/show_bug.cgi?id=149274#c17
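
A rough sketch of how a reader could honour such a variable, in Python rather than kdemultimedia code. The function name `decode_id3v1_text` is hypothetical, and treating the value as a colon-separated list of encodings tried in order is an assumption here, not something stated in this report. ISO-8859-1 remains the last resort because it is the encoding ID3v1 specifies.

```python
import os


def decode_id3v1_text(raw: bytes, env=os.environ) -> str:
    """Decode ID3v1 text using encodings named in GST_ID3_TAG_ENCODING.

    Assumption: the variable holds one encoding name, or several separated
    by colons, to be tried in order.
    """
    value = env.get("GST_ID3_TAG_ENCODING", "")
    for name in (n for n in value.split(":") if n):
        try:
            return raw.decode(name)
        except (UnicodeDecodeError, LookupError):
            continue  # bytes invalid for this encoding, or unknown name
    # Spec-compliant fallback; this decode accepts any byte sequence.
    return raw.decode("iso-8859-1")
```

With the variable unset the behaviour is unchanged, so compliant tags are not affected unless the user opts in.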
Comment 18 Justin Zobel 2021-03-09 23:51:12 UTC
Thank you for the bug report.

As this report hasn't seen any changes in 5 years or more, we ask if you can please confirm that the issue still persists.

If this bug is no longer persisting or relevant please change the status to resolved.