Summary: | i18n: ability to choose encoding of tag info | ||
---|---|---|---|
Product: | [Applications] amarok | Reporter: | Nick Shaforostoff <shafff> |
Component: | general | Assignee: | Amarok Developers <amarok-bugs-dist> |
Status: | RESOLVED FIXED | ||
Severity: | wishlist | CC: | berk, ggcand |
Priority: | NOR | ||
Version: | unspecified | ||
Target Milestone: | --- | ||
Platform: | unspecified | ||
OS: | Other | ||
Latest Commit: | Version Fixed In: | ||
Sentry Crash Report: | |||
Attachments: |
utf8-1.cpp
combobox for selecting tag charset Patch for today CVS Script for encoding conversion. |
Description
Nick Shaforostoff
2004-03-15 21:39:27 UTC
This should be added to taglib. Amarok has nothing to do with tags encodings. Well, and ID3v1, dumb as it is (and as broken as the implementation is, well, everywhere) is *supposed* to be ISO-8859-1 everywhere. The problem is that there's no way to specify the encoding along with the file for MP3s. Some things put UTF-8 in there, some put ISO-8859-1 in there (as they should -- even though I know that sucks). Some put the current locale... But that all breaks down as soon as you copy the files to a machine with a different locale or are running two OS'es with different locales set (this is common in many regions -- Linux defaults to one locale and Windows to another). TagLib just implements ID3v1's brokenness; it's really a bug in ID3v1, not a bug in TagLib (though you can treat the string coming from TagLib as whatever locale you want to -- good luck figuring out what the right one is though). CVS commit by wheeler: Add the ability to override the default ID3v1 string handling in TagLib by adding a ID3v1::StringHandler class that has render and parse methods. By default this still makes the (correct) assumption that ID3v1 tags contain ISO-8859-1 data, but this makes it easy for applications to override this to allow user specified codecs. CCMAIL:78428@bugs.kde.org CCMAIL:77710@bugs.kde.org CCMAIL:amarok-devel@lists.sourceforge.net CCMAIL:Shlomi Loubaton <loubaton.shlomi@012.net.il> CCMAIL:Nir Misgav <misgav@ee.bgu.ac.il> M +34 -10 id3v1tag.cpp 1.23 M +47 -0 id3v1tag.h 1.14 Ok, and now I'm attaching a small Qt-based program that shows how to override
the default string encoding. This one just uses utf8, but I think it's clear
how one would proceed based on this...
I would also recommend by default (i.e. when ISO-8859-1 is selected) not using
an external string handler at all since they extra lookups will likely slow
things down...
- -Scott
On Sunday 04 April 2004 16:42, Scott Wheeler wrote:
> CVS commit by wheeler:
>
> Add the ability to override the default ID3v1 string handling in TagLib by
> adding a ID3v1::StringHandler class that has render and parse methods.
>
> By default this still makes the (correct) assumption that ID3v1 tags contain
> ISO-8859-1 data, but this makes it easy for applications to override this to
> allow user specified codecs.
>
> CCMAIL:78428@bugs.kde.org
> CCMAIL:77710@bugs.kde.org
> CCMAIL:amarok-devel@lists.sourceforge.net
> CCMAIL:Shlomi Loubaton <loubaton.shlomi@012.net.il>
> CCMAIL:Nir Misgav <misgav@ee.bgu.ac.il>
>
>
> M +34 -10 id3v1tag.cpp 1.23
> M +47 -0 id3v1tag.h 1.14
>
>
>
>
Created an attachment (id=5530)
utf8-1.cpp
Heh -- forgot to mark this as "done" earlier. From the TagLib side of things this is done, so I'm marking it as such. You may however want to reopen and reassign it to amarok. Created attachment 5929 [details]
combobox for selecting tag charset
cd /path/to/amarok-cvs
patch -p1 < /path/to/amarok-tr-0.4.patch
make -f admin/Makefile.common cvs
./configure --prefix=`kde-config --prefix` --disable-debug && make && make
install
please, reassign this to amarok... I think I'll take a look at this when I find some time. I do now understand the problem and I must say I'm quite concerned with l10n issues. Have to ask on #amarok where to put the config option. *** Bug 80911 has been marked as a duplicate of this bug. *** Comment on attachment 5929 [details] combobox for selecting tag charset Note that the patch implementing the combobox seems to be from Andrew Lavrinenko <ggcand@aaanet.ru>. See #80911 for details. At least, this patch was introduced to me by man with nikname 'Light'. Anyways, the patch seems to be broken - there exist reference to Options6.h but there is no its implemenatation here is patch for 1.0 http://www.redroot.ru/000/amarok-tr-0.5.patch but it isnt complete. at least there is no encoding handling on collection browser and when i edit russian tag, it doesnt do reencoding. and i dont know wether this is taglib bug or no Hello! Sorry for my long time absent, i have no time. I think, patch is compleat. Now it recode tag's in CollectionDB, correctly read AudioProperties, correctly save tag's if it editing, i move this option to "font" form. But exist one small trable, if file content both ID3v1 and ID3v2 tag's then amarok use only ID3v1. PS. You must include it to main realse:) It's very importent for non english speaking country! PPS. Sorry for my english:) Created attachment 6947 [details]
Patch for today CVS
got it but when i try to compile it, it gives me error In file included from app.cpp:28: id3v1stringhandler.h:16: error: base class `TagLib::ID3v1::StringHandler' has incomplete type app.cpp: In member function `void App::applySettings(bool)': app.cpp:341: error: 'class TagLib::ID3v1::Tag' has no member named 'setStringHandler' taglib12, qt33 can somebody confirm this or its just suckness of my mdk10? andrew, thanks for the patch! I'll review and apply it either today or tomorrow :-) And yes, we'll get it into 1.1 of course. 2 Max Howell BIG BIG Thenx!:) 2 Nick Shaforostoff It's correctly work on taglib1.1 and taglib1.2 Sorry, i found small bag in my patch. Please replasce in id3v1stringhandler.cpp ID3v1StringHandler::ID3v1StringHandler(int codecIndex) : m_codec(QTextCodec::codecForName("CP1251")) {} to ID3v1StringHandler::ID3v1StringHandler(int codecIndex) : m_codec(QTextCodec::codecForIndex(codecIndex)) {} CVS commit by mhowell: * With much thanks to Andrew Lavrinenko and Mr Scott-TagLib-Wheeler, amaroK can now display ID3v1 tags in a user-specified 8-bit encoding. Please test (since I have not the facility) and suggest modifcations. Thanks. * Note: I note with glee that KDElibs 3.3 supports XT using combo boxes and group boxes with checkboxes. So you need KDE 3.3 to activate this feature with the GUI. I figure 1.1 will likely require 3.3 anyway, if it doesn't I'll hack it backwards to work with 3.2. * Also fixed in this commit: amaroK recognises session restored tracks again CCMAIL: 77710-done@bugs.kde.org M +246 -194 Options2.ui 1.27 M +28 -11 app.cpp 1.376 M +0 -1 app.h 1.150 M +24 -16 configdialog.cpp 1.39 M +28 -16 enginecontroller.cpp 1.50 M +5 -3 enginecontroller.h 1.24 M +11 -1 metabundle.h 1.31 Andrew, I didn't commit the mimetype detection stuff in MetaBundle. But have plans to at a future time. Thanks for the work! Max I use mimetype detection only for override id3v2 to id3v1 tags becose almost all mp3 files content both id3v1 and id3v2 tags, but id3v2 not in unicode charset:( How to resolve this problem - i don't know. It seems that amarok 1.2beta2 can recode id3v1 in Chinese perfectly. But for id3v2, the characters are just being known as iso-8859-1. Maybe another dropdown list for id3v2? There still a problem with id3v1 and id3v2 character encodings. Unlike it appears in comment 13, id3v_2_ is used when both present and there is no option to alter this behavior. Can you please add similar solution as you've done for id3v1 already? Thank you very much for a wonderful player. we already discussed this id3v2 must be in utf (so then we need a simple tool to convert missing tags from cp1251 to utf8 and it might be a little button in 'edit metainfo' dialogue...) Created attachment 9410 [details]
Script for encoding conversion.
Description: Recodes id3v2 tag to/from any encoding iconv supports.
Depends: perl, iconv, id3v2.
Note: May be vulnerable for filenames containing "'", "`" etc.
For those who might need this. You may want to put it somewhere on your site or
include with player.
Beep media player has a option "Prefer id3v1 than id3v2". Maybe it is rather simple solution. Prefer ID3v1 over ID3v2 if the user specifically requests that mp3 ID3v1 tags get recoded. To me this doesn't seem like a solution that should work, but I can't argue with the number of requests for it, and the number of users who claim recode tags doesn't work because the id3v2 tag is preferred. Please, please someone test this! I have nothing to test it with and cannot be sure that it'll work. (The common case still works though, that is tested) BUG:77710 Well, for me it doesn't work - when amarok tries to play an MP3 with ID3v1 (if I have selected the recode entry) amarok crashes - sorry, I have no debug infos - but when I turn off that recoding stuff, it works again freddy CVS commit by mhowell: Don't crash if the user wants to recode tags CCMAIL: 77710@bugs.kde.org M +33 -25 metabundle.cpp 1.44 --- kdeextragear-1/amarok/src/metabundle.cpp #1.43:1.44 @@ -81,9 +81,8 @@ MetaBundle::MetaBundle( const PlaylistIt { if( m_url.protocol() == "file" ) - { readTags( TagLib::AudioProperties::Accurate ); - } - else { //is Stream + else { + // is a stream //FIXME not correct handling, say is ftp://file m_bitrate = item->exactText( 10 ).left( 3 ).toInt(); @@ -151,36 +150,45 @@ MetaBundle::readTags( TagLib::AudioPrope return; - TagLib::FileRef f( QFile::encodeName( m_url.path() ), true, readStyle ); + const QString path = m_url.path(); + TagLib::FileRef fileref; + TagLib::Tag *tag = 0; - if( !f.isNull() && f.tag() ) + if( AmarokConfig::recodeID3v1Tags() && path.endsWith( ".mp3", false ) ) { - TagLib::Tag *tag = 0; + TagLib::MPEG::File *mpeg = new TagLib::MPEG::File( QFile::encodeName( path ), true, readStyle ); + fileref = TagLib::FileRef( mpeg ); - if( AmarokConfig::recodeID3v1Tags() && m_url.path().endsWith( ".mp3", false ) ) { + if( mpeg->isValid() ) // we prefer ID3v1 over ID3v2 if recoding tags because // apparently this is what people who ignore ID3 standards want - TagLib::MPEG::File &mp3 = (TagLib::MPEG::File&)f; - tag = mp3.ID3v1Tag() ? (TagLib::Tag*)mp3.ID3v1Tag() : (TagLib::Tag*)mp3.ID3v2Tag(); + tag = mpeg->ID3v1Tag() ? (TagLib::Tag*)mpeg->ID3v1Tag() : (TagLib::Tag*)mpeg->ID3v2Tag(); } - else - tag = f.tag(); - #define bing( x ) TStringToQString( x ).stripWhiteSpace() - m_title = bing( tag->title() ); - m_artist = bing( tag->artist() ); - m_album = bing( tag->album() ); - m_comment = bing( tag->comment() ); - m_genre = bing( tag->genre() ); + else { + fileref = TagLib::FileRef( QFile::encodeName( path ), true, readStyle ); + + if( !fileref.isNull() ) + tag = fileref.tag(); + } + + if( !fileref.isNull() && tag ) { + #define strip( x ) TStringToQString( x ).stripWhiteSpace() + m_title = strip( tag->title() ); + m_artist = strip( tag->artist() ); + m_album = strip( tag->album() ); + m_comment = strip( tag->comment() ); + m_genre = strip( tag->genre() ); m_year = tag->year() ? QString::number( tag->year() ) : QString(); m_track = tag->track() ? QString::number( tag->track() ) : QString(); - #undef bing - - init( f.audioProperties() ); + #undef strip m_isValidMedia = true; + + init( fileref.audioProperties() ); } -/*FIXME disabled for beta4 as it's simpler to not got 100 bug reports - else if( KMimeType::findByUrl( m_url )->is( "audio" ) ) - init( KFileMetaInfo( m_url, QString::null, KFileMetaInfo::Everything ) );*/ + + //FIXME disabled for beta4 as it's simpler to not got 100 bug reports + //else if( KMimeType::findByUrl( m_url )->is( "audio" ) ) + // init( KFileMetaInfo( m_url, QString::null, KFileMetaInfo::Everything ) ); } Thanks Max it's work fine, but some bug exist since adding to configdialog.cpp:95 ... m_opt1->kcfg_RecodeEncoding->insertItem( QString::null ); ... Now the codec index not comply with they name in kcfg_RecodeEncoding Maybe you delete this code or edit the applySettings in app.cpp like ... if ( AmarokConfig::recodeID3v1Tags() ) TagLib::ID3v1::Tag::setStringHandler( new ID3v1StringHandler( AmarokConfig::recodeEncoding()-1 ) ); ... Woah, good point. Sorry. I added the null string so people didn't panic that their tags were being recoded to something weird. And KConfigXT isn't flexible enough to let me set it to utf8 (or I couldn't make it work). Will fix asap. CVS commit by mhowell: * Extremely unfortunate bug that is the consequence of none of the regular developers using this feature at all. For some reason if the user recoded id3v1 tags using QTextCodec #0, it caused a crash. So I just disabled that codec and tested a random selection of the rest, seems to be fixed. * This is also convenient since it means I can get round the KConfigXT-doesn't-like-ComboBoxes bug that has remained unfixed since KDE 3.2 *sigh* BUG:95041 BUG:77710 CCMAIL:amarok-devel@lists.sf.net M +16 -6 app.cpp 1.478 M +6 -4 configdialog.cpp 1.72 --- kdeextragear-1/amarok/src/app.cpp #1.477:1.478 @@ -341,10 +341,8 @@ void App::initGlobalShortcuts() //here and save creating another header/source file combination -class ID3v1StringHandler : public TagLib::ID3v1::StringHandler { +class ID3v1StringHandler : public TagLib::ID3v1::StringHandler +{ QTextCodec *m_codec; -public: - ID3v1StringHandler( int codecIndex ) - : m_codec( QTextCodec::codecForIndex( codecIndex ) ) - {} + virtual TagLib::String parse( const TagLib::ByteVector &data ) const { @@ -356,4 +355,12 @@ public: return TagLib::ByteVector( qcs, qcs.length() ); } + +public: + ID3v1StringHandler( int codecIndex ) + : m_codec( QTextCodec::codecForIndex( codecIndex ) ) + { + debug() << "codec: " << m_codec << endl; + debug() << "codec-name: " << m_codec->name() << endl; + } }; @@ -417,5 +424,8 @@ void App::applySettings( bool firstTime m_pTray->setShown( AmarokConfig::showTrayIcon() ); - if ( AmarokConfig::recodeID3v1Tags() ) + + // we check > 0 because textCodecForIndex( 0 ) crashes amaroK for unknown + // reasons, also now we assign index 0 to "" in the config combobox + if( AmarokConfig::recodeID3v1Tags() && AmarokConfig::recodeEncoding() > 0 ) TagLib::ID3v1::Tag::setStringHandler( new ID3v1StringHandler( AmarokConfig::recodeEncoding() ) ); --- kdeextragear-1/amarok/src/configdialog.cpp #1.71:1.72 @@ -85,5 +85,5 @@ AmarokConfigDialog::AmarokConfigDialog( KTrader::OfferList offers = PluginManager::query( "[X-KDE-amaroK-plugintype] == 'engine'" ); - for ( KTrader::OfferList::ConstIterator it = offers.begin(); it != offers.end(); ++it ) { + for( KTrader::OfferList::ConstIterator it = offers.begin(); it != offers.end(); ++it ) { m_soundSystem->insertItem( (*it)->name() ); // Save name properties in QMap for lookup @@ -92,8 +92,10 @@ AmarokConfigDialog::AmarokConfigDialog( } - // ID3v1 recoding locales - m_opt1->kcfg_RecodeEncoding->insertItem( QString::null ); + // KConfigXT doesn't like Comboboxes, still, 2 versions after + // it was created. How shit. Hence we can't make recodeEncoding + // a string (which we need to because by index is not consistent) QTextCodec *codec; - for ( int i = 0; ( codec = QTextCodec::codecForIndex( i ) ); i++ ) + m_opt1->kcfg_RecodeEncoding->insertItem( QString::null ); + for( int i = 1; (codec = QTextCodec::codecForIndex( i )); i++ ) m_opt1->kcfg_RecodeEncoding->insertItem( codec->name() ); |