Bug 149468 - taglib destroys id3v2.4.0 Tags, by forgetting the null-bytes in Textstrings
Status: RESOLVED NOT A BUG
Alias: None
Product: taglib
Classification: Frameworks and Libraries
Component: general
Version: unspecified
Platform: Gentoo Packages Linux
Importance: NOR normal
Target Milestone: ---
Assignee: Scott Wheeler
Reported: 2007-09-02 09:42 UTC by Mark
Modified: 2007-09-03 15:17 UTC
Description Mark 2007-09-02 09:42:22 UTC
Version:            (using KDE 3.5.7)
Installed from:    Gentoo Packages

I'm using kid3, which uses taglib for ID3v2.4 editing, and this seems to destroy my tags.
According to the specification, frames that contain text have to be null-terminated; however, the text frames created here lack this termination.

From a Hex-Viewer:

49 44 33 04 00 00 00 00 08 6D 54 49 54 32 00 00     ID3......mTIT2..
00 06 00 00 00 54 65 73 74 31 54 50 45 31 00 00     .....Test1TPE1..
00 06 00 00 00 54 65 73 74 32 54 41 4C 42 00 00     .....Test2TALB..
and so on.

As can be seen, the next frame IDs (TPE1, TALB, ...) follow immediately after the text. The necessary null byte as a termination marker is missing.
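
To make the layout in the hex dump easier to check, here is a minimal Python sketch that walks the first two frames. It assumes the tag has no extended header (the flags byte in the dump is 00) and that frame sizes are syncsafe integers, as ID3v2.4 specifies. The names `DUMP`, `syncsafe` and `parse_frames` are made up for this illustration and are not part of taglib or kid3.

```python
# Bytes copied from the hex dump above, up to the end of the TPE1 frame.
DUMP = bytes.fromhex(
    "49 44 33 04 00 00 00 00 08 6D "   # tag header: "ID3", v2.4.0, flags, size
    "54 49 54 32 00 00 00 06 00 00 "   # TIT2 frame header: id, size, flags
    "00 54 65 73 74 31 "               # encoding byte 00 + "Test1"
    "54 50 45 31 00 00 00 06 00 00 "   # TPE1 frame header: id, size, flags
    "00 54 65 73 74 32 "               # encoding byte 00 + "Test2"
)


def syncsafe(raw: bytes) -> int:
    """Decode a syncsafe integer (7 significant bits per byte)."""
    value = 0
    for byte in raw:
        value = (value << 7) | (byte & 0x7F)
    return value


def parse_frames(data: bytes):
    """Yield (frame_id, size, body) for each complete frame in data."""
    if data[:3] != b"ID3" or data[3] != 4:
        raise ValueError("not an ID3v2.4 tag")
    pos = 10  # skip the 10-byte tag header
    while pos + 10 <= len(data):
        frame_id = data[pos:pos + 4].decode("ascii")
        size = syncsafe(data[pos + 4:pos + 8])
        body = data[pos + 10:pos + 10 + size]
        yield frame_id, size, body
        pos += 10 + size


frames = list(parse_frames(DUMP))
for frame_id, size, body in frames:
    print(frame_id, size, body)
```

Each frame reports a size of 6, which is exactly one encoding byte plus the five text bytes, so the frame size itself marks where the text ends and no null byte is included.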
Comment 1 Scott Wheeler 2007-09-03 13:55:47 UTC
I'm afraid you misunderstood the standard.  Null is used as a field separator in a list with multiple values in ID3v2.4, not as a terminator.
Comment 2 Mark 2007-09-03 14:31:21 UTC
Null is also the terminator!

From the id3v2.4.0-structure document:

>Frames that allow different types of text encoding contains a text
>encoding description byte. Possible encodings:

>   $00   ISO-8859-1 [ISO-8859-1]. Terminated with $00.
>   $01   UTF-16 [UTF-16] encoded Unicode [UNICODE] with BOM. All
>         strings in the same frame SHALL have the same byteorder.
>         Terminated with $00 00.
>   $02   UTF-16BE [UTF-16] encoded Unicode [UNICODE] without BOM.
>         Terminated with $00 00.
>   $03   UTF-8 [UTF-8] encoded Unicode [UNICODE]. Terminated with $00.

As can be seen, all strings (which allow an encoding) have to be terminated by a null byte.
This applies to all text frames "T...". Besides, all of these text frames can contain multiple values. If there is only one value (i.e. a list with only one element), this doesn't change the fact that this single element also has to be terminated!
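
The encoding table quoted above can be written down as a small lookup. This is only a sketch of the reading argued in this comment, where every string is written with its terminator; whether text frames actually require it is the point under dispute in this report. `TERMINATORS` and `encode_terminated` are hypothetical names, and the Python codec names are my own mapping of the four encodings.

```python
# Encoding byte -> (Python codec, terminator bytes), following the quoted table.
TERMINATORS = {
    0x00: ("latin-1", b"\x00"),        # ISO-8859-1, terminated with $00
    0x01: ("utf-16", b"\x00\x00"),     # UTF-16 with BOM, terminated with $00 00
    0x02: ("utf-16-be", b"\x00\x00"),  # UTF-16BE without BOM, terminated with $00 00
    0x03: ("utf-8", b"\x00"),          # UTF-8, terminated with $00
}


def encode_terminated(text: str, encoding_byte: int) -> bytes:
    """Return encoding byte + encoded text + the encoding's terminator."""
    codec, terminator = TERMINATORS[encoding_byte]
    return bytes([encoding_byte]) + text.encode(codec) + terminator


print(encode_terminated("Test1", 0x00))
```

Under this reading the TIT2 body from the hex dump would be seven bytes long instead of six.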
Comment 3 Scott Wheeler 2007-09-03 15:17:56 UTC
I'm sorry, you're simply wrong on this.  This has been discussed multiple times on the ID3v2 mailing list.  A null-terminated string in ID3v2.4 is technically incorrect (though many implementations write one) as it represents a list of ("value", "").  This is a change from the ID3v2.3 semantics where the terminator was explicitly optional.
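
The ("value", "") reading can be illustrated with a short Python sketch that treats the null byte purely as a separator. It only handles the single-byte encodings $00 and $03 (UTF-16 would need splitting on aligned two-byte nulls), and `split_values` is a hypothetical helper, not a taglib function.

```python
def split_values(body: bytes) -> list:
    """Split a text frame body into values, treating null as a separator."""
    encoding_byte, payload = body[0], body[1:]
    # Only the single-byte-terminator encodings are covered in this sketch.
    codec = {0x00: "latin-1", 0x03: "utf-8"}[encoding_byte]
    return [part.decode(codec) for part in payload.split(b"\x00")]


print(split_values(b"\x00Test1"))         # body as written in the hex dump
print(split_values(b"\x00Test1\x00"))     # same body with a trailing null
print(split_values(b"\x00Rock\x00Pop"))   # a genuine two-value list
```

With a trailing null the split produces a second, empty element, which is why a separator-based parser sees two values rather than one terminated string.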

The text which you quoted explains what the termination character is when strings are terminated (since it's encoding dependent), but if you read through the rest of the spec you'll notice that wherever terminators are needed, they are explicitly listed in the byte listings.