Bug 280131 - Taglib String internal format not guaranteed UTF16BE on LE platform
Summary: Taglib String internal format not guaranteed UTF16BE on LE platform
Status: RESOLVED WORKSFORME
Alias: None
Product: taglib
Classification: Frameworks and Libraries
Component: general
Version: 1.7
Platform: Unlisted Binaries Microsoft Windows
Importance: NOR normal
Target Milestone: ---
Assignee: Scott Wheeler
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2011-08-15 16:50 UTC by fbeguec
Modified: 2023-01-19 05:15 UTC

See Also:
Latest Commit:
Version Fixed In:
Sentry Crash Report:


Description fbeguec 2011-08-15 16:50:37 UTC
Version:           1.7
OS:                MS Windows

The String::prepare(Type t) function checks the BOM incorrectly when the type passed is UTF16:
if(d->data.size() >= 1 && (d->data[0] == 0xfeff || d->data[0] == 0xfffe)) {
      bool swap = d->data[0] != 0xfeff;
If the input data is little-endian on a little-endian system, no swap is performed: on an LE system, a 16-bit value whose bytes in memory are [byte 0 = 0xFF, byte 1 = 0xFE] is read as 0xFEFF.
As a result, with this code the internal representation has the same endianness as the platform it runs on, whereas the documentation says that the internal representation of TagLib::String is UTF16BE.
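
To make the point above concrete, here is a minimal sketch, not TagLib code, of how the same two BOM bytes read as a different 16-bit value depending on host byte order. The helper names (loadNative, hostIsLittleEndian, swapPerQuotedCheck) are made up for illustration; only the comparison against 0xFEFF mirrors the quoted line.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Read two bytes as one 16-bit value in the host's native byte order,
// the way a 16-bit wchar_t buffer filled from raw bytes would be read.
std::uint16_t loadNative(const unsigned char* bytes) {
    assert(bytes != nullptr);
    std::uint16_t value;
    std::memcpy(&value, bytes, sizeof value);
    return value;
}

// True when the host stores the low-order byte first.
bool hostIsLittleEndian() {
    const unsigned char probe[2] = {0x01, 0x00};
    return loadNative(probe) == 0x0001;
}

// Same comparison as the quoted line (hypothetical wrapper):
// swap unless the first code unit reads as 0xFEFF.
bool swapPerQuotedCheck(std::uint16_t firstUnit) {
    return firstUnit != 0xFEFF;
}
```

On a little-endian host, loadNative applied to the bytes FF FE returns 0xFEFF, so swapPerQuotedCheck returns false and little-endian input is kept unswapped.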


Reproducible: Always

Steps to Reproduce:
Store a little-endian string with a BOM prefix in a wchar_t buffer and construct a new TagLib::String object from it, specifying the UTF16 type.


Actual Results:  
When tracing in a debugger, you'll notice that the data is not converted as the documentation says it should be: the internal data buffer remains UTF16LE.

Expected Results:  
The bytes of each wchar_t input element should be swapped.

A fix could be, in tstring.cpp:
void String::prepare(Type t)
  switch(t) {
  case UTF16:
  {
    if(d->data.size() >= 1 && (d->data[0] == 0xfeff || d->data[0] == 0xfffe)) {
      bool swap = d->data.data()[0] == '\xff';
Comment 1 fbeguec 2011-08-15 17:09:49 UTC
The fix I proposed doesn't work; it should instead be:
void String::prepare(Type t)
  switch(t) {
  case UTF16:
  {
    if(d->data.size() >= 1 && (d->data[0] == 0xfeff || d->data[0] == 0xfffe)) {
      bool swap = ((const char*)(d->data.c_str()))[0] == '\xff';
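
The idea behind this check can be sketched outside TagLib as follows. This is an illustration under assumptions, not the library's implementation: it uses std::vector<std::uint16_t> in place of the internal wstring (so that it does not depend on the width of wchar_t), and every name in it (Units, Bytes, unitsFromBytes, bytesOf, byteSwap16, prepareUtf16) is hypothetical. Input without a BOM is simply returned unchanged here.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>
#include <vector>

using Units = std::vector<std::uint16_t>;
using Bytes = std::vector<unsigned char>;

// Copy raw bytes into 16-bit units without changing their order in memory.
Units unitsFromBytes(const Bytes& bytes) {
    assert(bytes.size() % 2 == 0);
    Units units(bytes.size() / 2);
    if (!bytes.empty())
        std::memcpy(units.data(), bytes.data(), bytes.size());
    return units;
}

// Copy 16-bit units back out as the bytes they occupy in memory.
Bytes bytesOf(const Units& units) {
    Bytes bytes(units.size() * 2);
    if (!units.empty())
        std::memcpy(bytes.data(), units.data(), bytes.size());
    return bytes;
}

// Exchange the two bytes of one 16-bit code unit.
std::uint16_t byteSwap16(std::uint16_t v) {
    return static_cast<std::uint16_t>((v << 8) | (v >> 8));
}

// Hypothetical stand-in for the UTF16 branch with the proposed check:
// decide from the first byte in memory, not from the native 16-bit value.
Units prepareUtf16(Units units) {
    if (units.empty() || (units[0] != 0xFEFF && units[0] != 0xFFFE))
        return units;  // no BOM: left untouched in this sketch
    unsigned char firstByte;
    std::memcpy(&firstByte, units.data(), 1);
    const bool swap = (firstByte == 0xFF);  // bytes FF FE: little-endian input
    units.erase(units.begin());             // drop the BOM
    if (swap)
        for (auto& u : units)
            u = byteSwap16(u);
    return units;
}
```

With this check, both a buffer holding the bytes FF FE 41 00 and one holding FE FF 00 41 end up as the bytes 00 41 in memory, whatever the host byte order.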


Comment 2 Lukáš Lalinský 2011-08-15 17:34:25 UTC
I'd say the correct fix is to change the documentation.
Comment 3 fbeguec 2011-08-16 06:28:33 UTC
(In reply to comment #2)
> I'd say the correct fix is to change the documentation.

Well, I'd say that wouldn't be consistent: if you use the same String constructor with type t = UTF16LE, the bytes are indeed swapped, so that this time they are stored internally as UTF16BE.
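
The inconsistency can be illustrated with a small model. It is a sketch under assumptions and not TagLib's code: modelUtf16 applies the BOM comparison quoted in the description, modelUtf16LE swaps every unit as described in this comment, and all names are hypothetical.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>
#include <vector>

using Units = std::vector<std::uint16_t>;

// Exchange the two bytes of one 16-bit code unit.
std::uint16_t byteSwap16(std::uint16_t v) {
    return static_cast<std::uint16_t>((v << 8) | (v >> 8));
}

// True when the host stores the low-order byte first.
bool hostIsLittleEndian() {
    const std::uint16_t one = 1;
    unsigned char first;
    std::memcpy(&first, &one, 1);
    return first == 1;
}

// Copy raw bytes into 16-bit units without changing their order in memory.
Units unitsFromBytes(const std::vector<unsigned char>& bytes) {
    assert(!bytes.empty() && bytes.size() % 2 == 0);
    Units units(bytes.size() / 2);
    std::memcpy(units.data(), bytes.data(), bytes.size());
    return units;
}

// Model of the UTF16 path with the quoted check: the BOM, compared as a
// native 16-bit value, decides whether the remaining units are swapped.
Units modelUtf16(Units units) {
    assert(!units.empty() && (units[0] == 0xFEFF || units[0] == 0xFFFE));
    const bool swap = units[0] != 0xFEFF;
    units.erase(units.begin());
    if (swap)
        for (auto& u : units)
            u = byteSwap16(u);
    return units;
}

// Model of the UTF16LE path as described in this comment: always swap.
Units modelUtf16LE(Units units) {
    for (auto& u : units)
        u = byteSwap16(u);
    return units;
}
```

Feeding the same little-endian text to both paths, as FF FE 41 00 to modelUtf16 and as 41 00 to modelUtf16LE, gives different stored units on a little-endian host and identical ones on a big-endian host.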
Comment 4 Andrew Crouthamel 2018-11-06 15:10:25 UTC
Dear Bug Submitter,

This bug has been stagnant for a long time. Could you help us out and re-test if the bug is valid in the latest version? I am setting the status to NEEDSINFO pending your response, please change the Status back to REPORTED when you respond.

Thank you for helping us make KDE software even better for everyone!
Comment 5 Andrew Crouthamel 2018-11-18 03:30:55 UTC
Dear Bug Submitter,

This is a reminder that this bug has been stagnant for a long time. Could you help us out and re-test if the bug is valid in the latest version? This bug will be moved back to REPORTED Status for manual review later, which may take a while. If you are able to, please lend us a hand.

Thank you for helping us make KDE software even better for everyone!
Comment 6 Justin Zobel 2022-12-20 22:51:40 UTC
Thank you for reporting this issue in KDE software. As it has been a while since this issue was reported, can we please ask you to see if you can reproduce the issue with a recent software version?

If you can reproduce the issue, please change the status to "REPORTED" when replying. Thank you!
Comment 7 Bug Janitor Service 2023-01-04 05:25:31 UTC
Dear Bug Submitter,

This bug has been in NEEDSINFO status with no change for at least
15 days. Please provide the requested information as soon as
possible and set the bug status as REPORTED. Due to regular bug
tracker maintenance, if the bug is still in NEEDSINFO status with
no change in 30 days the bug will be closed as RESOLVED > WORKSFORME
due to lack of needed information.

For more information about our bug triaging procedures please read the
wiki located here:
https://community.kde.org/Guidelines_and_HOWTOs/Bug_triaging

If you have already provided the requested information, please
mark the bug as REPORTED so that the KDE team knows that the bug is
ready to be confirmed.

Thank you for helping us make KDE software even better for everyone!
Comment 8 Bug Janitor Service 2023-01-19 05:15:52 UTC
This bug has been in NEEDSINFO status with no change for at least
30 days. The bug is now closed as RESOLVED > WORKSFORME
due to lack of needed information.

For more information about our bug triaging procedures please read the
wiki located here:
https://community.kde.org/Guidelines_and_HOWTOs/Bug_triaging

Thank you for helping us make KDE software even better for everyone!