Bug 304015

Summary: Corrupted FLAC files scan can result in heavy CPU consumption
Product: [Frameworks and Libraries] taglib Reporter: arnaud.bienner
Component: generalAssignee: Scott Wheeler <wheeler>
Status: RESOLVED FIXED    
Severity: major CC: adaptee, arnaud.bienner, info, lalinsky
Priority: NOR    
Version: 1.7   
Target Milestone: ---   
Platform: unspecified   
OS: Linux   
URL: http://code.google.com/p/clementine-player/issues/detail?id=3069
Latest Commit: Version Fixed In:
Attachments: Consider FLAC file as being invalid if a 0 length block is found
Faster FLAC::FilePrivate destructor

Description arnaud.bienner 2012-07-24 17:16:30 UTC
In Clementine (http://www.clementine-player.org/) we use TagLib to read files tags.

For corrupted FLAC, TagLib hangs (corrupted files are attached in our bug report)

After looking at the code, the problem seems to be that TagLib is trying to read invalid data:
https://github.com/taglib/taglib/blob/master/taglib/flac/flacfile.cpp#L422
Something like data block with only 0 set. So |isLastBlock| is never set, and length is 0.
So TagLib try to read a 0 length block, which doesn't raise any error, and then allocates a new UnknownMetadataBlock here:
https://github.com/taglib/taglib/blob/master/taglib/flac/flacfile.cpp#L461
Then it goes to the next header block (actually what it guesses is the next header block).

Finally, it takes lot of time (i.e. time to read all the file) to realize that the FLAC file is invalid.
Afterwards, it spends lot of in the destructor deleting each block (and there is lot of blocks to delete at the end).

By the way, the destructor can be optimized by doesn't computing |size()| each time.
In my case, |size| on is linear, not constant. Maybe it is because I compiled TagLib in debug mode and that some optimizations weren't set in that case, or because the underlying data structure is a std::list, but it's actually slow down TagLib a lot.
However, in valid usecases, I think there are not so many blocks, so this may sounds superfluous. 

But IMHO a metadata block with 0 length doesn't make sense, so if we find one, probably something is wrong with this file and we should consider it as invalid.

I wrote a patch which consider a file as invalid if it finds a 0 length block.
TagLib tests are still OK with my patch, and my personal 70 flacs files (valid) give the same result with or without the patch, so I believe it doesn't break anything.

Reproducible: Always
Comment 1 arnaud.bienner 2012-07-24 17:21:32 UTC
Created attachment 72730 [details]
Consider FLAC file as being invalid if a 0 length block is found

I preferred to check the length at the same time we check the size is OK.
I believe it's not a big deal to read a 0 length block before, as it doesn't break anything.
If it's not OK for you, this can be changed by having a extra if block for length == 0 with debug/setValid/return inside; before actually reading the block.
Comment 2 arnaud.bienner 2012-07-24 17:26:20 UTC
Created attachment 72731 [details]
Faster FLAC::FilePrivate destructor

As explained before, this speed up destructor for me.
Not sure it always speed up, but I think it never slow down.
Comment 3 arnaud.bienner 2012-07-24 17:37:04 UTC
If you agree with my changes and it's easier for you, I can make a pull request on GitHub.
Comment 4 Tim De Baets 2012-09-30 09:16:07 UTC
Not sure why no one has chimed in yet. But yes, please submit a pull request on GitHub. This sounds like an issue in TagLib that should be fixed.
Comment 5 arnaud.bienner 2012-10-07 01:17:00 UTC
OK.
I've submitted a pull request, as suggested: https://github.com/taglib/taglib/pull/73 
Hope someone will have some time to review it.