Summary: | [patch] Support for ReplayGain information | ||
---|---|---|---|
Product: | [Frameworks and Libraries] taglib | Reporter: | Martijn Pieters <bugs.kde.org> |
Component: | general | Assignee: | Scott Wheeler <wheeler> |
Status: | CONFIRMED --- | ||
Severity: | wishlist | CC: | moritz-kdebugs |
Priority: | NOR | ||
Version: | unspecified | ||
Target Milestone: | --- | ||
Platform: | Compiled Sources | ||
OS: | Linux | ||
Latest Commit: | Version Fixed In: | ||
Attachments: | Proposed patch adding ReplayGain support; Updated patch with doxygen problems fixed; Updated patch with replaygain data format changes and dummied-out MPC APE support | ||
Description
Martijn Pieters
2004-07-14 00:51:27 UTC
Created attachment 6669
Proposed patch adding ReplayGain support

Hmmm, forgot to say the proper magic word for the related bugs. Related bugs: bug #56383, bug #78444 and bug #72558.

Comment on attachment 6669
Proposed patch adding ReplayGain support
Oops, this patch contains small doxygen errors (how the hell did I ever think
the | pipe symbol was supposed to work?)
Created attachment 6674
Updated patch with doxygen problems fixed.
Only changes are doxygen (comment) changes; use backslash instead of a pipe
symbol to mark arguments, and remove an improper \overload argument.
Comment on attachment 6674
Updated patch with doxygen problems fixed.
The replaygain data format doesn't handle negatives correctly; no support for
the new MPC code. New patch underway.
Created attachment 6721
Updated patch with replaygain data format changes and dummied-out MPC APE support.

Now the ReplayGain data format is split out into a separate class, which can be reused by the Lame MP3 info tag support (the data format is the same). Also, since MPC has just been added, the patch now includes dummied-out support for the APE::Tag class, which I am looking at reusing for MP3 (see also my comments on that code at http://lists.kde.org/?l=kde-cvs&m=109018198919187).

TODO: Lame support (now much easier), APEv2 support (possibly easier), MPC support (I believe the format has its own ReplayGain data format in the header).

Great work! I'm also lobbying for replaygain (including mp3 replaygain) support in Linux tools, and TagLib is where it can start to happen.

As wheeler said (http://bugs.kde.org/show_bug.cgi?id=85551), it seems any change to the global Tag interface will have to wait until TagLib 2.0. That leaves some time to refine the implementation of replaygain support.

I have a few comments on your patch:

1. Why floats instead of double? I have learnt that double should be used *unless* you have a reason for float, not the other way round. In fact, float might not be precise enough for conveying a 5-digit replaygain string, and it is not inconceivable to find such tags (I once read that some people want super-precise gain tags).

2. Returning 0.0 in the absence of a replaygain tag is not a good solution. There is a small but non-zero chance that a track's replaygain could be exactly "0.00 dB" (with 2-decimal precision, it can happen). Players should handle tracks with no replaygain by applying a (possibly user-supplied) default gain, and good default values are more like "-10.00 dB" than 0.00 (because most tracks' replaygains lie in [-11, -3], and I'd rather hear the non-replaygained tracks muffled than deafeningly loud, so the default value should be picked in the upper range of usual replaygain values).

   So it is important to tell the difference between "no replaygain tag" and a "0.00 dB" replaygain tag. Either pick a really impossible replaygain value (-infty or such), or supply additional functions to query the presence of each replaygain tag (IMO the best solution).

3. RVA2 or RGAD or APEv2: I've followed several discussions about replaygain on hydrogenaudio.org, where the authors of mp3gain, of the replaygain system, of lame, of foobar2000, and some vorbis developers come to chat, and it seems they all agree on APEv2 tags. IMO, that's because it's more flexible and homogeneous with vorbis comments, but they might have different reasons for their opinions. Anyway, you can ask for their advice on the subject; they're almost the only authors of replaygain-with-mp3 tools, so their consensus is something of a de facto standard.

> 1. Why floats instead of double? I have learnt that double should be
> used *unless* you have a reason for float, not the other way round. In
> fact, float might not be precise enough for conveying a 5-digit replaygain
> string, and it is not inconceivable to find such tags (I once read that some
> people want super-precise gain tags).

A single-precision float mantissa can handle up to 23 bits of precision, which is more than enough for gain adjustments; the human ear can only discern volume differences of about 0.5 dB, as I understand it. Also note that:

- VorbisGain and MP3gain both use single-precision floats to calculate the gain, so they certainly won't produce more precision. VorbisGain then only stores two digits of precision, probably because of the 0.5 dB human resolution.
- The RVA2 tag only stores 16 bits of information (signed integer, to be divided by 512).
- The RGAD frame (original ReplayGain specification) and Lame headers store only 1 digit of precision; 9 bits of unsigned integer, to be divided by 10 when reading.

> 2. Returning 0.0 in the absence of a replaygain tag is not a good solution.
> There is a small but non-zero chance that a track's replaygain could be
> exactly "0.00 dB" (with 2-decimal precision, it can happen). Players should
> handle tracks with no replaygain by applying a (possibly user-supplied)
> default gain, and good default values are more like "-10.00 dB" than 0.00
> (because most tracks' replaygains lie in [-11, -3], and I'd rather hear the
> non-replaygained tracks muffled than deafeningly loud, so the default value
> should be picked in the upper range of usual replaygain values).
>
> So it is important to tell the difference between "no replaygain tag" and
> a "0.00 dB" replaygain tag.
>
> Either pick a really impossible replaygain value (-infty or such), or
> supply additional functions to query the presence of each replaygain tag
> (IMO the best solution).

Good point; VorbisGain uses -1000. for 'no gain' and -1. for 'no peak'. I may have to adopt these, or use -inf or NaN. I see no elegant way of using anything but a floating point number. I must say that NaN has my preference, if I can figure out how to state it as a literal.

> 3. RVA2 or RGAD or APEv2: I've followed several discussions about
> replaygain on hydrogenaudio.org, where the authors of mp3gain, of the
> replaygain system, of lame, of foobar2000, and some vorbis developers
> come to chat, and it seems they all agree on APEv2 tags. IMO, that's
> because it's more flexible and homogeneous with vorbis comments, but they
> might have different reasons for their opinions. Anyway, you can ask for
> their advice on the subject; they're almost the only authors of
> replaygain-with-mp3 tools, so their consensus is something of a de facto
> standard.

I was planning on asking them.
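The fixed-point encodings and the NaN idea mentioned in the reply above can be sketched roughly as follows. This is only an illustration, not code from the attached patch: the function names (`rva2ToDb`, `rgadToDb`, `noGain`, `hasGain`) are hypothetical, and the RGAD sign is assumed to have been extracted by the caller and passed as a separate flag. NaN is obtained through `std::numeric_limits` rather than written as a literal.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>
#include <limits>

// RVA2 as described in the thread: signed 16-bit integer, divided by 512.
inline float rva2ToDb(std::int16_t raw)
{
    return raw / 512.0f;
}

// RGAD / Lame as described in the thread: 9-bit unsigned magnitude in
// tenths of a dB. ASSUMPTION: the caller supplies the sign separately.
inline float rgadToDb(unsigned magnitude, bool negative)
{
    assert(magnitude < 512u); // must fit in 9 bits
    const float db = magnitude / 10.0f;
    return negative ? -db : db;
}

// Hypothetical "no ReplayGain stored" marker: a quiet NaN, which cannot be
// confused with a legitimate 0.00 dB gain.
inline float noGain()
{
    return std::numeric_limits<float>::quiet_NaN();
}

// NaN never compares equal to anything, so test for it with std::isnan.
inline bool hasGain(float db)
{
    return !std::isnan(db);
}
```

With these helpers, `hasGain(0.0f)` is true while `hasGain(noGain())` is false, which is the distinction between a "0.00 dB" tag and a missing tag asked for in point 2. A separate presence-query function on the tag class would be the alternative design suggested there.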
My current plans are:

- Read from APEv2, ID3v2 RVA2, ID3v2 RGAD, Lame (in that order of preference).
- Write to APEv2, ID3v2 RVA2, ID3v2 RGAD, if any of those are present. If none of these are present, add an RVA2 frame.

Implementors can influence that policy by ensuring that certain tag formats are present in the MPEG file.

Hmm, with my tired head I hadn't actually finished my 'Why a float' argument: if the various formats don't support the precision, and the human ear doesn't need the precision, why use a double? The extra information will get lost when written out to a file, and will make no difference anyway.

CVS commit by mpieters:

First cut at ReplayGain support in taglib. THIS IS NOT YET COMPLETE. See bug #85134; development will now continue in this branch instead of through patches attached to the bug.

CCMAIL: 85134@bugs.kde.org

A mpeg/replaygaindataformat.cpp 1.1.2.1 [LGPL (v2.1)]
A mpeg/replaygaindataformat.h 1.1.2.1 [LGPL (v2.1)]
A mpeg/id3v2/frames/replaygainframe.cpp 1.1.2.1 [LGPL (v2.1)]
A mpeg/id3v2/frames/replaygainframe.h 1.1.2.1 [LGPL (v2.1)]
M +17 -1 tag.cpp 1.9.6.1
M +48 -0 tag.h 1.10.6.1
M +48 -0 bindings/c/tag_c.cpp 1.8.2.1
M +45 -0 bindings/c/tag_c.h 1.7.2.1
M +6 -0 examples/tagreader.cpp 1.7.6.1
M +4 -0 examples/tagreader_c.c 1.3.6.1 [POSSIBLY UNSAFE: printf]
M +24 -0 examples/tagwriter.cpp 1.1.6.1
M +78 -0 flac/flactag.h 1.1.6.1
M +44 -0 mpc/apetag.cpp 1.4.2.1
M +8 -0 mpc/apetag.h 1.3.2.1
M +2 -2 mpeg/Makefile.am 1.6.6.1
M +78 -0 mpeg/mpegfile.cpp 1.43.2.1
M +44 -0 mpeg/id3v1/id3v1tag.cpp 1.23.4.1
M +8 -0 mpeg/id3v1/id3v1tag.h 1.14.4.1
M +16 -5 mpeg/id3v2/id3v2frame.cpp 1.24.4.1
M +10 -0 mpeg/id3v2/id3v2frame.h 1.22.4.1
M +4 -0 mpeg/id3v2/id3v2framefactory.cpp 1.24.2.1
M +89 -0 mpeg/id3v2/id3v2tag.cpp 1.45.2.1
M +8 -0 mpeg/id3v2/id3v2tag.h 1.29.2.1
M +3 -0 mpeg/id3v2/frames/Makefile.am 1.11.2.1
M +83 -0 ogg/xiphcomment.cpp 1.14.2.1
M +8 -0 ogg/xiphcomment.h 1.14.2.1
M +49 -1 tests/toolkit-test.cpp 1.13.2.1
M +111 -0 toolkit/tbytevector.cpp 1.45.2.1
M +22 -0 toolkit/tbytevector.h 1.33.2.1
M +62 -1 toolkit/tstring.cpp 1.46.2.1
M +14 -0 toolkit/tstring.h 1.36.2.1

> Hmm, with my tired head I hadn't actually finished my 'Why a float'
> argument: if the various formats don't support the precision, and the
> human ear doesn't need the precision, why use a double? The extra
> information will get lost when written out to a file, and will make no
> difference anyway.

The string formats do support more precision, e.g. "-0.12345dB". Granted, mp3gain only writes 2 digits, but foobar writes more, and a user can very well store his own value if he wants to.

So the situation is:

- use double: no benefit, no drawback
- use float: no benefit, no drawback in usages we can think of.

In such a situation I pick double; as I said, it's the usual choice when both can be used with no apparent preference.

Well, the way I see it, the question is not "what's the benefit of double in this case?" but rather "what's the benefit of float?" None. If speed matters, getting a double RG value doesn't mean you can't then multiply the PCM samples with float precision.

BTW, in the case of 24-bit audio (e.g. high-quality FLAC), you can't express bit-precise rescale factors with a float. It might not make any audible difference, but it's still better if we can provide it with no drawback.

But there *is* a drawback to using double: not all TagLib formats support that precision. So, upon writing out the value to the files, precision will be lost.

I don't see how it's a drawback that double beats the precision possible in some formats. If the internal values are more precise than what will eventually be stored in the file, fine, where's the problem? If the API uses less precision than needed, it introduces an unjustified limit. If it uses more, it's only a waste of 4 bytes in a function call. What would be a drawback is using less precision in the API than is possible in some formats, not the other way round.
Though, I agree float or double doesn't make any difference in the expected usage. I just think it would be sad to arbitrarily limit precision at the API level when someone might need more (the only example I can see is bit-precise rescaling of 24-bit files, and I don't see why anyone would use that, but there are a lot of audio fanatics out there, with some weird requirements sometimes).

And even if we can't really claim that choosing double brings any advantage, it's the type to choose unless there are reasons for single precision. Since there is still time to change the interface to double, I just want to make sure float won't be set in stone without double even being considered.

In the original bug report you said:

> I'd prefer to use the ID3v2 RVA2 frames instead; but again, no tools
> generate these frames yet. RVA2 frames can support the exact same
> information. There is a bug in the XMMS bugzilla that proposes the exact
> same frame for the information, for example.

This is not true: normalize (http://www1.cs.columbia.edu/~cvaill/normalize/) writes RVA2 frames and even provides an xmms plugin that honors these frames and normalizes the files during playback. I'm in the process of providing such a plugin for noatun.

Sincerely,
Felix Berger

*** This bug has been confirmed by popular vote. ***

Is TagLib actually still supported?

You mean maintained? Yes, though somewhat less actively than would be ideal. ;-)

Whatever happened with ReplayGain support? Was it abandoned? Or is there somewhere in the SVN tree where I can find the last work that was done on it?

Amarok 2.1 now has built-in replaygain support. I do not know whether that implies that taglib has been updated. If not, maybe the code can be taken from there?
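
Returning to the float-versus-double question earlier in the thread: one way to check the five-digit concern is to parse a gain string into each type and print it back with the same number of decimals. The sketch below is only an illustration; the helper name `reprint` is hypothetical and not part of TagLib, and it assumes the default "C" locale for number parsing.

```cpp
#include <cassert>
#include <cstdio>
#include <cstdlib>
#include <string>

// Hypothetical helper: parse the leading number of a gain string such as
// "-0.12345 dB", store it in type T (float or double), then print it back
// with the requested number of decimals.
template <typename T>
std::string reprint(const std::string &text, int decimals)
{
    assert(decimals >= 0 && decimals <= 9);

    // strtod stops at the first character that is not part of the number,
    // so a trailing " dB" suffix is ignored.
    const T value = static_cast<T>(std::strtod(text.c_str(), nullptr));

    char buffer[64];
    std::snprintf(buffer, sizeof buffer, "%.*f", decimals,
                  static_cast<double>(value));
    return std::string(buffer);
}
```

In this sketch both `reprint<float>("-0.12345 dB", 5)` and `reprint<double>("-0.12345 dB", 5)` give back `-0.12345`, so at that magnitude single precision keeps all five decimals. Larger magnitudes or longer digit strings would need to be checked separately before drawing a conclusion for the API.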