Bug 131353 - ATF can corrupt ogg files
Summary: ATF can corrupt ogg files
Status: RESOLVED FIXED
Alias: None
Product: amarok
Classification: Applications
Component: general (show other bugs)
Version: unspecified
Platform: Debian stable Linux
: NOR critical
Target Milestone: ---
Assignee: Jeff Mitchell
URL:
Keywords:
: 131941 (view as bug list)
Depends on:
Blocks:
 
Reported: 2006-07-25 17:04 UTC by Marcel Dischinger
Modified: 2006-08-31 21:44 UTC (History)
6 users (show)

See Also:
Latest Commit:
Version Fixed In:
Sentry Crash Report:


Attachments
Relevant ATF Ogg-handling code (1.36 KB, text/plain)
2006-08-10 15:08 UTC, Jeff Mitchell
Details

Note You need to log in before you can comment on or make changes to this bug.
Description Marcel Dischinger 2006-07-25 17:04:57 UTC
Version:           1.4-svn (using KDE KDE 3.5.0)
Installed from:    Debian stable Packages
Compiler:          gcc (GCC) 3.3.5 (Debian 1:3.3.5-13) 
OS:                Linux

This might be a corner case, but it should be investigated to prevent amarok from doing harmful things.

I used the latest SVN version from 2006-07-24. Due to a bug amarok performed the ATF scan after starting up although it was disabled in the config. As I do not want to use this feature at the moment, I closed amarok immediately again without waiting for the collection scan and ATF additions to finish.
On another start amarok even crashed while scanning the collection and adding ATF to the collection's files (the crash wash due to another failure you do not need to bother about, as it was caused by myself).

Now I ended up with a couple of files, surprisingly all of them ogg files, being corrupted - which means they are not playable by any application I have here, also tag editors cannot read the header anymore. Also amarok adds these files to the collection but cannot play them. I believe the corrupted files are those ATF worked on just before I closed amarok or amarok crashed respectively.

As I believe this is caused by my "irregular" behavior, amarok should be much more paranoid when writing to a user's files when doing such changes. Especially it should never start changing files without prompting the user to avoid any harm (as it is probably infeasible to first make a working copy which is written back on success).
I also think that looking again into the code and think about what could go wrong to prevent any damage would be good.

However, ATF is a innovative feature, great work.

P.S. If you need to see such a file, I can mail it to you as I should not post it here due to legal issues.
Comment 1 Marcel Dischinger 2006-07-26 08:30:45 UTC
Just to add, there are about 60 ogg files broken at my side. That's really bad and so far I have not found a way to repair them as I believe it's just the header messed up by amarok.
Comment 2 Thomas Stuart 2006-08-08 23:56:46 UTC
Yes, I have exactly the same problem, and many more ogg files broken. If you find a way to repair the files, please let me know.
Comment 3 Marcel Dischinger 2006-08-10 07:41:18 UTC
I am not entirely sure what is damaged, it's a little mixed-up. 
vorbiscomment works and seems to provide the correct info. 
ogginfo will say, that it cannot decode the vorbis header packet.
file will tell the right things (like bitrate etc)
I cannot load the file into tagtool
So far I have not found a player that will play these files - I guess this is mostly caused by the usage of the xiph libraries, so they all use the same lib to decode.

I am no expert in these things, and just naively replacing or cutting of header parts does not work.
I guess you would need some more insights into the ogg/vorbis codec to fix this or have a clear idea what went wrong.
Another chance would be to find a player that would still play these corrupted files, thus you can reencode them.

I have backups for all of my corrupted files, so I will not spend more time trying to fix the broken files. Sorry.
Comment 4 Jeff Mitchell 2006-08-10 15:06:32 UTC
There are a bunch of things I don't understand about this bug.  Read on, and please understand that I'm not implying you don't have corrupted OGG files, I just don't currently see how ATF could have caused it.  If you did actually have ATF writing enabled, and it was the cause of your corrupted files, you probably are seeing a TagLib bug -- so both of you please write back and post the version of TagLib you were using at the time you saw these problems.  Thomas, were you using Debian Stable as well?

You say that "Due to a bug amarok performed the ATF scan after starting up although it was disabled in the config"...the only way this can happen is if you enabled ATF for the first time (at least, the first time since first-time detection was added :-) ), and then during the scan you crashed.  At which point when you start up again it will restart the collection scan.  If as you say ATF was disabled in the config, what most likely happened is that there was a database version bump between when you SVN-upped last time and this time, which always causes a rescan on startup.

Also, you claim that during the scan, even though ATF was disabled in your config, it wrote tags anyways.  I have a hard time believing this.  I'll attach the relevant code sections in a minute, but you'll see that both write operations that can be performed are checked by AmarokConfig::advancedTagFeatures() which is a run-time check to see if ATF is turned on -- which is controlled by the checkbox in the Config dialog.  If you want to verify this, change the value of the checkbox, quit amarok (to make sure the settings are saved to amarokrc instead of cached), and look in your ~/.kde/share/apps/amarok/amarokrc for a value called AdvancedTagFeatures.  When the box is checked, it should be set to True; when the box is not checked it should be set to False or not even exist in the amarokrc file.

Check the attached code.  You'll see that the removeField call to the tag and the addField call, as well as the call to save(), are protected by conditions that include AmarokConfig::advancedTagFeatures().  I've yet to see this check fail somehow -- so either ATF was truly off and your troubles lie elsewhere, or ATF was actually on and the trouble probably lies with TagLib.

My hunch is that you're using TagLib 1.3.1.  We'll find out if I'm right.
Comment 5 Jeff Mitchell 2006-08-10 15:08:18 UTC
Created attachment 17331 [details]
Relevant ATF Ogg-handling code
Comment 6 Thomas Stuart 2006-08-10 20:13:44 UTC
I was using SuSE 10.0. I had missed the part in the bug report about ATF being disabled; it was enabled for me, but I did the same as Marcel (quit half-way through the tag-writing process).

I've since replaced my SuSE installation with Kubuntu dapper, so I don't have a quick way of checking which version of taglib I was using. A quick glance at the rpm list at http://www.tuxmachines.org/node/2943 suggests that the Taglib version was 1.4, though.

Would it help to provide a copy of an affected file?

Would it help to upload one of the affected files?
Comment 7 Martin Aumueller 2006-08-10 20:36:50 UTC
#6: It was version 1.4 of taglib, that's for sure, as Amarok 1.4* does not even compile with 1.3.
Comment 8 Jeff Mitchell 2006-08-10 22:49:45 UTC
Really?  I was hoping that was the issue, as Debian Stable uses 1.3.1, which would have been a likely candidate for being the problem.

Regardless, I'm going to bring in Scott Wheeler on this one, unless anyone thinks that the code I pasted above is at fault.  It seems like maybe writing the XiphComments or the tags themselves on Oggs have an issue.
Comment 9 Marcel Dischinger 2006-08-11 07:28:16 UTC
Same here, Taglib 1.4 (from backports.org, 1.3 is not supported by amarok, thus it will not compile as Thomas already said).

I think the problem with the automated scan and adding ATF (I clearly have ATF tags in the corrupted and not corrupted (yes, they exist!) files) was a regression in one specific SVN revision, never appeared again since then.
Funnily, I have not found a single mp3 (which the majority of my music is encoded in) with ATF tag yet.
Comment 10 Scott Wheeler 2006-08-11 11:21:52 UTC
For me seeing one of the corrupted files would be very useful.  Also knowing if there are any interesting platform details (64-bit or PPC or whatever, gcc versions) would be interesting.

There just haven't been any significant bugs reported against the Ogg code since like 1.1 or so (and it hasn't really changed either) -- so platform-ish things could be interesting to see if something's triggering a bug somewhere.
Comment 11 Thomas Stuart 2006-08-11 12:13:29 UTC
I've uploaded an example file here:

http://therye.org/users/mortice/test/example.ogg

I'm on a plain old Athlon XP setup - no 64-bit stuff here. I'm afraid, as I said earlier, I have no way of knowing for sure which versions of software I was running on SuSE, so I don't know which gcc version it was.
Comment 12 Jeff Mitchell 2006-08-11 16:20:17 UTC
vorbiscomment shows all the tags on the corrupted oggs just fine; ogginfo blows its chunks
Comment 13 Scott Wheeler 2006-08-11 16:22:23 UTC
If you happen to have a non-corrupted version of the same file that would be wonderful.  Then I could do a byte-to-byte comparison to see what changed, which would make the corruption easier to spot.
Comment 14 Tory Harmon 2006-08-11 16:43:39 UTC
(FYI: I'm posting this as my usual net alias bleaked, but the people who helped me on IRC know me as 'excitatory' -- I apologize for any confusion)

Yes, I too have experienced this bug.  I move to speed up the confirmation process, since by the looks of it, I've been hit the hardest and I truly do not want this to happen to anyone else.  Easily 1/4 of my music collection is corrupted and unplayable by any player I've tried.  (Like a few others requested, if there's a fix for these files, do pass along the information.)

I can provide an example of a corrupted file on a per-request basis.  (I'll post it publicly if no one thinks there will be any legal repercussions)

I can attest to pretty much everything the original poster mentioned.

A few things I've noticed:

* Corruption always happens per-directory (in my case, entire albums).  In other words, it's not random individual files, always the entire directory.

* So far, I've only noticed this behavior with ogg files.  I'm currently looking for any flac or mp3s that might be affected, but thus far have not found any.

* All tags and filenames are intact.  It appears that the file sizes are the same, but I really cannot make that claim since I have nothing for comparison.
--> I will note though, that even though amarok is able to read the tags, xmms would not.  ..but, kaffeine is able to.


I think I should note that around the same time I noticed this problem (roughly a week ago, Aug 02-05, somewhere in there) I had also re-organized my music.  All of my file names were space-less, with underscores, and for purely aesthetic reasons, I went back to spaces.  So I had amarok rename all of my files through its file organizing feature.  Maybe this is related?

And finally, here is my specific, hopefully not superfluous info:

-Linux 2.6.15-23-686 running KDE 3.5.2, CPU: Intel(R) Pentium(R) 4
-Kubuntu Dapper 6.06
-kdelibs 4:3.5.2
-kdebase 4:3.5.2
-kdemultimedia 4:3.5.2
-taglib 1.4-3
-gcc 4.0.3
-compiled using the amarok-svn script
-Am I missing anything?

I hope this helps.

.:bleaked
Comment 15 Scott Wheeler 2006-08-11 17:36:50 UTC
There's really not any "speeding up of the confirmation process" -- the bug has to be fixed and for that I have to figure out what's going on (also pulling this one over to TagLib for now).

The thing that would be most useful right now is a "before and after" set of files as I mentioned above.  You can email them to me personally if you don't want to post them online.
Comment 16 Jeff Mitchell 2006-08-11 17:40:42 UTC
I've found what is definitely *a* problem, and may be *the* problem.

Look at a hexedit of the corrupted file vs. a non-corrupted file.  I'm going to reference the example.ogg posted above.

Look at byte 0x8e (just after libVorbis I 20030909).  This is a value that should indicate how many comments are coming up before the actual vorbis data.  In a proper file this value corresponds to the number of key=value pairs.  In example.ogg, you can see how that number is 10, yet there are 13 key=value pairs.  The reason for this is that if you look at byte 0x1c8, you'll see the values "cer".  This should *not* be in existence there; it is a clear error.  It seems to correspond to the end of "Juicer" from the DESCRIPTION tag.  After that, the TITLE, TRACKNUMBER, and TRACKTOTAL tags are repeated -- they had just been defined before the "cer".  That is why you see three extra key=value pairs.  The odd thing is that the "cer" seems to come from the DESCRIPTION tag, but in tbetween that tag and TITLE is GENRE, which is *not* repeated.

Scott, unless my code up there seems in error, this looks like a TagLib bug.  I'm assigning it to you and will be glad to help fix it.  I'm going to see if I can unbork this file manually; it depends what part of the header the CRC covers, so it may or may not work.  I'll report back.
Comment 17 Tory Harmon 2006-08-11 17:46:33 UTC
(Sorry, I'm new to the bug system -- at least working with it.. and hence my comment on speeding it up.)

As for some examples,
Here you go:

http://subfluous.net/test/corrupted_example.ogg
http://subfluous.net/test/uncorrupted_example.ogg

Be nice to my box!


Comment 18 Tory Harmon 2006-08-12 03:22:42 UTC
I was incorrect in saying that this bug affects everything in the directory, for I have found an album where only one song is corrupted.  Damn this is evil.

I'm wondering more and more if this is something I did or something, since this is not happening to very many people.. at least ostensibly.
Comment 19 Jeff Mitchell 2006-08-12 06:24:21 UTC
It is evil indeed.  At the moment Scott has been able to replicate it; so far I haven't.  We're still looking at it, and likely we'll hold back the release of Amarok 1.4.2 a bit to see if we can get this sorted out.  Be patient, we're trying, it's just extremely evil and hard to figure out...  :-)
Comment 20 Marcel Dischinger 2006-08-13 12:35:03 UTC
I have a couple of albums where just a few songs are affected

Jeff: It seems that just the header size is wrong set in the header and breaks the whole ogg file. As you seem to know the ogg format, is there a way to fix the ogg file by correcting this value in the hex editor and if, could you give a short description how to do it (where to find the correct byte to change and how to find out to which value)? This would be really great!

Seem we are getting somewhere here, I agree that it looks like a taglib bug
Comment 21 Mark Kretschmann 2006-08-13 12:53:04 UTC
Having knowledge about the nature of the corruption, it might be possible to 
write a small "fixer" script in Ruby that automates this.
Comment 22 Scott Wheeler 2006-08-13 15:15:20 UTC
It probably won't be that easy.  After looking at a quick binary diff of the files it looks like the ogg page count is being redone, so the script would need to know how Ogg works.

At some point this evening I'll spend a little more time seeing if I can figure out what's going on.  Once what's happening is a bit better understood it should be possible to say whether a fixer can be made...
Comment 23 Marcel Dischinger 2006-08-13 17:21:15 UTC
ogg also has error correction code.
I looked closer into this, the corruption seems, that the comment header is finished (given the size field in front of it), then there are some bad bytes (I do not know what they are for) and then I see more valid headers, duplicates of already existing ones.

A fix would be:
Remove the bad bytes, remove the additional headers.
Fix the size of the ogg frame in the ogg header and recalculate the CRC for the ogg frame.

I currently try something similar, but simpler: I manually move the bad bytes to one of the comment fields, thus the off frame stays the same size. I fix all the length fields in the vorbis comment header to match the new situation.
Finally, I need to adjust the CRC. I am stucked here at the moment, as I fail to calculate the correct CRC. I am not sure what's wrong. Maybe using the functions from libogg would fix my problems.
Comment 24 Clayton Spicer 2006-08-13 21:38:27 UTC
Just to say, i've had this happen to lots of mp3s (rather than oggs).. So maybe not an ogg related problem?
Comment 25 Jeff Mitchell 2006-08-13 23:21:48 UTC
Clayton -- corrupted headers, or corrupted MP3 data?  Can you post links to some MP3s, preferably before-and-after, so we can try reproducing it here?
Comment 26 Clayton Spicer 2006-08-14 02:17:51 UTC
don't have any webspace at the moment - if i can find somewhere to put it up tomrrow i will asap.  As far as i can tell it's data corruption - well, the music is distorted/skips/pops et c. Don't know if i can find a before and after but i'll do my best..
Comment 27 Jeff Mitchell 2006-08-14 02:45:18 UTC
Thanks, that would be extremely helpful.  Just for a link, this seems to be related to Bug 131941.
Comment 28 Jeff Mitchell 2006-08-14 02:50:22 UTC
Also, for those with MP3 problems, can you please comment on if you've noticed this happening on files of various bitrates, or specific ranges?
Comment 29 Clayton Spicer 2006-08-14 21:43:13 UTC
Ok guys, a) http://ezil.sf.net/tagtest/ (thank you marc!) 'Old' is pre-atf, from my ipod, put there by amaroK - not sure if any processing that might change the files happens on ipod transfer, but i don't think so - 'New' is post-atf.  Hope this helps.  b) Nope, afraid i've found 192kbpses and 128s affected equally. :(
Comment 30 Jeff Mitchell 2006-08-15 02:57:50 UTC
Clayton -- thanks for the files.  I will be looking at them.  Is this repeatable on your system?  i.e. can you take (a copy of) the old file, tag it, and have it turn into a corrupted one?

You can easily do this (without playing around with rescanning) by using the dcop call
dcop amarok collection newUniqueIdForFile <full path>
(note that you need to use a full path, not a relative one, i.e. /home/jeff/example.mp3, not ./example.mp3)

You'll have to have ATF tag writing enabled (checkbox checked in the config) for it to actually do the assignment, even through the dcop call.
Comment 31 Jeff Mitchell 2006-08-15 02:58:48 UTC
Also, to anyone on this bug report, I am on IRC much of the day and various times at night.  I'd appreciate it if any of you guys are on IRC to drop me a line when I'm on (nick is jefferai), as this will make it easier and faster to work on this.
Comment 32 Jeff Mitchell 2006-08-15 20:15:20 UTC
I've got a task for everyone that has broken *MP3* files.

Take some of the "broken" files, and open them in a hex editor (I recommend khexedit, which is very good).  At the very top you should see ID3, which is the start of the ID3v2 tag, then a bunch of 00 bytes, which are padding.  Then, if it's a broken file, you may see another ID3.

Three of the three broken MP3s I've seen so far contain two subsequent copies of the ID3v2 tag, which needless to say is very odd.  And I have verified that two of the three did not have any ID3v2 tags beforehand.

Can those of you with messed up MP3 files look at, say, ten of your messed up MP3 files, and ten non-messed-up-but-ATF-tagged MP3 files, and tell me

a) How many of each contain two subsequent ID3v2 tags?
b) If you know the answer to this, whether those files had only ID3v1 tags beforehand?

Thanks!
Comment 33 Clayton Spicer 2006-08-15 22:54:07 UTC
well - as far as it seems, for the two i put up for example, the pre-atf copy has no initial ID3 line, and the post-atf has exactly one.  Will examine more later.  Also btw jeff, what server for irc?
Comment 34 Jeff Mitchell 2006-08-16 00:47:29 UTC
With some help from Martin Aumueller, we think we know what's going on.  In fact, it's probably safe to use the ATF tag writing now, due to some changes in SVN a week or two ago, but we'll probably be doing some tweaking in TagLib as well.  We'll keep you posted after some more discussion.

(btw, IRC is freenode)
Comment 35 Scott Wheeler 2006-08-16 03:00:13 UTC
We've sorted this out as an amarok issue.  In some cases files were being written to multiple times simultaneously.  This happened to affect Oggs more since they have to have a lot of data moved around.
Comment 36 Tory Harmon 2006-08-16 03:07:21 UTC
Hooray!

..but is there a fix to gigs of music that many of us lost?

Either way, this is wonderful news, and I would like to thank all who have contributed to fixing this nasty bug.  Your tireless work is much appreciated.
Comment 37 Scott Wheeler 2006-08-16 03:38:41 UTC
Sadly because of the way that it was lost it's not possible.  It's neither systematic in the way that it's been corrupted and there's data that's just not there anymore.

For those that care about the details it's something like this:

Ogg Vorbis files have their tag content at the beginning.  When you update them, you have to make room at the beginning of the files for the tag data.  This involved copying everything else back a bit (since it's not possible to insert into the middle of files).  Amarok, with the beta version of ATF sometimes allowed multiple threads to write to the same file, interleaving their "making room" changes (since they're done in as I recall 2 kb blocks).

Unfortunately that means that in some cases there are missing blocks, and in other places duplicated, rather than something simpler like a corrupted header or a misplaced byte or three.
Comment 38 Ian Monroe 2006-08-16 06:07:54 UTC
And to clarify for other amarok devs, its actually seperate processes (we think amarokcollectionscanner and amarokapp would write at the same time, or if a collectionscanner continued after amarokapp crashed). 

Jeff is looking into possible locking solutions.
Comment 39 Jeff Mitchell 2006-08-16 06:22:12 UTC
For further clarification, there were a few race conditions that could occur between amarokcollectionscanner and something else, where that something else was either another instance of amarokcollectionscanner (if one was running, amarok crashed, and amarok was reopened -- the most likely cause), or certain other behaviors, for instance dragging items to the playlist -- this would be extremely unlikely but could happen.

I had incorrectly assumed taglib safely handled writes to the same file, and the author of taglib had incorrectly thought that file write access behaved the same on UNIX as several other platforms (which automatically allow only one open file handle to be writing at once), so that writes were safe by default.

After much discussion and some research we agreed that this is best handled in the application and not in taglib, since handling it in taglib would require all sorts of special cases for network filesystems and the like that really don't belong in the library.  So I get to fix this, and Scott gets to write a warning :-)

I'm going to reassign this to amarok devs and keep it open so that the code I write to fix it will correlate with this bug, as it's not technically fixed yet.

I'm sorry for everyone whose files have been affected.  Rest assured we do our absolute best to prevent this kind of situation, but bugs will be bugs, and sometimes they're worse than others.  We really appreciate all of you running SVN and our betas to help us find this kind of issue before they affect the user community at large.
Comment 40 Jeff Mitchell 2006-08-16 06:24:01 UTC
*** Bug 131941 has been marked as a duplicate of this bug. ***
Comment 41 Tory Harmon 2006-08-16 06:35:21 UTC
Again, no worries.. as you say, a bug is a bug.

I must then pose the question.. what version is safe to use?  Should I revert back to a 1.4.1 release, or compile a fresh svn?  Does it matter if I have ATF on, or not?  ..or should I just be patient and wait.  (I don't care either way, I just don't want any more files being corrupted.  ..and considering that this is an amarok bug, I worry that the current svn I'm running now is even risky..  am I wrong?)

Comment 42 Ian Monroe 2006-08-16 06:56:39 UTC
Set atf to off in ~/.kde/share/config/amarokrc and go to 1.4.1 or use the newest SVN with atf set to off (though ATF is already safe enough really, but some odd corner cases are possible).
Comment 43 richlv 2006-08-16 11:48:30 UTC
not sure how feasible this is, so i am just adding this here.

amarok startup wrapper currently checks for running amarokapp and simply brings it up (or puts it in tray) if it already exists.

i usually checked that nothing with "amar" in process name is running, especially when amarok crashed - so that seems to be the main reason why i haven't seen a single corruption even though i enabled atf pretty early.

maybe startup wrapper could also check for running collectionscanner - and possibly just die - to reduce the possibility of anything similar happening ?

i sure am adding such a check to my amarok starter script now ;)
Comment 44 Jeff Mitchell 2006-08-16 14:12:47 UTC
Amarok is safe to use -- simply go to the Options, Collection section and uncheck the "Enable ATF tag writing" box.  You won't have any risk of running into this problem without that checked.

Due to changes made about two weeks ago in SVN, you're very unlikely to run into this bug at all now except in a corner case where amarokcollectionscanner is running, amarokapp crashes, and you start up again -- and even then only with certain conditions holding true on startup.

Part of the fix for this will definitely be ensuring that no two collectionscanner processes are running at once, either through a lockfile or through selective killing, or both, along with checks to test for failure and not run a new scanner if that is the case.  We're working on the details, but if we can't get the fixes in for 1.4.2 we'll disable ATF until 1.4.3.  In fact, we may do that anyways because there's a separate bug in TagLib that can cause (metadata-only) corruption of MP3 files containing certain types of frames (does not affect song data however).  This has been fixed in TagLib SVN, and the next release of TagLib is probably not too far away, most likely before Amarok 1.4.3.  Although you can theoretically run into this problem simply by using the tag editor or in-playlist editing to update your tags, the mass-tagging behavior of ATF means you'll likely hit it when otherwise you wouldn't.

So, just ensure that you uncheck the checkbox and you'll be fine.
Comment 45 Clayton Spicer 2006-08-16 21:42:09 UTC
So, although oggs can't be repaired, do you guys think there might be any hope for mp3s? :(
Comment 46 Scott Wheeler 2006-08-16 21:45:54 UTC
Nope, same deal really.  It just happens that MP3s will still play even when
they're pretty heavily corrupted.
Comment 47 Jeff Mitchell 2006-08-17 06:26:32 UTC
Forgot to CCBUG: my commit.  Initial work on fix can be found in revision r573651.  WebSVN is down or I'd link.
Comment 48 Jeff Mitchell 2006-08-17 07:41:17 UTC
SVN commit 573760 by mitchell:

More work on the fix.  The algorithm for safe saving is pretty much there (though it's not used everywhere yet), however attempting to 
stress-test it with scanner+dcop causes scanner to freeze.  Will have to investigate tomorrow.
CCBUG: 131353


 M  +4 -2      app.cpp  
 M  +68 -12    metabundle.cpp  
 M  +4 -0      metabundle.h  
 M  +10 -12    scancontroller.cpp  
 M  +4 -1      scancontroller.h  


--- trunk/extragear/multimedia/amarok/src/app.cpp #573759:573760
@@ -621,7 +621,9 @@
     }
 
     // If just turned ATF on, clear the playlist and force a rescan
-    if ( AmarokConfig::advancedTagFeatures() && !AmarokConfig::aTFFirstTurnedOn() )
+    // comment out for now until more testing is done, this seems to occur repeatedly
+    // even after scans, also seems to be doing incremental scans...odd
+    /*if ( AmarokConfig::advancedTagFeatures() && !AmarokConfig::aTFFirstTurnedOn() )
     {
         amaroK::StatusBar::instance()->longMessageThreadSafe(
                     i18n("ATF tagging was just enabled for the first time.\n"
@@ -641,7 +643,7 @@
                           "had ATF tagging enabled will need to be\n"
                           "rescanned for full functionality."), KDE::StatusBar::Information );
         AmarokConfig::setATFJustTurnedOn( false );
-    }
+    }*/
 
     //if ( !firstTime )
         // Bizarrely and ironically calling this causes crashes for
--- trunk/extragear/multimedia/amarok/src/metabundle.cpp #573759:573760
@@ -19,6 +19,7 @@
 #include "amarokconfig.h"
 #include "debug.h"
 #include "collectiondb.h"
+#include "scancontroller.h"
 #include <kapplication.h>
 #include <kfilemetainfo.h>
 #include <kio/global.h>
@@ -173,6 +174,7 @@
         , m_isValidMedia( true )
         , m_isCompilation( false )
         , m_notCompilation( false )
+        , m_safeToSave( false )
         , m_podcastBundle( 0 )
         , m_lastFmBundle( 0 )
 {
@@ -199,6 +201,7 @@
     , m_isValidMedia( false )
     , m_isCompilation( false )
     , m_notCompilation( false )
+    , m_safeToSave( false )
     , m_podcastBundle( 0 )
     , m_lastFmBundle( 0 )
 {
@@ -247,6 +250,7 @@
         , m_isValidMedia( false )
         , m_isCompilation( false )
         , m_notCompilation( false )
+        , m_safeToSave( false )
         , m_podcastBundle( 0 )
         , m_lastFmBundle( 0 )
 {
@@ -303,6 +307,7 @@
     m_isValidMedia = bundle.m_isValidMedia;
     m_isCompilation = bundle.m_isCompilation;
     m_notCompilation = bundle.m_notCompilation;
+    m_safeToSave = bundle.m_safeToSave;
 
 //    delete m_podcastBundle; why does this crash Amarok? apparently m_podcastBundle isn't always initialized.
     m_podcastBundle = 0;
@@ -1424,7 +1429,7 @@
                 if( strip )
                 {
                         file->ID3v2Tag()->setGenre( file->ID3v2Tag()->genre() );
-                        file->save( TagLib::MPEG::File::AllTags );
+                        scannerSafeSave( file );
                         return true;
                 }
 
@@ -1433,7 +1438,7 @@
                         TagLib::ByteVector( m_uniqueId.ascii(), randSize )
                         ) );
                 file->ID3v2Tag()->setGenre( file->ID3v2Tag()->genre() );
-                file->save( TagLib::MPEG::File::AllTags );
+                scannerSafeSave( file );
                 newID = true;
             }
         }
@@ -1451,12 +1456,12 @@
                         && AmarokConfig::advancedTagFeatures()
                         && TagLib::File::isWritable( file->name() ) )
             {
-                debug() << "removing our id" << endl;
                 file->tag()->removeField( QStringToTString( ourId ) );
             }
+
             if( AmarokConfig::advancedTagFeatures() && strip )
             {
-                file->save();
+                scannerSafeSave( file );
                 return true;
             }
 
@@ -1465,16 +1470,13 @@
                      !file->tag()->fieldListMap()[QStringToTString( ourId )].isEmpty() )
                    || recreate )
             {
-
                 if( AmarokConfig::advancedTagFeatures() && TagLib::File::isWritable( file->name() ) )
                 {
                     m_uniqueId = getRandomStringHelper( randSize );
-                    debug() << "adding uniqueid " << m_uniqueId << endl;
                     file->tag()->addField( QStringToTString( ourId ), QStringToTString( m_uniqueId ) );
-                    file->save();
+                    scannerSafeSave( file );
                     newID = true;
                 }
-                
             }
             else
             {
@@ -1497,7 +1499,7 @@
 
             if( AmarokConfig::advancedTagFeatures() && strip )
             {
-                file->save();
+                scannerSafeSave( file );
                 return true;
             }
             */
@@ -1508,7 +1510,7 @@
                 {
                     m_uniqueId = getRandomStringHelper( randSize );
                     file->xiphComment()->addField( QStringToTString( ourId ), QStringToTString( m_uniqueId ) );
-                    file->save();
+                    scannerSafeSave( file );
                     newID = true;
                 }
                 */
@@ -1530,7 +1532,7 @@
 
             if( AmarokConfig::advancedTagFeatures() && strip )
             {
-                file->save();
+                scannerSafeSave( file );
                 return true;
             }
             */
@@ -1541,7 +1543,7 @@
                 {
                     m_uniqueId = getRandomStringHelper( randSize );
                     file->tag()->addField( QStringToTString( ourId ), QStringToTString( m_uniqueId ) ),
-                    file->save();
+                    scannerSafeSave( file );
                     newID = true;
                 }
                 */
@@ -1558,6 +1560,60 @@
     return (recreate ? recreate && newID : !m_uniqueId.isEmpty() );
 }
 
+void
+MetaBundle::scannerSafeSave( TagLib::File* file )
+{
+    DEBUG_BLOCK
+    if( ( QString( kapp->name() ) == QString( "amarokcollectionscanner" ) ) || !ScanController::instance() )
+    {
+        file->save();
+        return;
+    }
+
+    m_safeToSave = false;
+
+    if( ScanController::instance() ) //yes check again, it can pull out from under us at any time
+    {
+        ScanController::instance()->notifyThisBundle( this );
+        ScanController::instance()->requestPause();
+    }
+    else
+    {
+        debug() << "Could not save tag for " << url().path() << " due to conflict with collectino scanner, please try again." << endl;
+        //TODO: add after string freeze for 1.4.2
+        //amaroK::StatusBar::instance()->longMessage( i18n( "Could not save tag for %1 due to conflict with"
+                                                            //"collection scanner, please try again." ).arg( url().path() ) );
+    }
+
+    int count = 0;
+
+    while( ScanController::instance() && !m_safeToSave && count < 500 ) //time out after five seconds, just in case
+    {
+        usleep( 10000 );
+        count++;
+        if( count % 100 == 0 )
+            debug() << "waitcount is " << count << endl;
+    }
+
+    if( m_safeToSave )
+    {
+        debug() << "Starting tag save" << endl;
+        file->save();
+    }
+    else
+        debug() << "Did not write tag for file " << url().path() << endl;
+
+    ScanController::instance()->notifyThisBundle( 0 );
+    ScanController::instance()->requestUnpause();
+}
+
+void
+MetaBundle::scannerAcknowledged()
+{
+    DEBUG_BLOCK
+    m_safeToSave = true;
+}
+
 bool
 MetaBundle::newUniqueId()
 {
--- trunk/extragear/multimedia/amarok/src/metabundle.h #573759:573760
@@ -284,6 +284,7 @@
     bool setUniqueId( TagLib::FileRef &fileref, bool recreate, bool strip = false );
     bool removeUniqueId();
     bool newUniqueId();
+    void scannerAcknowledged();
 
     void detach(); // for being able to apply QDeepCopy<>
 
@@ -348,6 +349,7 @@
     bool m_isValidMedia: 1;
     bool m_isCompilation: 1;
     bool m_notCompilation: 1;
+    bool m_safeToSave: 1;
 
     PodcastEpisodeBundle *m_podcastBundle;
     LastFm::Bundle *m_lastFmBundle;
@@ -370,6 +372,8 @@
     QString getRandomString( int size );
     QString getRandomStringHelper( int size );
     TagLib::ID3v2::UniqueFileIdentifierFrame *ourMP3UidFrame( TagLib::MPEG::File *file, QString ourId );
+
+    void scannerSafeSave( TagLib::File* file );
 };
 
 /// for your convenience
--- trunk/extragear/multimedia/amarok/src/scancontroller.cpp #573759:573760
@@ -64,7 +64,7 @@
     , m_hasChanged( false )
     , m_source( new QXmlInputSource() )
     , m_reader( new QXmlSimpleReader() )
-    , m_dcopConnected( false )
+    , m_waitingBundle( 0 )
 {
     DEBUG_BLOCK
 
@@ -289,11 +289,6 @@
 ScanController::requestPause()
 {
     DEBUG_BLOCK
-    if( !m_dcopConnected )
-    {
-        debug() << "Failed to request pause, dcop not connected" << endl;
-        return;
-    }
     debug() << "Attempting to pause the collection scanner..." << endl;
     DCOPRef dcopRef( "amarokcollectionscanner", "scanner" );
     dcopRef.call( "pause" );
@@ -303,11 +298,6 @@
 ScanController::requestUnpause()
 {
     DEBUG_BLOCK
-    if( !m_dcopConnected )
-    {
-        debug() << "Failed to request unpause, dcop not connected" << endl;
-        return;
-    }
     debug() << "Attempting to unpause the collection scanner..." << endl;
     DCOPRef dcopRef( "amarokcollectionscanner", "scanner" );
     dcopRef.call( "unpause" );
@@ -317,9 +307,17 @@
 ScanController::requestAcknowledged()
 {
     DEBUG_BLOCK
-    emit scannerAcknowledged();
+    if( m_waitingBundle )
+        m_waitingBundle->scannerAcknowledged();
 }
 
+void
+ScanController::notifyThisBundle( MetaBundle* bundle )
+{
+    DEBUG_BLOCK
+    m_waitingBundle = bundle;
+}
+
 bool
 ScanController::startElement( const QString&, const QString& localName, const QString&, const QXmlAttributes& attrs )
 {
--- trunk/extragear/multimedia/amarok/src/scancontroller.h #573759:573760
@@ -75,6 +75,8 @@
         bool isIncremental() const { return m_incremental; }
         bool hasChanged() const { return m_hasChanged; }
 
+        void notifyThisBundle( MetaBundle* bundle );
+
     signals:
         void scannerAcknowledged();
 
@@ -111,7 +113,8 @@
         QStringList m_crashedFiles;
 
         static ScanController* currController;
-        bool m_dcopConnected;
+
+        MetaBundle* m_waitingBundle;
 };
 
 
Comment 49 Jeff Mitchell 2006-08-17 20:07:45 UTC
SVN commit 573985 by mitchell:

Don't allow DCOP to modify UIDs if scanner is running, too much of a pain because of recursive dcop calls, would have to make
MetaBundle a QObject, thread the writes, yadda yadda.  Have it be a QString to report success or failure or to try again after
the scan.
This should fix the freeze.  Now to convert other tag-writing operations to use scannerSafeSave and see if it works as expected.

CCBUG: 131353



 M  +15 -4     amarokcore/amarokdcophandler.cpp  
 M  +2 -2      amarokcore/amarokdcophandler.h  
 M  +2 -2      amarokcore/amarokdcopiface.h  
 M  +18 -9     collectiondb.cpp  
 M  +2 -2      collectiondb.h  
 M  +4 -3      metabundle.cpp  
 M  +1 -0      metabundle.h  


--- trunk/extragear/multimedia/amarok/src/amarokcore/amarokdcophandler.cpp #573984:573985
@@ -836,6 +836,7 @@
 
     void DcopCollectionHandler::scannerAcknowledged()
     {
+        DEBUG_BLOCK
         if( ScanController::instance() )
             ScanController::instance()->requestAcknowledged();
         else
@@ -847,14 +848,24 @@
         CollectionDB::instance()->disableAutoScoring( disable );
     }
 
-    void DcopCollectionHandler::newUniqueIdForFile( const QString &path )
+    QString DcopCollectionHandler::newUniqueIdForFile( const QString &path )
     {
-        CollectionDB::instance()->newUniqueIdForFile( path );
+        if( ScanController::instance() )
+            return "Cannot run this DCOP call during a collection scan";
+        if( CollectionDB::instance()->newUniqueIdForFile( path ) )
+            return "success";
+        else
+            return "failed"; 
     }
 
-    void DcopCollectionHandler::removeUniqueIdFromFile( const QString &path )
+    QString DcopCollectionHandler::removeUniqueIdFromFile( const QString &path )
     {
-        CollectionDB::instance()->removeUniqueIdFromFile( path );
+        if( ScanController::instance() )
+            return "Cannot run this DCOP call during a collection scan";
+        if( CollectionDB::instance()->removeUniqueIdFromFile( path ) )
+            return "success";
+        else
+            return "failed"; 
     }
 
 /////////////////////////////////////////////////////////////////////////////////////
--- trunk/extragear/multimedia/amarok/src/amarokcore/amarokdcophandler.h #573984:573985
@@ -191,8 +191,8 @@
       virtual void scanCollection();
       virtual void scanCollectionChanges();
       virtual void disableAutoScoring( bool disable );
-      virtual void newUniqueIdForFile( const QString &path );
-      virtual void removeUniqueIdFromFile( const QString &path );
+      virtual QString newUniqueIdForFile( const QString &path );
+      virtual QString removeUniqueIdFromFile( const QString &path );
       virtual void scanUnpause();
       virtual void scanPause();
       virtual void scannerAcknowledged();
--- trunk/extragear/multimedia/amarok/src/amarokcore/amarokdcopiface.h #573984:573985
@@ -186,8 +186,8 @@
    virtual void scanCollection() = 0;                      ///< Scan the collection.
    virtual void scanCollectionChanges() = 0;               ///< Scan the collection for changes only.
    virtual void disableAutoScoring( bool disable ) = 0;    ///< Disable updating track stats on track change.
-   virtual void newUniqueIdForFile( const QString &path ) = 0; ///< Assign a new unique ID to a file.
-   virtual void removeUniqueIdFromFile( const QString &path ) = 0; ///< Remove our unique ID from a given file.
+   virtual QString newUniqueIdForFile( const QString &path ) = 0; ///< Assign a new unique ID to a file.
+   virtual QString removeUniqueIdFromFile( const QString &path ) = 0; ///< Remove our unique ID from a given file.
    virtual void scanPause() = 0;                           ///< Pause collection scanner.
    virtual void scanUnpause() = 0;                         ///< Unpause collection scanner.
    virtual void scannerAcknowledged() = 0;                 ///< Called by the scanner to acknowledge the request.
--- trunk/extragear/multimedia/amarok/src/collectiondb.cpp #573984:573985
@@ -3169,7 +3169,7 @@
     }
 }
 
-void
+bool
 CollectionDB::newUniqueIdForFile( const QString &path )
 {
     DEBUG_BLOCK
@@ -3178,7 +3178,7 @@
     if( !QFile::exists( path ) )
     {
         debug() << "QFile::exists returned false for " << path << endl;
-        return;
+        return false;
     }
 
     // Clean it.
@@ -3186,12 +3186,16 @@
 
     MetaBundle bundle( url );
     if( bundle.newUniqueId() )
+    {
         doATFStuff( &bundle, false );
-    else
-        debug() << "Could not set new unique id" << endl;
+        return true;
+    }
+
+    debug() << "Could not set new unique id" << endl;
+    return false;
 }
 
-void
+bool
 CollectionDB::removeUniqueIdFromFile( const QString &path )
 {
     DEBUG_BLOCK
@@ -3200,14 +3204,20 @@
     if( !QFile::exists( path ) )
     {
         debug() << "QFile::exists returned false for " << path << endl;
-        return;
+        return false;
     }
 
     url.cleanPath();
 
     MetaBundle bundle( url );
-    bundle.removeUniqueId();
-    doATFStuff( &bundle, false );
+    if( bundle.removeUniqueId() )
+    {
+        doATFStuff( &bundle, false );
+        return true;
+    }
+
+    debug() << "Cound not remove unique id" << endl;
+    return false;
 }
 
 QString
@@ -3236,7 +3246,6 @@
     QString currurl = escapeString( mpm->getRelativePath( currdeviceid, url.path() ) );
 
     bool scanning = ThreadWeaver::instance()->isJobPending( "CollectionScanner" );
-    
     QStringList uid = query( QString(
             "SELECT uniqueid "
             "FROM uniqueid%1 "
--- trunk/extragear/multimedia/amarok/src/collectiondb.h #573984:573985
@@ -313,8 +313,8 @@
         //song methods
         bool addSong( MetaBundle* bundle, const bool incremental = false );
         void doATFStuff( MetaBundle *bundle, const bool tempTables = true );
-        void newUniqueIdForFile( const QString &path );
-        void removeUniqueIdFromFile( const QString &path );
+        bool newUniqueIdForFile( const QString &path );
+        bool removeUniqueIdFromFile( const QString &path );
         QString urlFromUniqueId( const QString &id );
         QString uniqueIdFromUrl( const KURL &url );
 
--- trunk/extragear/multimedia/amarok/src/metabundle.cpp #573984:573985
@@ -1587,11 +1587,12 @@
 
     int count = 0;
 
-    while( ScanController::instance() && !m_safeToSave && count < 500 ) //time out after five seconds, just in case
+    while( ScanController::instance() && !m_safeToSave && count < 50 ) //time out after five seconds, just in case
     {
-        usleep( 10000 );
+        kapp->processEvents( 100 );
+        usleep( 100000 );
         count++;
-        if( count % 100 == 0 )
+        if( count % 10 == 0 )
             debug() << "waitcount is " << count << endl;
     }
 
--- trunk/extragear/multimedia/amarok/src/metabundle.h #573984:573985
@@ -285,6 +285,7 @@
     bool removeUniqueId();
     bool newUniqueId();
     void scannerAcknowledged();
+    bool safeToSave() { return m_safeToSave; }
 
     void detach(); // for being able to apply QDeepCopy<>
 
Comment 50 Jeff Mitchell 2006-08-17 22:25:32 UTC
SVN commit 574021 by mitchell:

This mostly completes the safety patch for ATF.  The following conditions are handled:
Collection Scanner + Playlist tag editing
Collection Scanner + Tag Dialog tag editing
Collection Scanner + DCOP UFID changing/stripping

What remains are the various Collection Scanner + Collection Scanner scenarios.

CCBUG: 131353


 M  +2 -2      collectiondb.cpp  
 M  +2 -2      collectionscanner/collectionscanner.cpp  
 M  +24 -13    metabundle.cpp  
 M  +1 -1      metabundle.h  
 M  +13 -4     scancontroller.cpp  
 M  +5 -2      scancontroller.h  


--- trunk/extragear/multimedia/amarok/src/collectiondb.cpp #574020:574021
@@ -3223,7 +3223,7 @@
 QString
 CollectionDB::urlFromUniqueId( const QString &id )
 {
-    bool scanning = ThreadWeaver::instance()->isJobPending( "CollectionScanner" );
+    bool scanning = ( ScanController::instance() && !ScanController::instance()->isPaused() );
     QStringList urls = query( QString(
             "SELECT deviceid, url "
             "FROM uniqueid%1 "
@@ -3245,7 +3245,7 @@
     int currdeviceid = mpm->getIdForUrl( url.path() );
     QString currurl = escapeString( mpm->getRelativePath( currdeviceid, url.path() ) );
 
-    bool scanning = ThreadWeaver::instance()->isJobPending( "CollectionScanner" );
+    bool scanning = ( ScanController::instance() && !ScanController::instance()->isPaused() );
     QStringList uid = query( QString(
             "SELECT uniqueid "
             "FROM uniqueid%1 "
--- trunk/extragear/multimedia/amarok/src/collectionscanner/collectionscanner.cpp #574020:574021
@@ -353,13 +353,13 @@
             kapp->processEvents();  // let DCOP through!
             if( m_pause )
             {
-                dcopRef.call( "scannerAcknowledged" );
+                dcopRef.send( "scannerAcknowledged" );
                 while( m_pause )
                 {
                     sleep( 1 );
                     kapp->processEvents();
                 }
-                dcopRef.call( "scannerAcknowledged" );
+                dcopRef.send( "scannerAcknowledged" );
             }
         }
 
--- trunk/extragear/multimedia/amarok/src/metabundle.cpp #574020:574021
@@ -1304,7 +1304,7 @@
                 if ( compilation() != CompilationUnknown )
                     setExtendedTag( f.file(), compilationTag, QString::number( compilation() ) );
             }
-            return f.save();
+            return scannerSafeSave( f.file() );
         }
     }
     return false;
@@ -1560,14 +1560,15 @@
     return (recreate ? recreate && newID : !m_uniqueId.isEmpty() );
 }
 
-void
+bool
 MetaBundle::scannerSafeSave( TagLib::File* file )
 {
     DEBUG_BLOCK
-    if( ( QString( kapp->name() ) == QString( "amarokcollectionscanner" ) ) || !ScanController::instance() )
+    if( ( QString( kapp->name() ) == QString( "amarokcollectionscanner" ) ) //default
+            || !ScanController::instance() //no scan, we're good
+            || !AmarokConfig::advancedTagFeatures() ) //if no ATF, we're not writing to files with scanner
     {
-        file->save();
-        return;
+        return file->save();
     }
 
     m_safeToSave = false;
@@ -1575,18 +1576,22 @@
     if( ScanController::instance() ) //yes check again, it can pull out from under us at any time
     {
         ScanController::instance()->notifyThisBundle( this );
-        ScanController::instance()->requestPause();
+        if( !ScanController::instance()->requestPause() )
+        {
+            debug() << "DCOP call to pause scanner failed, aborting save" << endl;
+            return false;
+        }
     }
-    else
+    else //scanner seems to have exited in the interim
     {
-        debug() << "Could not save tag for " << url().path() << " due to conflict with collectino scanner, please try again." << endl;
-        //TODO: add after string freeze for 1.4.2
-        //amaroK::StatusBar::instance()->longMessage( i18n( "Could not save tag for %1 due to conflict with"
-                                                            //"collection scanner, please try again." ).arg( url().path() ) );
+        return file->save();
     }
 
     int count = 0;
+    bool result = false;
 
+    debug() << "entering loop to wait on scanner" << endl;
+
     while( ScanController::instance() && !m_safeToSave && count < 50 ) //time out after five seconds, just in case
     {
         kapp->processEvents( 100 );
@@ -1599,13 +1604,19 @@
     if( m_safeToSave )
     {
         debug() << "Starting tag save" << endl;
-        file->save();
+        result = file->save();
+        debug() << "done, result is " << (result?"success":"failure") << endl;
     }
     else
         debug() << "Did not write tag for file " << url().path() << endl;
 
     ScanController::instance()->notifyThisBundle( 0 );
-    ScanController::instance()->requestUnpause();
+    if( !ScanController::instance()->requestUnpause() )
+    {
+        debug() << "DCOP call to unpause scanner failed, it may be hung" << endl;
+    }
+
+    return result;
 }
 
 void
--- trunk/extragear/multimedia/amarok/src/metabundle.h #574020:574021
@@ -374,7 +374,7 @@
     QString getRandomStringHelper( int size );
     TagLib::ID3v2::UniqueFileIdentifierFrame *ourMP3UidFrame( TagLib::MPEG::File *file, QString ourId );
 
-    void scannerSafeSave( TagLib::File* file );
+    bool scannerSafeSave( TagLib::File* file );
 };
 
 /// for your convenience
--- trunk/extragear/multimedia/amarok/src/scancontroller.cpp #574020:574021
@@ -65,6 +65,8 @@
     , m_source( new QXmlInputSource() )
     , m_reader( new QXmlSimpleReader() )
     , m_waitingBundle( 0 )
+    , m_isPaused( false )
+    , m_lastCommandPaused( false )
 {
     DEBUG_BLOCK
 
@@ -285,22 +287,24 @@
     m_dataMutex.unlock();
 }
 
-void
+bool
 ScanController::requestPause()
 {
     DEBUG_BLOCK
     debug() << "Attempting to pause the collection scanner..." << endl;
     DCOPRef dcopRef( "amarokcollectionscanner", "scanner" );
-    dcopRef.call( "pause" );
+    m_lastCommandPaused = true;
+    return dcopRef.send( "pause" );
 }
 
-void
+bool
 ScanController::requestUnpause()
 {
     DEBUG_BLOCK
     debug() << "Attempting to unpause the collection scanner..." << endl;
     DCOPRef dcopRef( "amarokcollectionscanner", "scanner" );
-    dcopRef.call( "unpause" );
+    m_lastCommandPaused = false;
+    return dcopRef.send( "unpause" );
 }
 
 void
@@ -309,6 +313,10 @@
     DEBUG_BLOCK
     if( m_waitingBundle )
         m_waitingBundle->scannerAcknowledged();
+    if( m_lastCommandPaused )
+        m_isPaused = true;
+    else
+        m_isPaused = false;
 }
 
 void
@@ -316,6 +324,7 @@
 {
     DEBUG_BLOCK
     m_waitingBundle = bundle;
+    debug() << "will notify " << m_waitingBundle << endl;
 }
 
 bool
--- trunk/extragear/multimedia/amarok/src/scancontroller.h #574020:574021
@@ -76,13 +76,14 @@
         bool hasChanged() const { return m_hasChanged; }
 
         void notifyThisBundle( MetaBundle* bundle );
+        bool isPaused() { return m_isPaused; }
 
     signals:
         void scannerAcknowledged();
 
     public slots:
-        void requestPause();
-        void requestUnpause();
+        bool requestPause();
+        bool requestUnpause();
         void requestAcknowledged();
 
     private slots:
@@ -115,6 +116,8 @@
         static ScanController* currController;
 
         MetaBundle* m_waitingBundle;
+        bool m_lastCommandPaused;
+        bool m_isPaused;
 };
 
 
Comment 51 Jeff Mitchell 2006-08-21 14:37:37 UTC
Folks,

I'm almost done with this.  I do need some people to help me test if possible...I'll be transcoding a bunch of files to oggs since that used to get a much higher error rate than MP3s, but the more the merrier.  If you have space on your hard drive to create a copy of some or all of your files, and can do some testing in a before-and-after fashion, i.e. do some testing now, then after I commit another fix (once 1.4.2 is out the door) do some testing on that revision, and possibly go back a bit in SVN and do some testing on an earlier revision, it would be most helpful.

Please let me know.

Thanks,
Jeff
Comment 52 richlv 2006-08-21 14:47:14 UTC
maybe you could give some scenario that should have corrupted files before ?
like, a script that rewrites tags constantly through dcop & running a collection scanner or something...
Comment 53 der Graph 2006-08-21 18:25:55 UTC
... or maybe you could even write a small app that tries writing to the files multiple times, and maybe even checks them for corruptions (calls ogginfo). I.e. if I had an application that tries to enforce the error, I'd gladly make copies of some of my OGGs and transcode as much as my space allows. But I fear I don't have the time to prepare a set of files that may be corrupted, try to get Amarok to mess them up and then check all those files for errors.
Comment 54 Jeff Mitchell 2006-08-21 18:43:04 UTC
I do know, from tests with Scott Wheeler, that a purely taglib-based app that forks off several processes that each try writing a tag and calling save on a file will corrupt it.  I also know some conditions under which Amarok could show the same behavior -- these conditions I will be testing out myself.  What I don't know are all the conditions under which the people on this bug had their files corrupted.  I believe I've found all of those cases, and fixed all but two (explained in a moment*), but as I repeatedly scanned+AFT-tagged my collection multiple times (I did this by calling dcop to strip tags, then rescanning) and never got any corruption to begin with, it's hard for me to really know if no one else has issues.

I will be testing as aggresively as I can, however, the more people that test the better.


*The two conditions are that it does not yet detect if the collectionscanner is already running, which will be fixed soon, and it can not handle the situation where multiple people, all with ATF tag writing enabled, all try to scan a collection without tags.  This can't be fixed for various reasons, chief of which is that there are no locking semantics that work across Samba or (often) NFS.  Please check the Information link in the Options next to the ATF tag writing enable checkbox for more info, once it's added back to the GUI in SVN.
Comment 55 Jeff Mitchell 2006-08-21 22:23:12 UTC
As a note:
SVN commit 575648, which went directly after the tagging of 1.4.2, should finish (pending misc. bugs) the codework to fix this problem.  Note the multi-user clause in the comment above.

I'm not going to mark the bug as resolved yet.  I'll work on defining some tests that people can run; someone was going to post a tool to the mailing list that should automate the process of checking for corrupted oggs/mp3s.
Comment 56 Krzysztof Lichota 2006-08-22 11:21:51 UTC
I am afraid these fixes will not solve all possible cases, as there is always a possibility that some other process is changing the file in the middle.

I would like to propose different solution, which should handle all cases, even if the file is changed by some other process. It is based on lock-free structures used in concurrent programming (using compare-exchange machine instruction). The idea is as follows: copy structure, modify it on side, and then, if the contents hasn't changed in the meantime, atomically replace it.

With respect to changing files on disk, it would look like that:
1. Copy ogg file to some temporary file (with different extension than .ogg) in the same directory.
2. Compute checksum of temporary file (for example Sha-1 or MD5).
3. Check correctness of temporary file (some process might be modifying the original file while we are copying it). If file is not correct - go back to step 1.
4. Modify tags in temporary file.
5. Rename original file to some temporary name (no copying, just rename, so it must be in the same directory). This name should be distinguishable as if the replacing fails for some reason (like application termination), it must be identified we have been in the middle of transaction and change must be rolled back.
6. Check if checksum of renamed file matches checksum computed in step 2. If not, rename it back and go to step 1. It would be useful here to check if this file is not opened by some other process at that time.
7. Rename temporary file to original file name (again, it is in the same directory, so it should be atomic).
8. Remove renamed file.

Steps 5-8 constitute a transaction. If any of the steps fails, all changes must be rolled back (if failure happens in steps 5-7) or transaction must be finished (if failure happens between steps 7 and 8). 
Failure can be failed operation (reading file, renaming file, etc.) or restart of Amarok (due to application crash, computer restart, power outage, etc.).

Notes: 
1. I assume noone will try to modify the temporary file when we are modifying it. Hopefully no tool will modify file without .ogg extension, even if it notices OGG header. 
2. If checking for open file descriptors in step 6 is not implemented, there is a possibility that modifications done by the other tool accessing the file will be lost and this tool will not notice it. However, the file will be still in consistent state (kindof like journalling filesystem).

HTH
Comment 57 Jeff Mitchell 2006-08-22 15:04:44 UTC
First up, I wanted to give a bit of news to those on the list -- I found the reason that some of you noticed ATF tagging without having turned the option on.  It was due to a typo coinciding with the semantics of C++ boolean operators and function names.  Essentially, the boolean check to see if ATF was turned on was accidentally referencing the check function itself, instead of the check function's return value.  This was not caught by the compiler because in C++ functions themselves can be treated as pointers; so instead of returning false because ATF was off, it returned true because the function (and therefore a pointer to it) did in fact exist.  This error was in SVN for about three days.

Regarding the previous post, there are some good ideas there, but I don't think we can do it.

Step 3 says to check the correctness of the temporary file.  We don't have a way to do this; there are certain utilities that can do this, but we can't rely on them all being present on the user's system (in fact, they generally aren't).

Another reason we've rejected various schemes we thought of is also speed.  This would be an extraordinarily slow operation.  Scanning for many people already takes hours because of collections stored on network storage (NFS and Samba).  Copying files, checksumming them, checksumming again would be extremely expensive.

I will talk with other developers and see what they have to say; but I'm inclined to think that the answer will be to use the logic that has currently been implemented in SVN, which will handle the normal single-user case -- and to severely warn users to be careful in multi-use environments.

I'll keep this updated.
Comment 58 Jeff Mitchell 2006-08-22 16:09:45 UTC
I wanted to clarify something...

I think that doing it in the atomic lock-free fashion is a good thing and I'd like to do it.  I just have to reach a consensus with other developers, some of whom think that the initial scan with ATF writing is already *way* too long.  So the consensus will have to be, do we implement this and just warn users that it will take a very long time, or do we warn them to be careful about multiple users and leave things as they currently are in SVN, which should be safe for the single-user situation.
Comment 59 Martin Aumueller 2006-08-22 16:29:09 UTC
Writing unique id tags does not have to be a part of Amarok (perhaps it even shouldn't). This could be done by an external application/script, and this application could take weeks to do it. To make it more comfortable for the user, Amarok could start this application.
Comment 60 Jeff Mitchell 2006-08-22 16:59:27 UTC
It is a part of an external application -- the amarokcollectionscanner :-|  Anyways, remember that it's only the initial scan that would take a long time...once files are tagged, scanning them is as fast as always.

Unfortunately, simply moving it to a separate app doesn't fix things...it'd still be prone to corruption if not handled appropriately.
Comment 61 Mark Kretschmann 2006-08-22 17:32:20 UTC
On Tuesday 22 August 2006 16:09, Jeff Mitchell wrote:
> I think that doing it in the atomic lock-free fashion is a good thing and
> I'd like to do it.  I just have to reach a consensus with other developers,
> some of whom think that the initial scan with ATF writing is already *way*
> too long.  So the consensus will have to be, do we implement this and just
> warn users that it will take a very long time, or do we warn them to be
> careful about multiple users and leave things as they currently are in SVN,
> which should be safe for the single-user situation.


At this point you might as well go back to calculating hashes. Takes just as 
long, but is completely safe.
Comment 62 Scott Wheeler 2006-08-22 17:49:10 UTC
Actually calculating hashes shouldn't take near as long as writing to files without ID3v2 tags or Ogg files.  You can also write a hash algorithm that's pretty darn close that only reads a small portion of the file and even if you read the whole thing that's still a lot faster than writing.

On the other hand, it will fail on files that are modified by other applications.
Comment 63 Jeff Mitchell 2006-08-22 17:58:33 UTC
That's true, they're read-only operations.  Also, the second calculation would (possibly) be somewhat cached.

Calculating hashes locally on files and storing them in the local database means in a multi-user environment (where we currently have issues) everyone needs to calculate the hashes...overall taking much more time.  Writing our own UIDs is much more portable.

Worse, if you use hashes, if a file is moved you have to get a hash of the file again to see if it's the same file.  And if the metadata has changed in the interim...

You could try only hashing on the song data, but that's a pain.  Maybe TagLib could provide an offset of a file where the actual song data starts, but it's still a bad solution.  I think hashing twice at the beginning to guarantee safety followed by a write which will speed up all relevant operations in the future, and make it safe and fast in a multi-user environment, is a good idea.
Comment 64 Jeff Mitchell 2006-08-22 22:50:28 UTC
Okay, so I've been working on implementing the algorithm posted above.

It doesn't seem as inefficient as I and others might have feared.  Point (3) is a concern, since we don't have a good way to check if the data is valid; however, this is bypassed by defaulting to the original if the checksums don't match.  Presumably whatever is writing over the original file knows what it's doing and will leave it in a fine state; we'll just have to write our tag on the next scan instead.

So far testing has been good.  I created a 300MB MP3 file using mp3wrap so that I'd have enough time to make changes to the file manually.  Making changes to the original file while the copy was being made resulted in a truncated copy with a bad MD5 sum, so the original was deferred to.  Renaming/deleting the original while the copy was being made meant that step #5 failed, and it aborts, deleting the (possibly incomplete) temporary copy.  Changing a single bit in the original file while the temporary file's checksum was being calculated caused the second checksum calculation to not match, and the original was deferred to, with the temporary file being deleted.

In short, everything is looking good.  The one thing I have yet to do is perform the actual taglib save on the temporary file, because the taglib fileref I have is invalidated as a result of the new name.  Probably I will have to generate a second fileref, and manually copy over attributes from the original file (with the metadata modified in that taglib ref's memory) to the copy, then save the copy.  Even if another taglib process is writing to the original during that time, it will be alright because of the md5 checks.

Scott -- any input on what might be a good way to do this?  Or is there some "under the hood" way I can tell the existing file reference to keep the metadata we've changed in its memory, but actually work on a different file?
Comment 65 Jeff Mitchell 2006-08-22 23:11:07 UTC
For interested parties, I didn't want to post the whole commit since it's large -- the others were large enough.  This code mostly implements the above algorithm, except for the actual metadata save:

http://websvn.kde.org/trunk/extragear/multimedia/amarok/src/metabundle.cpp?rev=576023&view=rev
Comment 66 Jeff Mitchell 2006-08-24 21:06:52 UTC
For interested parties, a lot of work was committed in revisions 576759 and 576760, including full implementation of the transactional save algorithm (some KIO slot errors to figure out, still, but it mostly works).

http://websvn.kde.org/trunk/extragear/multimedia/amarok/src/?view=rev&rev=576759
http://websvn.kde.org/trunk/extragear/multimedia/amarok/src/?view=rev&rev=576760
Comment 67 Krzysztof Lichota 2006-08-24 22:11:44 UTC
I have only skimmed through the algorithm, but I think one more thing is missing. To be able to revert renaming of original file in case of crash/power off, you should write to some persistent file storage (for example KConfig config file) that you are about to do rename. After rename you can remove the entry. And at Amarok start you should read this file and finish any operations which were in progress. This is some kind of journal, like in database transactions :)
Comment 68 Jeff Mitchell 2006-08-24 23:45:39 UTC
That's a good thought.  I can implement the KConfig saving (will do after removing the signal/slot stuff which won't work for what I'm doing).  Worse comes to worse, the temporary files have "amaroktemp" in the filename, and the original files have "amarokoriginal" in the filename, making manual recovery fairly simple (after all, it's the original filename plus extra information).

I may also add some amount of randomness to the filename, since theoretically, two computers named "localhost" that happen to run amarok with the samd PID could present a problem.  Not likely, but then neither is getting struck by lightning, and it happens.

Finishing operations that were in progress will not be possible, unfortunately.  At least not for now.  However, most operations shouldn't be too hard to resume -- either start the scanning over, or re-tag what were in the middle of tagging.

I would appreciate if you can keep your eye on this and let me know if you think that the transactional algorithm is properly implemented.  Safety is really a top priority here.

Thanks for your help!
Comment 69 Milosz Derezynski 2006-08-25 08:18:46 UTC
BTW what is with UFID frames and MusicBrainz? Do you never write the MB Track ID into ID3v2-tag containing files? According to wiki.musicbrainz.org/MusicBrainzTag, it is exactly the UFID frame that should be used for that, so i'm kinda not getting this (or are you not really fully using MusicBrainz metadata, but use it only as a "better freedb" to obtain the casual metadata like artist, album, title, year, etc?).

I've implemented something i named FFT (Fingerprint File Tracking) in BMP2, which creates a hash from the bytevector of the file's tags which currently only works with the tag types for which taglib can render a bytevector for (BTW Hey Scott, how come you didn't think of this :P):

http://bmpx.beep-media-player.org/site/FFT
Comment 70 Milosz Derezynski 2006-08-25 08:19:31 UTC
Here's a clickable link to the MB wiki page:

http://wiki.musicbrainz.org/MusicBrainzTag
Comment 71 Milosz Derezynski 2006-08-25 08:24:25 UTC
Yeah also, nevermind, i lost my mind and forgot an UFID consist of owner URL+identifier so you can have both, sorry
Comment 72 Jeff Mitchell 2006-08-25 14:24:49 UTC
Heh, I'm not sure I followed all that, but MusicBrainz uses the UFID tag to allow it to figure out what a file is, globally.  Our use is a little different...we're not caring about whether your MP3 of X by artist Y is the same file as your friend's MP3 of X by artist Y.  We're more interested in where your file's going.  You could use the MB tags for that purpose I guess, but it'd be somewhat awkward, and would require everything to be MB-tagged, and could pose a problem if you have two copies of a file with different meta-data and you want it that way (for some reason).
Comment 73 Jeff Mitchell 2006-08-25 16:40:05 UTC
Krzyzstof, KConfig isn't thread-safe, so we can't store records in there.  AmarokConfig could be protected by a mutex, but since the collection scanner is a separate process, it won't help.  So we won't be able to have that kind of record; however, if something messes up after the original file has been renamed, a quick search for amarokoriginal in the filename will reveal the file...so in the offchance this happens, which should not be often, it shouldn't be hard to recover.
Comment 74 Jeff Mitchell 2006-08-25 22:16:47 UTC
For interested parties, the transaction code is complete (revision below).  It's not yet in the gui, but that will change once I update documentation.  I've stripped tags from, and written tags to several thousand of my files.  I've also done testing where I modify single bits of files after the original's been copied, and watched it gracefully fail as it's designed to do.  If any of you would like to start testing, I'd appreciate it.  You can turn on ATF by inserting the setting

AdvancedTagFeatures=true

in the [Collection] section of ~/.kde/share/config/amarokrc.

http://websvn.kde.org/trunk/extragear/multimedia/amarok/src/?view=rev&rev=577192
Comment 75 Milosz Derezynski 2006-08-26 00:30:07 UTC
Jeff:

(quote)"You could use the MB tags for that purpose I guess, but it'd be somewhat awkward, and would require everything to be MB-tagged, and could pose a problem if you have two copies of a file with different meta-data and you want it that way (for some reason)."

OK well for one i made a mistake and missed the point that you can have several UFID frames as long as the owner is a different one.

Secondly, what you are mentioning (please don't take it personal, a lot of players do that) is what i'm calling MB-abuse. If you have 2 files that were tagged _by_ MusicBrainz and contain MusicBrainz tags (as described on the wiki page i linked to earlier, including their Xiphcomment names, or ID3v2 frames, etc), then you can not with validity alter these tags because in a manner of speaking this metadata that comes from MB is "canon": It is unique. If it should be ever altered, then on the musicbrainz server, and you'd have to update in that case, but altering _MusicBrainz Tags_ on *your own* equals to something like making modifications to some software locally and then being out of sync with the original version as provided upstream or something. So in this sense, you can not have "two files that have been tagged with MusicBrainz but have different metadata" (you can have different "casual" metadata, but not different with respect to the names as given for the additional MB tags).

A lot of players (and from what i figure Amarok too, i'm not sure?) are using MB as a better freedb to acquire metadata to fill in the casual tags, but they do not take the MB metadata for what it is: An additional set of metadata to be used along side the casual set of metadata.

All this together, it would be perfectly valid to track a file by the MB Track Id, except that you could possibly have duplicate _files_ (as in, they are the same file, but in different paths), in which case this would fail, but otherwise i think it's perfectly valid.
Comment 76 Jeff Mitchell 2006-08-26 01:11:12 UTC
I don't understand what "abuse" you're talking about.  We don't touch the MB tags.  At all.  Ever.  And if you're suggesting that if a file has been tagged by MusicBrainz then you should never, ever alter its non-MB metadata, that's just stupid.

As for tracking by MB track ID, sure you could do that, but that requires dependencies on client libraries that a lot of people don't have and/or don't want.  We've used libtunepimp in the past, which has been the cause of many problems (in fact, Gentoo recently hard masked it due to security problems and its API being too unstable).  It also dumps several tags into a file, not just one (AFAICR, as I have very few files with MB tags and haven't looked at those tags in a while).  Anyways, it wouldn't matter as far as this bug report was concerned, because if we were calling MB libraries to tag a file instead, we'd still have the possibility of these files getting tagged by two processes at the same time.

If you have something relevant to this bug, then by all means keep posting, but otherwise, let's move the discussion of MB tags elsewhere.
Comment 77 Milosz Derezynski 2006-08-26 02:19:20 UTC
Jeff:

As for libtunepimp, yes i don't use that either, it's problematic in various ways. I wrote a gstreamer element,which is still in working state, which can calculate a file PUID as well, you can find it here: 

http://projects.beep-media-player.org/index.php/Main/GstPUID

The elements still needs to be worked on, i am aware of that but in the end the benefit is that it's possible to calculate the PUID for any file type gstreamer can decode/demux (including WMA and M4A (aac in quicktime)), and not limited to i think only mp3, vorbis and mpc which libTP does).

Next thing:

What i mean with "MB abuse" is when you for example query for a particular track or a whole release or in whatever way you do that (i have a few C++ classes using the MusicBrainz XML webservice), receive metadata from MusicBrainz, but don't tag the files you have using the MB field names or whatever the corret name is for the individual tag type ("frame" or "comment name", etc), but use this data to fill in the casual tags (just 'plain' artist, etc) from it (aka "use it as a better freedb"). IMO it doesn't make much sense in using just a part of MB metadata (e.g. using the album artist sortname you wouldn't need hacks like checking if all artists for an album are unique to determine whether it's a compilation or not).

Last Point: You're right, sorry, i diverged on off topic stuff which has nothing to do with this bug anymore , i'll stop posting in here about this, apologies.
Comment 78 Krzysztof Lichota 2006-08-26 11:26:38 UTC
> I may also add some amount of randomness to the filename, since 
> theoretically, two computers named "localhost" that happen to run 
> amarok with the samd PID could present a problem.  Not likely, 
> but then neither is getting struck by lightning, and it happens. 

I think you should even create temporary files names using mktemp() or similar to prevent symlink attacks.
 
Comment 79 Krzysztof Lichota 2006-08-26 12:09:08 UTC
I have reviewed the code. Forget my previous comment - you are using 8-character random marker, so it is OK.

Here are my notes. Forgive me if I understood something wrong :)

MetaBundleSaver::prepareToSave()
- gethostname() does not terminate too long strings with NULL. You have to add \0 at the end manually to prevent reading beyond buffer.
- MD5 sum can be calculated while copying file to temporary location. This will save one pass of reading the file.
- It seems temporary file is not removed when calculating MD5 fails or any other failure happens (unless it is done at higher level). 

MetaBundleSaver::doSave()
- IMO in fail_remove_copy you should rename original file back to original filename even if removing temporary file fails.
- looks like revert=true should be set right after renaming original file. Currently it is possible to leave file renamed if calculating MD5 of renamed file fails and later.

MetaBundleSaver::cleanupSave()
Nothing here.

BTW. You could get rid of a lot of "delete something" trickery if you used something similar to boost::scoped_ptr.
Comment 80 Jeff Mitchell 2006-08-26 21:18:26 UTC
I've put updates in http://websvn.kde.org/trunk/extragear/multimedia/amarok/src/?view=rev&rev=577465

MetaBundleSaver::prepareToSave():
-gethostname() has been fixed; the 32nd character is now explicitly set to a null character. Thanks for catching that.
-MD5 sum is now calculated while copying the file as you suggested.
-The temporary file is removed in cleanupSave(), if not earlier, which (after these changes) is called in the destructor if it's not been run manually before.

MetaBundleSaver::doSave():
-fail_remove_copy will now not return even if removing the temporary file fails, so that if the original file needs to be reverted it is still done.
-revert is now set to true immediately after renaming the original file.

I think this takes care of all the issues you saw.  Thanks for all the help (and if you see anything else please let me know).  Would you feel comfortable using this now?  :-)

Comment 81 Jeff Mitchell 2006-08-29 23:28:17 UTC
Krzysztof, thanks for all your help with this, but some devs have cold feet about writing tags for file tracking (understandably) and the majority has spoken; we're going to use a read-only method that's less robust in terms of when it will or will not lose files, but is read-only so involves no risk.  It also handles read-only collections, which has been a concern.  I will be keeping the code you helped me with so that we can use it for safe saves when we feel it necessary; thanks for all of your help.
Comment 82 Jeff Mitchell 2006-08-29 23:32:01 UTC
Fixed in revision 578666.
Comment 83 Krzysztof Lichota 2006-08-31 21:44:00 UTC
> I think this takes care of all the issues you saw.  Thanks for all the help (and if you see
> anything else please let me know).  Would you feel comfortable using this now?  :-) 

Yes, I would be comfortable. I will try it when it comes out/if it comes out :)

> Krzysztof, thanks for all your help with this, but some devs have cold feet about writing tags
>  for file tracking (understandably) and the majority has spoken; we're going to use a
> read-only method that's less robust in terms of when it will or will not lose files, but is
> read-only so involves no risk.  It also handles read-only collections, which has been a
> concern. 

Certainly, if there is a way to do it reliably without modifying files, go for it.

> I will be keeping the code you helped me with so that we can use it for safe saves when we
> feel it necessary; 

This approach can be used for any modifications of files, not only tags, so it might be useful somewhere in Amarok.

> thanks for all of your help. 

I am glad I could help :)