Bug 301266 - amarok creates duplicate ID when performing full scan of collection
Summary: amarok creates duplicate ID when performing full scan of collection
Status: RESOLVED NOT A BUG
Alias: None
Product: amarok
Classification: Applications
Component: Collections/Local
Version: 2.5.90 (2.6 beta)
Platform: unspecified Linux
Importance: NOR normal
Target Milestone: 2.6
Assignee: Amarok Developers
Keywords: regression
Reported: 2012-06-06 09:35 UTC by maxime.haselbauer
Modified: 2012-09-12 16:26 UTC

Attachments
The error message (251.34 KB, image/png)
2012-06-06 09:36 UTC, maxime.haselbauer

Description maxime.haselbauer 2012-06-06 09:35:28 UTC
When I remove the complete collection database with

rm ~/.kde/share/apps/amarok/mysqle/amarok/*

and rescan everything, Amarok seems to create a duplicate ID for a couple of tracks.


I let the developers decide whether it is related to Bug 290135 (problem with a unique ID tag in the file).
I have looked at the tags with kid3 and haven't seen a "unique id tag", only a "WM/UniqueFileIdentifier" (probably something inherited from Windows Media Player) whose value is completely different from the ID mentioned by Amarok in the error message.

Reproducible: Always
Comment 1 maxime.haselbauer 2012-06-06 09:36:28 UTC
Created attachment 71617
The error message

Note that it is a complete rescan of the collection, so it should not create a duplicate ID.
Comment 2 Matěj Laitl 2012-06-16 11:02:52 UTC
Maxime, Amarok does not "create" unique ids when performing a full rescan of the collection. It merely reads the id from the track (when previously created by a MusicBrainz tagger or amarok_afttagger) or, if there is no such id, it generates it from track meta-data.

Is there a chance you have 2 identical tracks (or tracks with identical unique ids) in your collection? In either case, please start Amarok from the console as `amarok --debug`, perform a full rescan and attach its full output.
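
The question about identical tracks can also be checked outside Amarok. The following is only a minimal sketch, assuming that "identical" means byte-for-byte identical files; it is not how Amarok derives unique ids, and it will not catch two different files that merely carry the same unique id tag. The function name `find_identical_files` is hypothetical.

```python
import hashlib
import os
from collections import defaultdict


def find_identical_files(root):
    """Return groups of paths under root whose file contents are byte-identical.

    Assumption: files are compared by the SHA-1 digest of their contents,
    not by any tag or by Amarok's own unique id.
    """
    by_digest = defaultdict(list)
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            digest = hashlib.sha1()
            with open(path, "rb") as handle:
                # Read in 1 MiB chunks so large audio files are not loaded at once.
                for chunk in iter(lambda: handle.read(1 << 20), b""):
                    digest.update(chunk)
            by_digest[digest.hexdigest()].append(path)
    # Keep only digests that occur for more than one file.
    return [sorted(paths) for paths in by_digest.values() if len(paths) > 1]
```

Calling `find_identical_files` on the collection folder returns one list of paths per group of identical files; an empty result means no byte-identical copies were found.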
Comment 3 maxime.haselbauer 2012-06-17 18:37:02 UTC
Thanks for the feedback, Matěj!

I retried with this morning's git version. It no longer says "double UID", but in debug mode I can see that it raises warnings (always for the same tracks) saying that the id is "already committed".

I don't think that there is a duplicate of those songs in the collection.

Do you know how to retrieve the log output that is produced when I launch amarok --debug (so that I can send it to you)?

I also noticed another problem: sometimes after a scan I only have 4900 songs, then rescanning brings it up to 5300, and then the next day I only have 5100. I guess it is related to this.
Comment 4 Myriam Schweingruber 2012-06-18 13:16:05 UTC
Thank you for the feedback.
Comment 5 Myriam Schweingruber 2012-06-18 15:07:31 UTC
To get the output, please run the following:

amarok -d --nofork 2>> output.txt

and attach the resulting text file after the full rescan.
Comment 6 maxime.haselbauer 2012-06-18 21:24:22 UTC
Oops, that's a shame: thanks to the output I have seen that I indeed had doppelgangers in my collection...
I am marking this as solved; I will see if it also solves the other problems I have.

One suggestion, maybe: could the error message be more explicit?
Instead of saying "duplicate uid", the Amarok collection scanner could show a dialog box saying "uid for file $pathA is already committed for $pathB. $pathA and $pathB might be the same song".
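
The suggested message could be produced along these lines. This is only an illustrative sketch, not Amarok's scanner code: the function name `find_uid_conflicts` and the `(uid, path)` input format are assumptions made for the example.

```python
def find_uid_conflicts(tracks):
    """Return one explicit message per track whose uid was already seen.

    tracks is an iterable of (uid, path) pairs (a hypothetical input format).
    The first path seen for a uid is treated as the committed one.
    """
    committed = {}
    messages = []
    for uid, path in tracks:
        # setdefault stores the path on first sight and returns the stored one.
        first = committed.setdefault(uid, path)
        if first != path:
            messages.append(
                f"uid {uid} for file {path} is already committed for {first}. "
                f"{path} and {first} might be the same song"
            )
    return messages
```

Naming both paths in the message lets the user find the two files directly instead of searching the debug output.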


Anyway thank you for your time!
Comment 7 Myriam Schweingruber 2012-06-18 22:36:54 UTC
Thank you for the feedback.
Comment 8 piedro 2012-09-10 22:02:26 UTC
amarok 2.6 - I have the same problem, making it impossible to reliably scan 27000 songs. And since there is no possibility to add single files to the collection without amarok copying or moving them (I don't want amarok to mess up my folder system!) - I am stuck, and since things worked better before, this sucks now. I wasted too much time on this. It's a showstopper and an ambitious software like amarok should FIRST lay the foundation of a REALLY solid collection management.

thx for your work guys, it's really appreciated, but you should also hire a project manager who's checking for the important stuff ... no offense, just my 2 cents, but I will use another music manager from now on ... sry 

hit gold one day! 
thx for reading, 
piedro
Comment 9 Matěj Laitl 2012-09-12 09:50:50 UTC
(In reply to comment #8)
> amarok 2.6 - I have the same problem making it impossible to reliably scan
> 27000 songs.

Piedro, this is vague as hell. What *exact* problem do you have? Files not showing in collection browser while they should? Other way around? Do you have "Watch folders for changes" enabled? Do you hit "Update Collection" or "Full rescan"? What is the output of `amarok -d --nofork` (while performing the scan)? What does your Diagnostics page say?

> And since there is no possibility to add single files to the
> collection without amarok to copy or move them (I don't want amarok to mess
> up my folder system!)

What??? I don't understand, explain.

> I am stuck and after things worked better before
> this sucks now.

How did this work before and what changed?

> I wasted too much time on this.

I still don't understand what you were doing all the time.

> It's a showstopper and an ambitious software like amarok should FIRST lay the
> foundation of a REALLY solid collection management. 
> thx for your work guys, it's really appreciated, but you should also hire a
> project manager who's checking for the important stuff ... no offense, just
> my 2 cents, but I will use another music manager from now on ... sry 

Piedro, it seems you don't understand FLOSS principles. We are a volunteer project. There's no word like "hire". Or: I'd really like to work as an Amarok project manager, would *you* hire me?
Comment 10 piedro 2012-09-12 14:46:43 UTC
Matej, thx for your clarifications: 

What I did is: 
First: all the files are perfectly and completely tagged with musicbrainz picard. No nonstandard codecs or suspiciously tagged files are included in my collection. Most of my songs are directly ripped from CD, that's why I like the album centric way of collection management.    

I reinstalled amarok completely (including deleting the database). I let amarok perform a full scan of my whole collection. Duplicates are found (though these are different files, different versions of the song, in some cases even different artists), leading to a collection that has many songs missing in albums (one David Bowie album for example showed only songs 2 and 14 of 18 songs). So if I want to add an album to the playlist as a whole I can't do it. So I am basically forced to listen to music trackwise. That is not how I listen to music: I like to listen to an album as a whole (as often intended by the artist). 

I used afttagger to get this additional uniqueness and this took around six hours. (At least, through some errors, I found lost clusters on my hard drive and fixed them.) Before, I had at least one song missing per every 5 albums; altogether about 5% of my files wouldn't show up. Now amarok found (I hope) all of my songs. The next problem is that the songs that were missing before and have now reappeared lack the changes I applied to some albums. For example I didn't want some albums to show up under "Various artists" so I deleted the "Show as compilation" flag on these. Now the reappearing songs still show up as part of a compilation. There are other cases, like I changed the spelling or the genre of an album. These changes haven't been applied to the "new" files. At least it works. But it will take more hours to clean stuff up if I want to have the collection in the same consistent state as before. 

When adding files to my music collection it is troublesome to get amarok to add the files in exactly the same folder structure as I used to build the collection with picard. picard uses some plugins like "don't use subfolder for multiple disks of one album". I have never been able to get stuff managed by amarok in the same way I planned it out and realized it with picard. BTW: I don't know if the picard plugins are used by the integrated musicbrainz client in amarok ... What I mean by "messed up" is that I use amarok to add files and suddenly have two folders for the same artist on my hard drive: "The Corrs" and "Corrs, The". I can give you many more examples but this is getting long. 

Now to make sure that songs I add later don't show as duplicates again I have to run the afttagger again, I assume. That's inconvenient. 


Matej, I fully understand the FLOSS principles. There are many advantages in that. And some disadvantages. It's volunteered work, no clear goals and lots of enthusiasm for some parts of the work ("more features, yeah") and less for others (UI bugs, system integration, communication). 

The KDE project uses Amarok to advertise the multimedia management capabilities of their desktop, the amarok project advertising is even more explicit. That doesn't match the reality. If KDE/Amarok want to advertise their advantages they have to be as clear about the disadvantages and restrictions. I never read: "Amarok is a great track-centric player that doesn't allow for duplicates in albums (be aware!)". I always thought that the databased nature of the collection management would help to preserve my collection data without rereading it ever. But for every problem I try to solve with amarok, the first tip I get is to clear the database. That turns people off!

Yes I think you are right: there should be funding to hire a project manager. Whether it's you I don't know ... the idea is to have someone with the right qualification, which is basically the opposite of a technical skill. But I don't think money solves the flaw. For projects like this someone has to do the dirty unpleasant work (he/she should get paid) and someone has to make sure things are what the users (even FLOSS expects happy users) need to be able to do THEIR work THEIR way. Otherwise things stay as they are and we have great ideas and great software that just doesn't work flawlessly enough to be productive with it. Those are showstoppers. (look at kmail/kontact, akonadi, plasma, amarok, kmymoney ...) 

Let me note one more thing: Me and many people I know would love to pay for a solid system integrated PIM solution or a good Webeditor suite, financial software or ERP, even a music collection manager (while staying with GNU/Linux - I don't see any alternative to that). But it has to prove its quality. And perceived quality is very different from a user's or a developer's standpoint. 

thx for reading and thanks for your continuing effort to improve software, 
piedro
Comment 11 Matěj Laitl 2012-09-12 15:11:06 UTC
(In reply to comment #10)
> What I did is: 
> First: all the files are perfectly and completely tagged with musicbrainz
> picard. No nonstandard codecs or suspiciously tagged files are included in
> my collection. Most of my songs are directly ripped from CD, that's why I
> like the album centric way of collection management.    
> 
> I reinstalled amarok completely (including deleting the database). I let
> amarok perform a full scan of my whole collection. Duplicates are found
> (though these are different files, different versions of the song, in some
> cases even different artists), leading to a collection that has many songs
> missing in albums (one David Bowie album for example showed only song 2 and
> 14 of 18 songs). So if I want to add an album to the playlist as whole I
> can't do it. So I am basically forced to listen music trackwise. That is not
> how I listen music: I like to listen to an album as whole (as often intended
> by the artist). 
> 
> I used afttagger to get this additional uniqueness and this took around six
> hours. (At least, through some errors, I found lost clusters on my hard drive
> and fixed them.) Before, I had at least one song missing per every 5 albums;
> altogether about 5% of my files wouldn't show up. Now amarok found (I hope)
> all of my songs.

So, the problem vanished, right? Using afttagger isn't necessary. You should've reported the problem before using it *and* provided the debug log. Now we cannot see what happened.

> The next problem is that the songs that have been missing
> before and now reappeared missed the changes I applied to some albums. For
> example I didn't want some albums to show up under "Various artists" so I
> deleted the "Show as compilation" flag on these. Now the reappearing songs
> still show up as part of a compilation.

Result of you deleting the database, we cannot do anything about it.

> There are other cases, like I
> changed the spelling or the genre of an album. These changes haven't been
> performed with the "new" files. At least it works. But it will take more
> hours to clean stuff up if I want to have the collection in the same
> consistent state as before.

This should have been saved to the files. (unless they're read-only, you compiled Amarok w/out TagLib or activated hidden option not to write tags back to files) If they're read-only, we can do nothing about this as it is a logical consequence of you deleting the database.

> When adding files to my music collection it is troublesome to get amarok to
> add the files in exactly the same folder structure as I used to build the
> collection with picard. picard uses some plugins like "don't use subfolder
> for multiple disks of one album". I have never been able to get stuff
> managed by amarok in the same way I planned it out and realized it with
> picard.

This is an entirely different thing. You're just complaining that Amarok doesn't have the same feature set as Picard. It never will; it is a music *player*, not an organizer. You can organize all your files using Picard; Amarok picks up everything you put into its configured collections.

> BTW: I don't know if the picard plugins are used by the integrated
> musicbrainz client in amarok ... What I mean by "messed up" is I use amarok
> to add files and suddenly have two folders for the same artist on my
> harddrive "The Corrs" and "Corrs, The". I can give you many more examples
> but this is getting long.

Certainly not Amarok's fault, it just uses your tags. If they're messed up, it can do nothing about it. Furthermore, it gives you a preview of the resulting locations.

> Now to make sure that songs I add later don't show as duplicates again I
> have to run the afttagger again I assume. that's inconvenient.

Another false assumption you have. This just isn't true.

> Matej, I fully understand the FLOSS principles. There's many advantages in
> that. And some disadvantages. It's volunteered work, no clear goals and
> lots of enthusiasm for some parts of the work ("more features, yeah") and
> less for others (UI bugs, system integration, communication). 
> 
> The KDE project uses Amarok to advertise the multimedia management
> capabilities of their desktop, the amarok project advertising is even more
> explicit. That doesn't match the reality. If KDE/Amarok want to advertise
> their advantages they have to be as clear about the disadvantages and
> restrictions. I never read: "Amarok is a great track-centric player that
> doesn't allow for duplicates in albums (be aware!)". I always thought that
> the databased nature of the collection management would help to preserve my
> collection data without rereading it ever. But every problem I try to solve
> with amarok first tip I get is to clear the database. That turns people off!

By who? We advise to clear the database (with backup!) to *diagnose* the problem, not to *solve* it.
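
A backup before clearing the database can be as simple as copying the database directory elsewhere. The following is a minimal sketch, not an official Amarok tool: the function name `backup_directory` and its parameters are hypothetical, the source path would be the directory named in the description (`~/.kde/share/apps/amarok/mysqle/amarok`), and it is assumed that Amarok is not running while the copy is made.

```python
import shutil
import time
from pathlib import Path


def backup_directory(source, backup_root, label=None):
    """Copy the directory `source` into `backup_root` and return the new path.

    Assumption: the copy is named "<source name>-<label>", where label
    defaults to a timestamp. copytree refuses to overwrite an existing copy.
    """
    source = Path(source).expanduser()
    label = label or time.strftime("%Y%m%d-%H%M%S")
    target = Path(backup_root).expanduser() / f"{source.name}-{label}"
    shutil.copytree(source, target)  # raises FileExistsError if target exists
    return target
```

With such a copy in place, the database can be cleared to diagnose a problem and the previous state can be restored afterwards by copying the directory back.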

> Yes I think you are right: there should be funding to hire a project
> manager. Whether it's you I don't know ... the idea is to have someone with
> the right qualification which is basically the opposite of a technical
> skill. But I don't think money solves the flaw. For projects like this
> someone has to do the dirty unpleasant work (he/she should get paid) and
> someone has to make sure things are what the users (even FLOSS expects happy
> users) need to be able to do THEIR work THEIR way. Otherwise things stay as
> they are and we have great ideas and great software that just doesn't work
> flawless enough to be productive with it. Those are showstoppers. (look at
> kmail/kontact, akonadi, plasma, amarok, kmymoney ...) 
> 
> Let me note one more thing: Me and many people I know would love to pay for
> a solid system integrated PIM solution or a good Webeditor suite, financial
> software or ERP even a music collection manager (while staying with
> GNU/Linux - I don't see any alternative to that). But it has to prove its
> quality. And perceived quality is very different from a user's or a developer's
> standpoint.

Completely unrelated to this bug report.

> thx for reading and thanks for your continuing effort to improve software, 
> piedro

Unfortunately, reading your essays is getting harder and harder. The only useful sentence in your comment was "Now amarok found (I hope) all of my songs." along with the fact that you've removed your database, which means you cannot help us in any way to diagnose the problem now. Leaving the bug as invalid.

Note: I will now only respond to comments that *directly relate* to this bug so that I can spend my time on other bugs rather than with reading random rants.
Comment 12 piedro 2012-09-12 16:26:35 UTC
Thx for your time. 

But you are right: we are wasting it. Plz forget my amarok problem - you are right here also - it is completely nonexistent and my fault in every aspect. I apologize for the randomness of my concerns. I am not as well trained in rant-free-up-to-the-point-problem-solving discussions as you are. 

I learned: the bug is invalid, my concerns are invalid, my comments wrong, my writing plainly nonsense (apart from one sentence), problems are unrelated to their context, the sun always shines if I close my eyes and I promise to never even try to contribute to a bug report on amarok again! 

thx for you not commenting unrelated, 
piedro