Bug 205284 - Duplicate entries in dynamic playlist
Status: RESOLVED DUPLICATE of bug 175172
Alias: None
Product: amarok
Classification: Applications
Component: Playlists/Dynamic Playlists
Version: 2.3-GIT
Platform: Compiled Sources Unspecified
Importance: NOR wishlist
Target Milestone: ---
Assignee: Amarok Developers
URL:
Keywords:
Duplicates: 206758 209121 240518
Depends on:
Blocks:
 
Reported: 2009-08-27 04:56 UTC by Mikael Lammentausta
Modified: 2010-12-13 11:33 UTC
CC List: 9 users

Description Mikael Lammentausta 2009-08-27 04:56:03 UTC
Version:           2.2-GIT (using KDE 4.3.0)
Installed from:    Compiled From Sources

When I am playing a track and using a custom bias (especially from Echo Nest) to create a dynamic playlist, there are times when just a single track is added ~10 times to the playlist.

This happens when the track is a type of music that doesn't have many similar tracks in my collection, so the custom bias has only a few tracks to select from.

At worst, just a single track is added to the playlist many times. Sometimes I get many different tracks, with one track repeating many times.

Please filter the dynamic playlist with a uniq() function that removes duplicates.
Comment 1 Leo Franchi 2009-08-27 05:08:52 UTC
however, what would happen then? if there are duplicates, that's usually because the set of matching tracks is really small. the dynamic playlist sometimes wouldn't even be able to fill the 10-track list.
Comment 2 Mikael Lammentausta 2009-08-27 05:16:56 UTC
Yes.

IMO if the number of matching tracks is < 10, it would be friendlier to add each track only once, even if that means just 1 or 0 tracks are added to the playlist.

If I only get one track, I am not very likely to actually *play* the track 10 times. ;) I'd much rather repopulate the dynamic playlist based on that single track from the selection.
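
A minimal sketch of the "add each track only once" behaviour requested above. This is illustrative Python, not Amarok code; the function name `unique_fill` and the limit of 10 are assumptions made for the example.

```python
def unique_fill(candidates, limit=10):
    """Return up to `limit` tracks from `candidates`, each at most once,
    in first-seen order. The result is simply shorter when fewer
    distinct tracks are available."""
    seen = set()
    result = []
    for track in candidates:
        if track in seen:
            continue  # skip a track that is already in the playlist
        seen.add(track)
        result.append(track)
        if len(result) == limit:
            break
    return result
```

With only two distinct matches, the playlist ends up with two entries instead of ten repeats.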
Comment 3 Myriam Schweingruber 2009-09-09 00:11:52 UTC
*** Bug 206758 has been marked as a duplicate of this bug. ***
Comment 4 Marcos David 2009-09-09 10:49:13 UTC
Hi, I have a large collection (>6000 files) and a lot of similar artists in my collection.
Sometimes I get duplicates if the same song is on different albums.
Sometimes the same song from the same album gets repeated.

To simplify the problem, imagine I have only 3 albums from 3 similar artists, each with ~10 tracks.
In theory I could create 3 playlists with 10 songs each, without repeating a single song.
But what happens is that in my first 10-song playlist the same song gets repeated 2 or 3 times, even though I have other tracks from the same artist.
Comment 5 Myriam Schweingruber 2009-10-01 18:10:37 UTC
*** Bug 209121 has been marked as a duplicate of this bug. ***
Comment 6 alphe323 2009-11-29 11:45:57 UTC
I experience the same issue with dynamic playlists.

My dynamic playlist is defined as:
- Proportional Bias, Proportion 50%, Score > 80
- Proportional Bias, Proportion 33%, Rating > no star

I have a large collection with more than 5000 songs, but sometimes some songs are repeated constantly (e.g. every 3-5 songs). They are not always the same songs, and a restart does not help.

Also, even if I add a Proportional Bias with "Last played" less than today and proportion 100%, the duplicates continue.
Comment 7 Myriam Schweingruber 2009-11-29 14:00:29 UTC
Well, the existing algorithm uses randomness, so it's perfectly normal to have duplicate entries when something is random. You can remove the duplicates with the Option "Remove Duplicates" in the playlist menu, already implemented in Amarok 2.2-git.
Comment 8 alphe323 2009-11-29 22:15:01 UTC
Let me try to give you an example to explain that something goes wrong with the randomness:

In a 5000-song collection I create a dynamic playlist of 20 songs.

A song (different each time) appears 4 times in this playlist, and when I use "Remove Duplicates" this song is one of the songs added as a replacement for the duplicates.

Even if I remove it manually, it is added again the next time a song must be chosen by the random algorithm.

As you can see, this is not so random.

Please note that this specific song bypasses the "Last played" rule that I mentioned above.
Comment 9 Valerian 2010-01-20 04:33:11 UTC
same problem here.
sometimes the dynamic playlist generator appears to be biased.
i cannot find the trigger (it can generate really random playlists for a long time but then suddenly becomes biased).
one track (a different one each time) appears frequently in the playlist (filling almost 50% of it, often several times in a row, and it reappears if removed).

some reasoning:
i have at least 2 matching tracks (in reality more).
i often see ~5 copies of the same track in a row.
the probability of such an event is (1/2)^5, just ~3% (in reality less).
yet this event occurs often. so

i think this is not a "bad random generator" but a "not random at all" generator (and not a "wanted" feature, but one that is sometimes "not working").
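
The arithmetic above can be checked with a short sketch. It assumes independent, uniform picks from the pool of matching tracks, which is an assumption about how a fair generator would behave, not a statement about Amarok's code; `run_probability` is a hypothetical helper name.

```python
def run_probability(pool_size, run_length):
    """Chance that `run_length` independent uniform picks from a pool of
    `pool_size` tracks all land on one given track."""
    return (1.0 / pool_size) ** run_length


# Two matching tracks, five picks in a row of the same given track.
print(run_probability(2, 5))  # 0.03125
```

A larger pool makes such a run far less likely still, which is why seeing it often points away from plain bad luck.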
Comment 10 Heinz Wiesinger 2010-03-13 19:13:22 UTC
A feature wish I reported a while ago is related to this. It's bug 226291.

I think a new bias, as I explained in my feature wish, would be the best solution here, since some people may like to have the same song appear in the playlist more than once (for example if the collection is small and the playlist very long; a corner case, but possible).
Comment 11 Heinz Wiesinger 2010-03-13 19:14:54 UTC
damn, typo. It's bug 226971
Comment 12 Leo Franchi 2010-03-13 20:07:25 UTC
someone could write a fuzzy bias that severely penalizes playlists with duplicates.

this would make it much less likely that duplicates appear...


EXCEPT 

in the case where dups appear because there are no other tracks to use. in that case no matter how poorly the playlist is scored, there can't be any other way to make it better, so there will still be dups.
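
A rough sketch of the kind of penalty such a bias could compute, and of the exception described above. This is illustrative Python, not Amarok's bias API; both function names and the scoring formula are assumptions.

```python
def duplicate_penalty(playlist):
    """Fraction of entries that repeat an earlier track.
    0.0 means every entry is distinct."""
    if not playlist:
        return 0.0
    return 1.0 - len(set(playlist)) / len(playlist)


def best_possible_penalty(pool_size, playlist_length):
    """Lowest penalty any playlist of this length can reach when only
    `pool_size` distinct tracks match the other biases."""
    if playlist_length == 0:
        return 0.0
    distinct = min(pool_size, playlist_length)
    return 1.0 - distinct / playlist_length
```

With 3 matching tracks and a 10-track playlist the best reachable penalty is still about 0.7, so no amount of scoring removes the duplicates in that case.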
Comment 13 Mikael Lammentausta 2010-03-13 22:55:30 UTC
(In reply to comment #12)
> EXCEPT 
> 
> in the case where dups appear because there are no other tracks to use. in that
> case no matter how poorly the playlist is scored, there can't be any other way
> to make it better, so there will still be dups.

It is not very nice to have duplicates, since the playlist may then get extremely repetitive in some cases.

Rather, when there are not enough distinct songs to add to the playlist, I would like the playlist to add only the available songs, if any. The playlist could then stop automatically after those few tracks, or stay shorter while matching tracks are limited, until more unique matches are found.

It would also be possible to calculate extra tracks from the unique results obtained in the first pass: query again with the tracks in the playlist until enough unique tracks have been fetched.

If there are still not enough tracks, either suggest that the user broaden the genre (wishful thinking?) or just add the unique tracks and let playback stop.

This is how I feel after using Amarok and the dynamic playlist feature on a large collection of various genres, with Echo Nest and Last.fm suggestions, for over a year. Often I just select a "begin genre" and let the dynamic playlist keep going for maybe 10+ hours. It works very well except for these duplicates. I don't want the player to obsess over a specific song, really.
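
The re-query idea can be sketched as follows. This is illustrative Python under stated assumptions: `find_similar` is a hypothetical callback standing in for whatever returns matching tracks for a given track (for example a similarity lookup), and `target` and `max_rounds` are made-up parameters.

```python
def fill_with_unique(seeds, find_similar, target=10, max_rounds=5):
    """Collect up to `target` distinct tracks, re-querying with the tracks
    found so far. Returns a shorter list when matches run out."""
    playlist = []
    seen = set(seeds)          # never re-add the tracks we started from
    frontier = list(seeds)
    for _ in range(max_rounds):
        next_frontier = []
        for seed in frontier:
            for track in find_similar(seed):
                if track in seen:
                    continue
                seen.add(track)
                playlist.append(track)
                next_frontier.append(track)
                if len(playlist) == target:
                    return playlist
        if not next_frontier:
            break              # nothing new found: give up, stay short
        frontier = next_frontier
    return playlist
```

The result may be shorter than `target`; the caller can then stop playback or suggest broader criteria rather than pad the list with repeats.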
Comment 14 ghaverla 2010-04-13 00:16:59 UTC
The Debian maintainer (Modestas Vainius) has pointed me to your bug report.  I've been documenting my own problems with this in Debian bug 576016.

This has only been happening on my computer for a couple of weeks.  What is currently installed is 2.3.0, but I think I've had 1 upgrade since this began.

This has all been with Dynamic Playlists.  I've had it happen with Last.fm suggestions on Artists, Last.fm suggestions on Tracks, and Echo Nest.  When I first noticed it, I had the proportion at over 80%.  It last happened at 72%.

The song that has been repeating has occasionally been popular and sometimes obscure.  The repetition pattern has not been consistent.  I've seen runs of the same song of length about 10.  Sometimes it is only a single other song breaking up a run, sometimes a small number, sometimes quite a few (15+).  The other songs involved in the repetitious behavior are sometimes the same, and sometimes change.

I've got just under 3000 tracks in my library.

I've deleted the database twice (after backup).

This feels like bad use of random numbers (I've been doing stochastic models in engineering since 1984, mostly in FORTRAN).

What sort of crept into my head a day or so ago, was maybe you are using /dev/random (or /dev/urandom) as the source of random numbers.  Yesterday I tried looking around the Git, but none of the filenames look like an obvious spot for the random component.  Most of the time, I've seen this repetition when I've been sleeping (I have the music turned on low to sleep).  If you are drawing from /dev/random, maybe the system is running out of entropy?  I've got two Boinc projects (Seti and ClimatePrediction) running all the time.  I've no idea if either of them is drawing from /dev/random, but they might need random numbers.

For something like randomizing decks of playing cards, or playing music, a high quality RNG probably isn't needed.  Drawing a number from /dev/random to use in seeding sort of makes sense, but once running just about any good RNG (such as Numerical Recipes) would work.
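
A minimal sketch of "seed once from the system entropy source, then use an ordinary RNG", written in Python for illustration. It says nothing about what Amarok actually does; `make_rng` is a hypothetical helper. `os.urandom` reads from the operating system's random source, and `random.Random` is Python's Mersenne Twister generator.

```python
import os
import random


def make_rng():
    """Seed a general-purpose PRNG once from the operating system's
    entropy source; all later draws come from the PRNG itself."""
    seed = int.from_bytes(os.urandom(8), "big")
    return random.Random(seed)


rng = make_rng()
tracks = ["a", "b", "c", "d"]
shuffled = tracks[:]
rng.shuffle(shuffled)  # no further reads from the OS entropy source
```

Because the entropy source is consulted only once at startup, playback would not depend on how much entropy the system has left later on.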

In Dynamic Playlist mode (with suggestions), amarok sends a query to the other end, asking for a suggestion.  Ideally, we always have a local copy of what has been suggested.  We then draw a uniform deviate, and if the deviate lies between 0 and the set proportion, we accept the suggestion and play it.  Otherwise we play something locally that wasn't suggested.  What happens if we do not have the suggestion in our local archive, but we are supposed to play the suggested song (from the value of the deviate)?  Presumably we pick something local.
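
The selection procedure guessed at above could look roughly like this. It is a sketch of the commenter's description, not of Amarok's implementation; `pick_next` and its parameters are hypothetical, and falling back to a local track when the suggestion is missing is the presumption stated above.

```python
import random


def pick_next(suggestion, local_tracks, proportion, rng=random):
    """Play the suggested track with probability `proportion` if it exists
    locally; otherwise play a local track that was not suggested."""
    others = [t for t in local_tracks if t != suggestion]
    use_suggestion = rng.random() < proportion  # uniform deviate in [0, 1)
    if use_suggestion and suggestion in local_tracks:
        return suggestion
    if others:
        return rng.choice(others)
    # Nothing else is available locally: fall back to the suggestion if
    # we have it, else report that there is nothing to play.
    return suggestion if suggestion in local_tracks else None
```

Note that if the suggestion service keeps returning the same track and the proportion is high, this procedure repeats that track no matter how good the RNG is.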

How do we choose something local?  If amarok has its own RNG, depending on what kind of RNG, we probably don't want to go mucking about with seeding it.  Most of the easy RNGs do not have uniform cycle lengths as a function of seed.  A typical problem is giving it a seed of 0.  Hence, your RNG operates with a given seed, or one of a few.  When you get your random integer (N bytes), you do something like XOR it with the seed you got from /dev/random.  This way, you operate your RNG with maximum cycle length.  What operation is done to the random deviate to "combine" it with the seed from /dev/random when amarok started, I haven't looked into yet.  I suppose if I knew where the code was, I could look into it a little.

Sorry, this is too long.
Comment 15 Myriam Schweingruber 2010-08-11 08:38:30 UTC
*** Bug 240518 has been marked as a duplicate of this bug. ***
Comment 16 Tristan Miller 2010-11-07 13:57:49 UTC
*** This bug has been confirmed by popular vote. ***
Comment 17 Myriam Schweingruber 2010-12-13 11:33:23 UTC

*** This bug has been marked as a duplicate of bug 175172 ***