Bug 428772 - JuK painful folder scanning at EVERY (re)start
Summary: JuK painful folder scanning at EVERY (re)start
Status: RESOLVED FIXED
Alias: None
Product: juk
Classification: Applications
Component: general
Version: 20.08
Platform: Arch Linux Linux
Importance: NOR major
Target Milestone: ---
Assignee: Scott Wheeler
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2020-11-06 23:01 UTC by ivan.planinar
Modified: 2021-03-26 02:00 UTC

See Also:
Latest Commit:
Version Fixed In: 21.04


Description ivan.planinar 2020-11-06 23:01:39 UTC
SUMMARY

*Every* time JuK is (re)started, it rescans my massive collection of music files. It is most frustrating after a reboot, when the scan hammers my HDD. The delay is long: for almost a minute I cannot use the program.


STEPS TO REPRODUCE
1. Have a big music collection.
2. Start JuK.

OBSERVED RESULT

Massive delays, painful sound in the HDD.

EXPECTED RESULT

NO scanning. Once the collection has been scanned, it should be enough to watch for new files. Alternatively, make the scan faster, or at least add an option to disable re-scanning altogether.

The current situation is so very frustrating that I'm planning to stop using JuK entirely if this isn't solved somehow. 

Sorry guys I love your app, but Quality of Life is much needed here.
Comment 1 Michael Pyne 2021-03-26 02:00:36 UTC
Git commit d6b28a9b4c8e21a0b9ccd5bb7585091e501d94ab by Michael Pyne.
Committed on 26/03/2021 at 01:48.
Pushed by mpyne into branch 'master'.

tag_scan: Fix painful rescan of music metadata on startup.

For the longest time, JuK has suffered from a problem where its intended
behavior to load music metadata from a cached database, instead of
re-scanning every individual track on startup, was not working right.

There have been debugging lines in JuK going all the way back to 2013
that try to trace which part of the startup sequence was taking up all
the time, but they did not help me fix the bug.

The Problem
===========

Recently I took a different approach, of adding a debug-only crash
whenever we would load a music track tag the "slow" way, and long story
short there were two bugs that each would cause slowdown:

1. Playlists aside from the CollectionList would cause every music track
   in that playlist to be re-scanned. What this means is that even
   though the music in the CollectionList would be loaded quickly, if
   you had that same music track in a separate Playlist, that music
   track would reload the same tags from disk rather than copying from
   the existing CollectionList item.  This was especially bad for users
   of the old "tree mode" view, since every individual artist *and*
   album was rendered as an individual playlist, each of which would
   therefore re-scan the music over and over again.
2. JuK supports a "folder scan" feature, and in fact really wants the
   user to have at least one folder assigned. Any music identified in
   this folder is added to the CollectionList automatically on startup
   and, you guessed it, causes the music track information to be loaded
   from disk, even if the music was already in the CollectionList! :(

The net effect is that most music would be re-scanned on startup unless
you were a user who used CollectionList exclusively, and had most of
your music not visible to the folder scanner.
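
The debug-only crash used to find these two bugs might look roughly like
the sketch below. It is an illustration only: the macro
TRAP_SLOW_TAG_LOADS, the TrackTag struct and loadTagFromDisk() are
hypothetical names, not JuK's actual code, and the real trap may have
been placed differently.

```cpp
#include <cassert>
#include <cstdlib>
#include <iostream>
#include <string>

// Hypothetical stand-in for a track's tag data; not JuK's real type.
struct TrackTag {
    std::string path;
    bool fromCache;
};

// Counts slow loads seen during startup, so builds without the trap
// can still report how often the slow path was taken.
int g_slowLoadsDuringStartup = 0;

// Hypothetical "slow path": stands in for parsing tags from the file.
TrackTag loadTagFromDisk(const std::string &path, bool startupFinished)
{
#ifdef TRAP_SLOW_TAG_LOADS
    // Debug-only trap: abort on the first slow load during startup so
    // the debugger's backtrace shows exactly which caller triggered it.
    if (!startupFinished) {
        std::cerr << "slow tag load during startup: " << path << '\n';
        std::abort();
    }
#endif
    if (!startupFinished)
        ++g_slowLoadsDuringStartup;

    return TrackTag{path, /* fromCache = */ false};
}
```

Building with -DTRAP_SLOW_TAG_LOADS turns every unexpected slow load
into an immediate crash with a usable backtrace, while a normal build
only counts the event.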

The Solution
============

Due to how painful this problem has been, I had ended up adding a
threaded solution for the folder scan process. This didn't help make
things any faster but at least the GUI wasn't frozen. But now that the
threading code is present I judged it would be easier and safer to make
the central object holding track metadata (CollectionList's m_itemsDict)
available in thread-safe fashion.
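
As a rough illustration of what "available in thread-safe fashion" can
mean, the sketch below guards a path-to-metadata map with a mutex so
that a scanner thread and the GUI thread can both query it. TrackIndex
and TrackInfo are hypothetical names, and the sketch uses the standard
library rather than the Qt types JuK is built on; the actual change to
m_itemsDict may be structured differently.

```cpp
#include <cassert>
#include <map>
#include <mutex>
#include <string>
#include <utility>

// Hypothetical metadata record; JuK's real items hold far more.
struct TrackInfo {
    std::string title;
};

// A path -> metadata dictionary whose every access takes the same mutex.
class TrackIndex {
public:
    void insert(const std::string &path, TrackInfo info)
    {
        std::lock_guard<std::mutex> lock(m_mutex);
        m_items[path] = std::move(info);
    }

    // Copies the entry out while the lock is held, so the caller never
    // keeps a reference into the map after the lock is released.
    bool lookup(const std::string &path, TrackInfo &out) const
    {
        std::lock_guard<std::mutex> lock(m_mutex);
        auto it = m_items.find(path);
        if (it == m_items.end())
            return false;
        out = it->second;
        return true;
    }

private:
    mutable std::mutex m_mutex;
    std::map<std::string, TrackInfo> m_items;
};
```

Returning a copy rather than a pointer is the design choice that
matters here: a pointer into the map could be invalidated by another
thread as soon as the lock is dropped.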

This then permitted me to check for whether a track has already been
loaded when performing folder scan, and to check whether a track has
already been loaded when creating a new (non-CollectionList) Playlist.
In either event if the track already exists, then we copy the FileHandle
rather than create a new one.
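
The "copy the existing handle instead of creating a new one" check can
be sketched as a lookup-or-load cache. This is a minimal sketch under
assumptions: HandleCache, TagData and readTagsFromDisk() are
hypothetical, and std::shared_ptr merely stands in for whatever sharing
JuK's FileHandle does internally.

```cpp
#include <cassert>
#include <map>
#include <memory>
#include <mutex>
#include <string>

// Hypothetical tag payload; stands in for the data behind a FileHandle.
struct TagData {
    std::string path;
};

int g_diskReads = 0; // counts simulated slow loads

// Hypothetical slow path: stands in for parsing the file's tags.
std::shared_ptr<TagData> readTagsFromDisk(const std::string &path)
{
    ++g_diskReads;
    return std::make_shared<TagData>(TagData{path});
}

class HandleCache {
public:
    // Returns the already-loaded handle if there is one; otherwise
    // loads from disk once and remembers the result.
    std::shared_ptr<TagData> handleFor(const std::string &path)
    {
        // Holding the lock across the disk read keeps the sketch simple;
        // it also serializes loads, which real code may want to avoid.
        std::lock_guard<std::mutex> lock(m_mutex);
        auto it = m_loaded.find(path);
        if (it != m_loaded.end())
            return it->second; // cheap copy of an existing handle

        std::shared_ptr<TagData> handle = readTagsFromDisk(path);
        m_loaded.emplace(path, handle);
        return handle;
    }

private:
    std::mutex m_mutex;
    std::map<std::string, std::shared_ptr<TagData>> m_loaded;
};
```

With this shape, a folder scan and any number of extra playlists that
mention the same file all end up sharing one load from disk.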

The combination speeds up loading significantly, taking anywhere from
60% to 70% off of the total time to load on my system, with mostly a
CollectionList under folder scan and few additional playlists. In this
configuration I go from about 5.4 seconds to 1.5 seconds with cold
caches. The difference should be even more stark on systems where disk
I/O is expensive, or where there are a great number of tracks in
playlists outside of the CollectionList.

I consider this a bugfix (and there are even multiple bug reports) so I
will backport shortly.

CHANGELOG:Reduce startup time by 60-70% or more.
Related: bug 317666
FIXED-IN:21.04

M  +58   -25   collectionlist.cpp
M  +5    -3    collectionlist.h
M  +11   -3    directoryloader.cpp
M  +6    -1    filehandle.cpp
M  +10   -1    playlist.cpp

https://invent.kde.org/multimedia/juk/commit/d6b28a9b4c8e21a0b9ccd5bb7585091e501d94ab
Comment 2 Michael Pyne 2021-03-26 02:00:53 UTC
Git commit a65a4e8a037bae5d2731267b60ed61bf09526413 by Michael Pyne.
Committed on 26/03/2021 at 01:52.
Pushed by mpyne into branch 'release/21.04'.

tag_scan: Fix painful rescan of music metadata on startup.

(cherry picked from commit d6b28a9b4c8e21a0b9ccd5bb7585091e501d94ab)

M  +58   -25   collectionlist.cpp
M  +5    -3    collectionlist.h
M  +11   -3    directoryloader.cpp
M  +6    -1    filehandle.cpp
M  +10   -1    playlist.cpp

https://invent.kde.org/multimedia/juk/commit/a65a4e8a037bae5d2731267b60ed61bf09526413