Bug 402029 - Find New Items process takes significantly longer to finish on 6.0.0
Summary: Find New Items process takes significantly longer to finish on 6.0.0
Status: RESOLVED FIXED
Alias: None
Product: digikam
Classification: Applications
Component: Database-Sqlite (show other bugs)
Version: 6.0.0
Platform: Ubuntu Linux
: NOR normal
Target Milestone: ---
Assignee: Digikam Developers
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2018-12-11 23:53 UTC by SkyDiver
Modified: 2018-12-18 17:43 UTC (History)
2 users (show)

See Also:
Latest Commit:
Version Fixed In: 6.0.0


Attachments
digikam-5.9.0-01-x86-64.appimage output (13.92 KB, text/plain)
2018-12-14 10:32 UTC, SkyDiver
Details
digikam-6.0.0-beta3-20181208T111552-x86-64.appimage output (13.51 KB, text/plain)
2018-12-14 10:33 UTC, SkyDiver
Details
6.0.0-beta3 DB lock log (85.70 KB, text/plain)
2018-12-14 13:45 UTC, SkyDiver
Details

Note You need to log in before you can comment on or make changes to this bug.
Description SkyDiver 2018-12-11 23:53:48 UTC
STEPS TO REPRODUCE
1. Open Digikam
2. Wait for Find New Items process (that runs at startup) to finish
 

OBSERVED RESULT
On 6.0.0-beta3(AppImage) the process takes 2:31 minutes to finish.

EXPECTED RESULT
On 5.9.0(AppImage) (as well as 5.6.0 native-install) the process takes 7 seconds to finish.


SOFTWARE/OS VERSIONS
Linux: Ubuntu 18.04
KDE Frameworks Version: KDE Frameworks 5.51.0
Qt Version: Qt 5.9.7 (built against 5.9.7)
The xcb windowing system
Comment 1 Maik Qualmann 2018-12-12 07:13:16 UTC
Do you use SQLite or MySQL? How many items/images does your collection have? MySQL has become even faster, but we now recognize changes in sidecars and test for each file whether a sidecar is present. But that does not really make a difference in my speed here. My scarcely 50000 pictures are always scanned in 2-3 seconds. Is your collection on a network drive?

Maik
Comment 2 Maik Qualmann 2018-12-12 11:40:00 UTC
Git commit 2dbaa83b80de60e524c042b86c11ae860d52e2a5 by Maik Qualmann.
Committed on 12/12/2018 at 11:38.
Pushed by mqualmann into branch 'master'.

scan only for sidecars when read from sidecar is enabled

M  +4    -3    core/libs/database/collection/collectionscanner_scan.cpp
M  +3    -2    core/libs/database/item/scanner/itemscanner.cpp

https://commits.kde.org/digikam/2dbaa83b80de60e524c042b86c11ae860d52e2a5
Comment 3 SkyDiver 2018-12-12 18:14:03 UTC
I already got burnt once for switching to MySQL (tags got messed up AFAIR) so I'm running the SQLite DB arch.
The DB resides on an internal SSD HDD.
I have ~37K images in this DB instance.
Comment 4 Maik Qualmann 2018-12-13 20:24:48 UTC
I suspect a problem with the AppImage, in your configuration it should not be 5 seconds. Here with the AppImage and openSUSE Tumbleweed but I have no problems, there are no other changes in the digiKam6 New Item Scan, which could explain such a behavior. Is there any messages on the console when the scan is in progress?

Maik
Comment 5 SkyDiver 2018-12-14 10:32:10 UTC
Created attachment 116914 [details]
digikam-5.9.0-01-x86-64.appimage output
Comment 6 SkyDiver 2018-12-14 10:33:30 UTC
Created attachment 116915 [details]
digikam-6.0.0-beta3-20181208T111552-x86-64.appimage output
Comment 7 SkyDiver 2018-12-14 10:39:42 UTC
(First time attaching files here so I wasn't sure if they add up to a single comment or not.. guess they're not)

Well, I attached the log files of running 5.9.0 and 6.0.0-beta3.
Both run as an AppImage so I don't see AppImage being the root cause.

The scanning durations are consistent.

To summarize, here's the 5.9.0 result:
digikam.database: Complete scan took: 7366 msecs.

And here's 6.0.0-beta3's result:
Digikam::CollectionScanner::completeScan: Complete scan took: 152714 msecs.


Without probing code diffs, it looks like the scanning code has been refactored.
Comment 8 Maik Qualmann 2018-12-14 12:36:55 UTC
Tested here with about 30000 files and SQLite on HDD and image collection on HDD:

AppImage-5.9.0: 1800-2000 ms
AppImage-6.0.0: 800-1000 ms

Let's see what the next AppImage looks like without looking for sidecars when it's not activated. Which file system do you use?

Maik
Comment 9 SkyDiver 2018-12-14 13:43:48 UTC
Local system: ext4
Remote (NAS): cifs mount
Comment 10 SkyDiver 2018-12-14 13:45:09 UTC
Created attachment 116917 [details]
6.0.0-beta3 DB lock log

One important thing to add:
Performing any operation, even scrolling the thumbnail view, will lock the DB.

I tested the case of opening an image as soon as the UI shows up.

In 5.9.0 the UI hangs for a few short seconds and then goes into preview mode.

In 6.0.0-beta3, the UI hangs for a very long period of time.
Attached if the log of one test I recoreded: It hanged for ~1.5 minutes, during which the progress bar constantly showed 0%.
Once the image opened in preview mode, the progress bar showed 61% and remained that way until the full 2.5 minutes passed and the scanning was done.
My guess is that it was still stuck because the thumbnails above the focused image in preview mode were all blank except for the displayed image.
(*) Note that starting at line 688, a second DB lock starts running in parallel to the first one.
Comment 11 SkyDiver 2018-12-15 11:52:37 UTC
@Gilles: I see you've set the component to Database-Mysql.
Maybe you're analysis is that it is somehow related, but the DB related to this ticket is SQLite based (I don't use MySQL).
Comment 12 SkyDiver 2018-12-17 20:13:15 UTC
@Maik: Latest 6.0.0-beta3.AppImage scans for new images in 6 seconds consistently.
(*) I call this an improvement over the 7 seconds it takes 5.9.0.
Comment 13 Maik Qualmann 2018-12-17 20:23:13 UTC
Ok, now just enable the "Read from Sidecars" option under the metadata settings. If the scan takes a long time now, your file system takes a long time to find an answer to the fact that after a searched filename, this does not exist.

Maik
Comment 14 SkyDiver 2018-12-18 07:38:09 UTC
Indeed the issue reappears when enabling "Read from sidecar files".

Inferior as it is (although it's core is written in C), I just wrote a quick PHP script that searches for a non-existing file at the album's root directory.
A loop of 100,000 iterations finished in 0.116 seconds.

I tested the NAS drive both as a cifs mount and as a nfs mount.
The results are the same.

A problem in my file system is one way to look at things.
But I'm not sure why this is even a factor, when the same configuration (sidecar files scan enabled and all) finishes in 7 seconds in v5.9.0.
Comment 15 Maik Qualmann 2018-12-18 10:29:09 UTC
In digiKam-5.9.0, searching for new items does not look for sidecars, which is why digiKam can not detect any changes in sidecars. We're doing some more directory operations, LR sidecar available, or even in digiKam format, etc. But I already have a patch in progress that guarantees a constant speed for all file systems.

Maik
Comment 16 SkyDiver 2018-12-18 13:10:08 UTC
5.9.0 has the "Read from sidecar files" and it's enable by default.
If by not doing anything it doesn't slow Digikam on my machine, then fine.

Ping when you have a fix and I'll test it on my machine.
Comment 17 Maik Qualmann 2018-12-18 17:43:18 UTC
Git commit 874a534161200cc7fb74221662ed8c3e6f418220 by Maik Qualmann.
Committed on 18/12/2018 at 17:42.
Pushed by mqualmann into branch 'master'.

read from the loaded directory QStringList if the sidecar is present
FIXED-IN: 6.0.0

M  +2    -1    NEWS
M  +1    -1    core/libs/database/collection/collectionscanner.h
M  +51   -30   core/libs/database/collection/collectionscanner_scan.cpp

https://commits.kde.org/digikam/874a534161200cc7fb74221662ed8c3e6f418220