Bug 464769 - Rebuild Finger-Prints Process Slow
Summary: Rebuild Finger-Prints Process Slow
Status: RESOLVED FIXED
Alias: None
Product: digikam
Classification: Applications
Component: Maintenance-Similarities
Version: 8.0.0
Platform: Microsoft Windows
Importance: NOR normal
Target Milestone: ---
Assignee: Digikam Developers
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2023-01-24 21:21 UTC by worthington_j
Modified: 2023-01-31 18:10 UTC
CC List: 2 users

See Also:
Latest Commit:
Version Fixed In: 8.0.0


Attachments
Debug File of Finger-print Scan (110.89 KB, text/plain)
2023-01-25 22:43 UTC, worthington_j

Description worthington_j 2023-01-24 21:21:34 UTC
SUMMARY
***
With digiKam 8.0, rebuilding fingerprints takes much longer than in previous versions. On version 7.x the process took around 10 minutes. I'm now seeing the process take around 45 minutes for the same number of pictures. I wish there was a way to add a switch or something to run this process automatically at startup.
***


STEPS TO REPRODUCE
1.  Go to Maintenance
2.  Check "Work on all processor cores when possible", check "Rebuild Finger-prints", and check "Scan for changed or non-cataloged items (faster)"
3. Click Ok

OBSERVED RESULT
The process takes 45 minutes to complete

EXPECTED RESULT
10-15 minutes to complete

SOFTWARE/OS VERSIONS
Windows: 10
KDE Plasma Version: 8.0.0-beta1
KDE Frameworks Version: 5.99
Qt Version: 5.15.7

ADDITIONAL INFORMATION
During the scan, digiKam uses very little of the available CPU and memory. I have an Intel i7 processor.
Comment 1 Maik Qualmann 2023-01-24 21:39:25 UTC
A test here shows nothing conspicuous; it takes as long as always. Do you use SQLite or MySQL? Have you possibly switched from SQLite to MySQL in the meantime? If only the date of already existing fingerprints is compared, there is not a large processor load. What format are most of your images in?

Maik
Comment 2 worthington_j 2023-01-24 22:17:06 UTC
I am using SQLite. I will give MySQL a try to see if I notice a difference. I'm using .jpg images.
Comment 3 worthington_j 2023-01-24 23:32:18 UTC
Changing to MySQL did not help. Does image size make a difference? I have a little over 12,000 new/changed items each time I perform the scan. I can't figure out what has caused the time to triple other than upgrading to 8.0.
Comment 4 worthington_j 2023-01-25 21:18:47 UTC
I timed a finger-print scan on the same PC against the same folder using digiKam 8.0 and 7.9. The scan took almost 30 minutes longer in 8.0 than in 7.9.

I love how accurate and easy this software is to use. I use it differently than most because I take hours of video and convert it into jpg files at certain intervals. I then use digiKam to search for an image that is similar to the image from the video. This allows me to find things in a video quickly without the need to watch hours of footage. If I had a switch that I could use at startup, or some way of automatically kicking off the finger-print scan at launch, this would save me a lot of time because I could script it to launch in the middle of the night so it is ready for me first thing in the morning.
Comment 5 Maik Qualmann 2023-01-25 21:36:00 UTC
We use ExifTool more intensively in digiKam-8.0.0 when Exiv2 finds no metadata or has problems with it. ExifTool is of course much slower.
Maybe your files have no metadata or corrupted metadata?

Please download and start DebugView from Microsoft. Activate internal debugging in the digiKam settings under Miscellaneous -> System (this also makes digiKam a bit slower under Windows). Restart digiKam and run a fingerprint update. Post a larger snippet from DebugView covering the moment when new fingerprints are created.

Maik
Comment 6 worthington_j 2023-01-25 22:43:18 UTC
Created attachment 155645
Debug File of Finger-print Scan

Attached is a snippet of the debug during the finger-print scan.  My images do not have any metadata.
Comment 7 worthington_j 2023-01-25 22:57:57 UTC
One thing I noticed is the longer the scan runs the slower it gets.  The first 20% takes less than 30 seconds to complete.  Then at 50% it's about 10 minutes in.  So by the time it's at 100% we are over an hour in.  The speed did not change on MySQL vs. SQLite.
Comment 8 Maik Qualmann 2023-01-26 07:08:49 UTC
Hmm, ExifTool is not the cause. If the screenshots are numbered consecutively, we can do about 130 fingerprints in 18 seconds. Such a comparison with digiKam-7.9.0 would be good.

The part of the progress bar that moves fast is just the date comparison of the existing fingerprints in your DB; it slows down as new images are found.

Maik
Comment 9 worthington_j 2023-01-26 15:31:29 UTC
I'm not sure if you will be able to identify a bottleneck with this information, but I captured the entire log file for a scan that took 81 minutes to complete. I scanned 12,348 jpg images, all numbered sequentially starting with 00001.jpg. From the log file I calculated the finger-print scan time for each 25% of the total files. The results are below.

00001.jpg - 03087.jpg = 1 minute
03088.jpg - 06174.jpg = 8 minutes
06175.jpg - 09261.jpg = 23 minutes
09262.jpg - 12348.jpg = 49 minutes
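
To make the slowdown in the figures above easier to see, here is a minimal Python sketch that turns the four block timings into seconds per image. The numbers are copied from the list above; the function names are only illustrative and are not part of digiKam.

```python
# Timings copied from the list above: minutes spent on each block of files.
QUARTILE_MINUTES = [1, 8, 23, 49]
FILES_PER_QUARTILE = 3087  # 12,348 files split into four equal blocks


def seconds_per_image(minutes, files=FILES_PER_QUARTILE):
    """Average fingerprint time per image for one block, in seconds."""
    return minutes * 60 / files


def slowdown_factors(minutes_list):
    """How many times slower each block is compared with the first one."""
    first = minutes_list[0]
    return [m / first for m in minutes_list]


if __name__ == "__main__":
    for number, minutes in enumerate(QUARTILE_MINUTES, start=1):
        print(f"block {number}: {seconds_per_image(minutes):.3f} s/image")
    print("total minutes:", sum(QUARTILE_MINUTES))
    print("slowdown vs. first block:", slowdown_factors(QUARTILE_MINUTES))
```

The block times add up to the 81 minutes of the full scan, and the last block is 49 times slower per image than the first, which points to a cost that grows with the number of files already processed rather than a fixed per-image cost.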

I did find a way to perform a finger-print scan on all 12,348 files in less than 6 minutes using the process below.

1. Launch digiKam with an empty collection
2. Drop files 00001.jpg - 03087.jpg into the collection
3. Go to Maintenance and check "Scan for new items", "Rebuild Finger-prints" & "Scan for changed or non-cataloged items (faster)"
   The scan takes a little over 1 minute to complete
4. Close digiKam and relaunch
5. Drop files 03088.jpg - 06174.jpg into the collection
6. Go to Maintenance and check "Scan for new items", "Rebuild Finger-prints" & "Scan for changed or non-cataloged items (faster)"
   The scan takes a little over 1 minute to complete
7. Repeat these steps for the next two groups of files, and you'll have all the finger-print scanning completed in under 6 minutes.
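
The file ranges used in the steps above can be worked out with a short Python sketch like the one below. It only computes which sequentially numbered files belong to each batch; it does not talk to digiKam, and the helper names are hypothetical.

```python
def split_into_batches(total_files, batch_count):
    """Return inclusive (first, last) file numbers for each batch."""
    base, extra = divmod(total_files, batch_count)
    batches = []
    start = 1
    for index in range(batch_count):
        # Spread any remainder over the first batches.
        size = base + (1 if index < extra else 0)
        batches.append((start, start + size - 1))
        start += size
    return batches


def batch_file_names(first, last, width=5, suffix=".jpg"):
    """Zero-padded file names such as 00001.jpg for one batch."""
    return [f"{n:0{width}d}{suffix}" for n in range(first, last + 1)]


if __name__ == "__main__":
    for first, last in split_into_batches(12348, 4):
        names = batch_file_names(first, last)
        print(f"{names[0]} - {names[-1]} ({len(names)} files)")
```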

I'm not seeing digiKam consume a lot of RAM on the PC but it must be bogging down with cache or something that makes it slow down the longer the process runs.  Closing and relaunching the application seems to resolve that issue.
Comment 10 caulier.gilles 2023-01-26 16:19:43 UTC
A memory leak in the fingerprints algorithm was already reported, but never reproduced on development platforms. In fact, it's memory that is allocated and only deallocated at the end of the digiKam session. It's not lost permanently.

The reason is unknown. It's possibly due to the compiler version used or to Windows core library versions which do not match exactly (or are buggy, thanks M$).

We run the source code through plenty of static analyzers, some more capable than others, and we never found problems. The same goes for the runtime memory check, but that one introduces latency in the program, so if the memory leaks are due to race conditions, we will never be able to investigate the problem and fix it...

Gilles Caulier
Comment 11 Maik Qualmann 2023-01-26 21:25:59 UTC
I deleted the fingerprint table in the database and started a build in my collection. I can reproduce the problem: after about 3000 images the speed drops. I will fix it.

Maik
Comment 12 Maik Qualmann 2023-01-27 21:08:54 UTC
Git commit b543ab1498dcb8fdd4a4178fc0a63fa6570a664d by Maik Qualmann.
Committed on 27/01/2023 at 21:07.
Pushed by mqualmann into branch 'master'.

fix slow down in the LoadingCache with many small images
With fingerprints and a maximum of 400MB cache,
up to 30000 images can be cached, which led to
unnecessary loops in the image file watch.
FIXED-IN: 8.0.0

M  +2    -1    NEWS
M  +27   -48   core/libs/threadimageio/fileio/loadingcache.cpp
M  +3    -8    core/libs/threadimageio/fileio/loadingcache.h

https://invent.kde.org/graphics/digikam/commit/b543ab1498dcb8fdd4a4178fc0a63fa6570a664d
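
To illustrate why a loop over every cached image hurts once tens of thousands of small images fit in the cache, here is a toy Python model. It is not the code from loadingcache.cpp, and the class and method names are invented for this sketch. It assumes the simplest possible form of the problem: one cache walks all of its entries on every insert to rebuild a watch list, the other updates the watch list incrementally.

```python
class RescanningCache:
    """Toy cache that re-walks every cached path on each insert."""

    def __init__(self):
        self.entries = {}
        self.watched = set()
        self.loop_steps = 0  # counts iterations spent maintaining the watch list

    def insert(self, path, image):
        self.entries[path] = image
        self.watched = set()
        for cached_path in self.entries:  # grows with the cache size
            self.watched.add(cached_path)
            self.loop_steps += 1


class IncrementalCache:
    """Toy cache that only touches the path being inserted."""

    def __init__(self):
        self.entries = {}
        self.watched = set()
        self.loop_steps = 0

    def insert(self, path, image):
        self.entries[path] = image
        self.watched.add(path)
        self.loop_steps += 1


def fill(cache, count):
    """Insert `count` dummy images and return the bookkeeping steps used."""
    for n in range(1, count + 1):
        cache.insert(f"{n:05d}.jpg", b"")
    return cache.loop_steps


if __name__ == "__main__":
    print("rescanning: ", fill(RescanningCache(), 1000))
    print("incremental:", fill(IncrementalCache(), 1000))
```

In this model the rescanning variant needs n(n+1)/2 steps for n images, so the bookkeeping grows quadratically while the incremental variant stays linear. That matches the shape of the slowdown reported in this bug, although the real cause and fix are the ones in the commit above.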
Comment 13 worthington_j 2023-01-27 21:53:49 UTC
Is it still possible to compile the latest changes for Windows? It appears the KDE on Windows page is down or no longer exists (https://userbase.kde.org/Digikam/Windows). I would like to test the new update if possible.
Comment 14 Maik Qualmann 2023-01-27 21:58:36 UTC
To be honest, I don't even know that website. The current test versions of digiKam-8.0.0-Beta1 can officially be found here:

https://files.kde.org/digikam/unstable/

I'm sure Gilles will compile a new version soon; check the date in the filename.

Maik
Comment 15 worthington_j 2023-01-27 22:01:54 UTC
Great, thanks for looking at this and fixing the issue!
Comment 16 worthington_j 2023-01-31 18:10:03 UTC
I installed the unstable release that was made available today and ran a finger-print scan.  A scan on over 12,000 files took 1 minute and 6 seconds to complete.  Thank you very much for fixing this issue!!!!!