Bug 478554

Summary: Very slow maintenance process with mariadb
Product: [Applications] digikam
Reporter: maderios <leoutation>
Component: Database-Mysql
Assignee: Digikam Developers <digikam-bugs-null>
Status: REPORTED ---
Severity: normal
CC: metzpinguin
Priority: NOR
Version: 8.3.0
Target Milestone: ---
Platform: Arch Linux
OS: Linux
Latest Commit:
Version Fixed In:
Sentry Crash Report:
Attachments: rescan debug log

Description maderios 2023-12-15 12:57:24 UTC
The maintenance process is very slow when using an external MariaDB server.
Scanning or rescanning image files takes a long time.
I checked the MariaDB database with the mariadb-check tool; it is OK.
This issue doesn't happen with SQLite.
There was no problem with digiKam versions < 8.1 (maybe 7.10).
Conditions:
Arch Linux system
mariadb 11.2.2
digikam git version
exiv2 0.28.1
Qt5 KF5 and/or Qt6 KF6 
With valgrind:
valgrind --tool=callgrind --dump-instr=yes --simulate-cache=yes --collect-jumps=yes digikam
With valgrind, the digiKam launch stalls at "initiating main view" and the CPU runs at 100%.
See callgrind.out here:
https://www.dropbox.com/scl/fi/bdh11hg2t159sm04mhl95/callgrind.out.16513?rlkey=2ylgx8qbic1dxcey6pf3rujs7&dl=0
Comment 1 Maik Qualmann 2023-12-15 16:07:20 UTC
I haven't looked at the valgrind file yet, I'll do that late this evening. But just a thought, ExifTool has been on board since digiKam-8.0.0. So if you have a lot of files (videos or similar) or Exiv2 has an exception with your images, ExifTool will automatically read the metadata. This of course slows down the process. I definitely can't reproduce a drop in speed myself.

Maik
Comment 2 maderios 2023-12-15 16:20:22 UTC
I have only 2 video files (both .mkv) in my test directory/album; the other files are .jpeg and .png (225 files).
Comment 3 Maik Qualmann 2023-12-15 16:23:50 UTC
From other bug reports I know that you also have XCF files.

Maik
Comment 4 maderios 2023-12-15 20:06:30 UTC
(In reply to Maik Qualmann from comment #3)
> From other bug reports I know that you also have XCF files.
> 
> Maik


Yes, there were two XCF files in the directory/album I used to test with valgrind today. They don't slow things down, but I deleted them. In another account, I have no XCF files but big PNG files. I see that scanning these big PNG files takes from about 300 ms to 1100 ms.
Comment 5 maderios 2023-12-15 20:10:25 UTC
(In reply to Maik Qualmann from comment #1)
> I haven't looked at the valgrind file yet, I'll do that late this evening.
> But just a thought, ExifTool has been on board since digiKam-8.0.0. So if
> you have a lot of files (videos or similar) or Exiv2 has an exception with
> your images, ExifTool will automatically read the metadata. This of course
> slows down the process. I definitely can't reproduce a drop in speed myself.
> 
> Maik

Did you try to reproduce it with big files, like PNG? They slow down the process. It takes more than one second to scan some of them.
Comment 6 Maik Qualmann 2023-12-15 22:06:29 UTC
Yes, they may be scanned with ExifTool, because PNG is definitely a candidate where Exiv2 fails. A debug log from the terminal can quickly clarify things.

Maik
Comment 7 maderios 2023-12-16 13:56:39 UTC
Created attachment 164223 [details]
rescan debug log

See the maintenance rescan debug log (about 400 images).
I use a fresh account for these tests.
Comment 8 Maik Qualmann 2023-12-16 15:19:32 UTC
In fact, ExifTool has to intervene a few times to read metadata, with times between 60-500-1100ms per image.
But what is noticeable is that images are processed twice. You also selected the “All Tags” option. Depending on the tag, this can actually lead to images being processed twice. I will change this situation.

Maik
Comment 9 maderios 2023-12-16 15:42:56 UTC
(In reply to Maik Qualmann from comment #8)
> In fact, ExifTool has to intervene a few times to read metadata, with times
> between 60-500-1100ms per image.
Not really "a few times": if we say it takes only 200 ms per image, 20,000 images will take 4000 s to scan, more than one hour.
I think it's much more, because I have many big files (.png, .xcf, .nef) with many sidecars/XMP tags (write to XMP sidecar only).

> But what is noticeable is that images are processed twice. You also selected
> the “All Tags” option. Depending on the tag, this can actually lead to
> images being processed twice. I will change this situation.
> 
> Maik
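As a rough check of the estimate in this comment, the arithmetic can be sketched as below. The 200 ms figure is the assumption stated above, the 300 ms and 1100 ms figures are the per-image times reported in comment 4, and the function name is hypothetical. It only multiplies a fixed per-image cost by the image count, so it ignores files that Exiv2 reads quickly.

```python
def estimated_scan_seconds(image_count: int, ms_per_image: float) -> float:
    """Return the total scan time in seconds for a fixed per-image cost."""
    return image_count * ms_per_image / 1000.0


# Per-image costs in milliseconds taken from this thread.
for ms in (200, 300, 1100):
    seconds = estimated_scan_seconds(20_000, ms)
    print(f"{ms} ms/image -> {seconds:.0f} s ({seconds / 60:.1f} min)")
```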
Comment 10 maderios 2023-12-16 16:14:57 UTC
I forgot to say I found other errors with another digiKam user account. I have many messages like this:
digikam.metaengine: Exiv2 ( 2 ) :  IPTC dataset Iptc.0x001c.0x0002 has invalid size 16640; skipped.
digikam.metaengine: Exiv2 ( 2 ) :  Failed to decode IPTC metadata.
I don't use IPTC
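To see how often such messages occur in a saved debug log, the lines can be counted with a small script. This is only a sketch: the pattern is derived from the two messages quoted above, the unrelated line in the sample is a made-up placeholder, and real logs may contain other formats.

```python
import re
from collections import Counter

# Pattern derived from the two messages quoted in this comment (an assumption
# about the log format, not a documented digiKam interface).
EXIV2_LINE = re.compile(r"^digikam\.metaengine: Exiv2 \( (\d+) \) :\s+(.*)$")


def count_exiv2_messages(lines):
    """Count Exiv2 messages in an iterable of log lines, grouped by text."""
    counts = Counter()
    for line in lines:
        match = EXIV2_LINE.match(line.strip())
        if match:
            counts[match.group(2)] += 1
    return counts


sample = [
    "digikam.metaengine: Exiv2 ( 2 ) :  IPTC dataset Iptc.0x001c.0x0002 has invalid size 16640; skipped.",
    "digikam.metaengine: Exiv2 ( 2 ) :  Failed to decode IPTC metadata.",
    "some unrelated output line",  # hypothetical placeholder, not a real log line
]
counts = count_exiv2_messages(sample)
for message, number in counts.items():
    print(number, message)
```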
Comment 11 Maik Qualmann 2023-12-16 16:32:58 UTC
Git commit bc4a263fefbbf58cffd5ad55d50387e4acd9a4e6 by Maik Qualmann.
Committed on 16/12/2023 at 17:32.
Pushed by mqualmann into branch 'master'.

prevent double processed images when tags are also selected

M  +14   -2    core/utilities/maintenance/autotagsassignment.cpp
M  +14   -2    core/utilities/maintenance/fingerprintsgenerator.cpp
M  +14   -2    core/utilities/maintenance/imagequalitysorter.cpp
M  +15   -2    core/utilities/maintenance/thumbsgenerator.cpp

https://invent.kde.org/graphics/digikam/-/commit/bc4a263fefbbf58cffd5ad55d50387e4acd9a4e6
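The general idea behind this change, avoiding a second pass over an image that is reachable both through an album and through a tag, can be illustrated with a generic sketch. This is not the digiKam code (the commit touches the C++ files listed above); the function name and the sample paths are hypothetical.

```python
from typing import Iterable, List


def unique_in_order(paths: Iterable[str]) -> List[str]:
    """Drop repeated paths while keeping the first occurrence of each."""
    seen = set()
    result = []
    for path in paths:
        if path not in seen:
            seen.add(path)
            result.append(path)
    return result


# Hypothetical work lists: one image appears in both selections.
album_items = ["/photos/a.png", "/photos/b.jpg"]
tag_items = ["/photos/b.jpg", "/photos/c.nef"]

to_process = unique_in_order(album_items + tag_items)
print(to_process)
```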
Comment 12 Maik Qualmann 2023-12-16 18:01:33 UTC
(In reply to maderios from comment #10)
> I forgot to say I found other errors with another digiKam user account. I
> have many messages like this:
> digikam.metaengine: Exiv2 ( 2 ) :  IPTC dataset Iptc.0x001c.0x0002 has
> invalid size 16640; skipped.
> digikam.metaengine: Exiv2 ( 2 ) :  Failed to decode IPTC metadata.
> I don't use IPTC

Exiv2 throws an exception because it detected unresolvable problems/errors in the images. In these cases we then read the metadata using ExifTool, which is a little slower. But it's still better than having incorrect metadata or no metadata at all.

Maik
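The fallback described here (try the fast reader first, switch to the slower one only when it raises) can be sketched generically. Both reader functions below are hypothetical stand-ins and do not call Exiv2 or ExifTool; the rule that the stand-in fails on .png files is only there to trigger the fallback in the example.

```python
from typing import Callable, Dict, Tuple

Metadata = Dict[str, str]
Reader = Callable[[str], Metadata]


def read_with_fallback(path: str, primary: Reader, fallback: Reader) -> Tuple[Metadata, str]:
    """Try the primary reader; use the fallback only if the primary raises."""
    try:
        return primary(path), "primary"
    except Exception:
        return fallback(path), "fallback"


def fake_fast_reader(path: str) -> Metadata:
    # Hypothetical stand-in for the fast reader; fails on .png to simulate an error.
    if path.endswith(".png"):
        raise ValueError("Failed to decode IPTC metadata")
    return {"source": "fast reader"}


def fake_slow_reader(path: str) -> Metadata:
    # Hypothetical stand-in for the slower but more tolerant reader.
    return {"source": "slow reader"}


meta, used = read_with_fallback("big.png", fake_fast_reader, fake_slow_reader)
print(used, meta)
```

Only the files on which the first reader fails pay the extra cost, which matches the observation in this thread that the slowdown depends on how many such files an album contains.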
Comment 13 Maik Qualmann 2023-12-16 18:09:36 UTC
I compared the digiKam-8.0.0 AppImage (older AppImages unfortunately no longer work here) with my current Qt6 version, and there is no difference when creating thumbnails and fingerprints.

Maik