Bug 273077 - Generate Fingerprints gets slower and slower
Summary: Generate Fingerprints gets slower and slower
Status: RESOLVED FIXED
Alias: None
Product: digikam
Classification: Applications
Component: Database-Similarity (show other bugs)
Version: 2.0.0
Platform: Compiled Sources Linux
: NOR normal
Target Milestone: ---
Assignee: Digikam Developers
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2011-05-11 22:21 UTC by Andi Clemens
Modified: 2017-07-26 17:18 UTC (History)
2 users (show)

See Also:
Latest Commit:
Version Fixed In: 2.0.0


Attachments
massif #1 (52.45 KB, application/octet-stream)
2011-06-22 00:37 UTC, Andi Clemens
Details
massif #2 (109.50 KB, application/octet-stream)
2011-06-22 00:38 UTC, Andi Clemens
Details
callgrind (266.64 KB, application/x-gzip)
2011-06-23 19:38 UTC, Andi Clemens
Details
callgrind snapshot (182.31 KB, image/png)
2011-07-25 18:25 UTC, Jaime Torres
Details

Note You need to log in before you can comment on or make changes to this bug.
Description Andi Clemens 2011-05-11 22:21:54 UTC
Version:           2.0.0 (using KDE 4.6.2) 
OS:                Linux

When I generate fingerprints for 50.000 images, it is really fast in the beginning, but it gets slower and slower during calculation.
One image takes around 0.5 seconds to calculate, but after a while it takes 2 seconds and more.
If I cancel the operation, restart digiKam and scan for missing fingerprints, the same files that took over 2 seconds to calculate are handled in 0.5 seconds again.

So it must have something to do with the amount of images that have been already scanned / calculated. Maybe we are using a cache somewhere that gets bigger and bigger and therefore slower to scan?
After a digiKam restart it is fast again, so it could be related to a hashmap or vector that is growing fast.

Reproducible: Didn't try




OS: Linux (i686) release 2.6.38-ARCH
Compiler: gcc
Comment 1 Marcel Wiesweg 2011-05-12 22:32:19 UTC
Do you see substantial growth in memory usage? Then massif can give us a hint.

My first suspect was pixmap generation and the progress dialog, but I think this problem was fixed already. To rule out a problem here, disable the relevant lines 183 and 184.

There is only a QStringList in the fingerprints generator, I doubt that is the problem.

Second step is the HaarIface. To rule out any problems here, you could try to use a fresh HaarIface object any time.

If all does not help - though it takes longer - we'd need to run digikam under callgrind, and switch on instrumentation as soon the presumed bug's calculation time is significant, like > 50%.
Comment 2 Andi Clemens 2011-05-22 18:19:24 UTC
I tried to profile this, but somehow valgrind doesn't seem to get information about running kioslaves.
Any tip how to profile an kioslave?
Comment 3 Marcel Wiesweg 2011-05-23 20:20:31 UTC
There is a hack to run an ioslave under gdb, but it's so complicated I never tried.
Then, which ioslave is eating CPU? There only so much tasks done in an ioslave.
Comment 4 Francesco Riosa 2011-05-24 22:19:40 UTC
I was here today ;-)
http://techbase.kde.org/Development/Tutorials/Debugging/Debugging_IOSlaves

could work
Comment 5 Andi Clemens 2011-06-22 00:37:20 UTC
(In reply to comment #1)
> Do you see substantial growth in memory usage? Then massif can give us a hint.
> 
> My first suspect was pixmap generation and the progress dialog, but I think
> this problem was fixed already. To rule out a problem here, disable the
> relevant lines 183 and 184.
> 
> There is only a QStringList in the fingerprints generator, I doubt that is the
> problem.
> 
> Second step is the HaarIface. To rule out any problems here, you could try to
> use a fresh HaarIface object any time.
> 
> If all does not help - though it takes longer - we'd need to run digikam under
> callgrind, and switch on instrumentation as soon the presumed bug's calculation
> time is significant, like > 50%.

Memory consumption raises continuously, I disabled the fingerprint generation and the pixmap generation and only left the loadingdescriptionthread acive. Still the same result.
So I guess something is wrong in there?
I ran it through massif for 2 hours now, but actually I don't quite understand the results massif gives me.
I'll attach them to this thread, maybe someone can extract vital information.

Andi
Comment 6 Andi Clemens 2011-06-22 00:37:59 UTC
Created attachment 61227 [details]
massif #1
Comment 7 Andi Clemens 2011-06-22 00:38:21 UTC
Created attachment 61228 [details]
massif #2
Comment 8 Marcel Wiesweg 2011-06-23 13:19:09 UTC
You will want to use "massif-visualizer" to analyse these logs.
The first log has a peak memory usage of 181,2 MB, the second ends at 6,8 MB. Both is not much, with today's images of 15 MPx easily taking more than 100 MB of memory.
If there is a memory leak, I would expect a steadily increasing memory usage, but looking at the first log here, there is a plateau which usually rules out any leak.
You wrote above memory usage was increasing - which number did the system monitor give you? I think the KDE system monitor gives a pretty good number for used memory (which is not trivial)
Comment 9 Marcel Wiesweg 2011-06-23 13:24:02 UTC
There is another possibility: May be there is no memory leaks, but some structure is increasing in size (still small memory-wise) but is hugging CPU.
To find that, "callgrind" will help. It is key to start callgrind without measurements (digikam runs at almost normal speed), provoke the bug, and when the CPU usage is high, enable instrumentation:
valgrind --tool=callgrind --instr-atstart=no digikam
(make CPU usage go up)
callgrind_control -i on
(collect one or two minutes of CPU)
callgrind_control -k
(kills digikam, dumps output)
Output is to be analysed with kcachegrind
Comment 10 Andi Clemens 2011-06-23 16:08:54 UTC
The memory increased by 1MB each second, I watched it with the KDE system monitor.
I will try callgrind again, maybe these outputs will be more helpful.
Comment 11 Andi Clemens 2011-06-23 19:36:16 UTC
The only thing I can see which takes a long time is the following function:
ClassicLoadingCacheFileWatch::slotUpdateDirWatch()

and here especially line 473.

I'll attach the callgrind files...
Comment 12 Andi Clemens 2011-06-23 19:38:35 UTC
Created attachment 61285 [details]
callgrind
Comment 13 caulier.gilles 2011-06-23 19:49:38 UTC
Andi,

I also see a lot of KDirWatch report on the console which slow down digiKAm on my laptop.

Run kdebugdialog, trun on right kdelibs debug space, and run image editor. Export image as a new file, especially a huge photo in PNG (large file). Now look how KDirwatch fire changes in HDD when image is written on disk. For me, it's abnormal. In this case, KDirwatch must be paused when file is written.

It's possible that DB file bring up KDirWhatch when fingerprint are registered.

What's debug trace give exactly about KDirWatch ?

Gilles Caulier
Comment 14 caulier.gilles 2011-06-23 22:34:51 UTC
Andi,

I just tested Fingerprint registration in DB under Windows. It work like a charm. No crash, no memory leak. Windows memory control center from Win7 said that memory allocation is stable to 67% (3Gb). 2250 RAW/JPG are managed. digiKam is compiled with MSVC 2008.

The only thing to improve here is CPU load. Only 58% of CPU is used. It's a double core computer. This want mean that only one CPU is used. I'm sure that we can use OpenMP here to speed up computation (MSVC support it very well, as GCC)

Gilles Caulier
Comment 15 Andi Clemens 2011-06-24 08:07:54 UTC
I just tested it again, it gets clearly slower after 2 minutes and memory usage went from 180MB to 287MB.
There is something growing in memory, and it seems to stay in this way when I stop the fingerprints generator and re-activate it again after 5 minutes. So there is something else (not related to fingerprints) that stays in memory. 
Only a digiKam restart fixes the memory and speed issue.

I still think it has something to do with this method:
ClassicLoadingCacheFileWatch::slotUpdateDirWatch()

There is a list that seems to grow all the time...?
Comment 16 Marcel Wiesweg 2011-07-14 16:54:31 UTC
For Haar we load very small images if size 128 and put them in the cache. With the cache size chosen on my machine, this means more than 4000 entries fit in the cache - for normal-size images, it is only a few. Apparently KDirWatch, its backends and/or our code for going through the lists get slow then.

One question that occurred to me is why dont we use the thumbnail loader, which provides up to 256, for loading images for fingerprint generation?
Comment 17 caulier.gilles 2011-07-14 18:22:47 UTC
Marcel,

no idea why thumbnail loader is not user here. I agree that it must do it.

Gilles Caulier
Comment 18 Jaime Torres 2011-07-25 18:25:09 UTC
Created attachment 62186 [details]
callgrind snapshot

I guess the problem is in the data structures.
I've run digikam under callgrind doing an update of the firgerprints, and you can see where the time is spent in the screenshot: Searching in a probably growing list.
Comment 19 Andi Clemens 2011-07-25 21:30:47 UTC
Git commit 9e0baf9f6216942c1664337a8d4b9f584f1b4989 by Andi Clemens.
Committed on 25/07/2011 at 23:29.
Pushed by aclemens into branch 'master'.

Possible fix: Use a QSet instead of a QStringList. This container provides a
much faster lookup, at least on my system it works quite fine.

CCBUG:273077

M  +3    -2    libs/threadimageio/loadingcache.h
M  +11   -11   libs/threadimageio/loadingcache.cpp

http://commits.kde.org/digikam/9e0baf9f6216942c1664337a8d4b9f584f1b4989
Comment 20 Andi Clemens 2011-07-25 21:42:12 UTC
It is definitely faster now, I just rebuilt all my fingerprints and I had a constant speed. With the stringlist version it was barely usable after 3 minutes.