I have a large collection (13000) of files that I'm trying to rename. However, it takes about 5 seconds per image to rename each file. This is unreasonable, since I am running this job on an SSD with a reasonably fast processor. iotop shows digikam at 0% IO, and digikam is using 2/4 of my cores almost completely. When I run this rename under digikam, I get a whole bunch of the following for each file renamed: statx(AT_FDCWD, "/home/user/tmp/recoveredpics/2010-02-04T17:05:44.JPG", AT_STATX_SYNC_AS_STAT, STATX_ALL, {stx_mask=STATX_ALL, stx_attributes=0, stx_mode=S_IFREG|0777, stx_size=2186973, ...}) = 0 stat("/etc/localtime", {st_mode=S_IFREG|0644, st_size=3545, ...}) = 0 stat("/etc/localtime", {st_mode=S_IFREG|0644, st_size=3545, ...}) = 0 stat("/etc/localtime", {st_mode=S_IFREG|0644, st_size=3545, ...}) = 0 The rename "preview" is also slow, but it's acceptable, and I don't think there's room for (or point to) improvement there.
So 5 seconds is really very long. DigiKam not only rename the files but also has to delete the thumbnail and create a new one, etc. Here there are certainly still optimization possibilities, for example just only refresh the path in the thumbnail DB. Also, the current serial rename process to parallization is an option. Nevertheless, I have never observed 5 seconds, are the databases also on an SSD? Maik
> are the databases also on an SSD? Yes Actually, it appears that sometimes several files will be renamed very quickly. But then the next file will take a while. Another thing: if I move just a few of my 13000 photos and videos into another directory and rename them there, the process finishes instantly. This makes me think that the problem is with one directory containing many files. I've attached some exerts from the digikam log, as well as a flamegraph from running digikam under perf (this contains stuff from startup too).
Created attachment 112079 [details] exerpts from log when renaming images
Created attachment 112080 [details] overall flame graph from startup to while renaming images
Git commit fee46b9602a1dfe4f9ed003529dce84dc3646901 by Maik Qualmann. Committed on 18/04/2018 at 06:08. Pushed by mqualmann into branch 'master'. not the thumbnail loading thread determines the clock when renaming M +8 -19 core/utilities/advancedrename/advancedrenameprocessdialog.cpp https://commits.kde.org/digikam/fee46b9602a1dfe4f9ed003529dce84dc3646901
A question, the folder in which the files are renamed contain 13000 files? If so, the cause is the QFileSystemWatcher, which is supposed to only trigger external events. Perform a permanent rescan of the huge collection here. Maik
Git commit 8dad5fb6ccbee42d75afa79b2c1e7b6a78e5cdd1 by Maik Qualmann. Committed on 18/04/2018 at 10:32. Pushed by mqualmann into branch 'master'. use a very long timer for external events for the collection scan M +1 -1 core/libs/album/albumwatch.cpp M +24 -0 core/libs/database/utils/scancontroller.cpp M +10 -0 core/libs/database/utils/scancontroller.h https://commits.kde.org/digikam/8dad5fb6ccbee42d75afa79b2c1e7b6a78e5cdd1
It would be nice if Gilles updated the AppImage for digiKam-6.0.0 you would do a test. Maik
Hi Maik, I will update the bundles this evening... Gilles
new AppImages are uploaded... https://files.kde.org/digikam/ Gilles
I thought I'd replied to this, but I guess not: >A question, the folder in which the files are renamed contain 13000 files? That is correct. I ran into some kind of issue with digikam's image recognition database earlier, so I haven't been able to test this patch. Latest master (04bc525) doesn't have the database migration issue, but renaming files is still slow. I tried changing 'd->externalTimer->setInterval(2000);' (see 8dad5fb) to '200000' to fit the spirit of "very long timer" a little better (since setInterval is in milliseconds), but that doesn't help.
A test here with external local MySQL DB on an SSD and 1500 images on a HDD gives the following result with a simple counter renaming: krename: 3:00 minutes digiKam: 3:30 minutes You notice that the more files in the directory, the renaming becomes slower and slower. Considering that digiKam deletes the old thumbnail in addition to renaming, renames the file in the database, and creates a preview thumbnail, I think it can not get any faster. It almost seems to me that the more files in a directory, the slower the access to the file system. Maik
1500 photos are not enough to reproduce this bug. I did an experiment where I used krename and digikam to rename some files with the EXIF modify date and a counter. Here are the results with both the images and the database on an SSD: # | krename | digikam | 1001 | 6s | 34s | 5001 | 64s | 21% done at 6040s | I don't believe this is a "digikam is 5x slower than krename" issue, it seems more like krename takes n time, and digikam takes n^2 time.
Git commit d765c17ce96706c5f6a1116ffe3d898100aa47b0 by Maik Qualmann. Committed on 22/05/2018 at 20:58. Pushed by mqualmann into branch 'master'. use static function QFile::exist() is a little bit faster M +2 -3 core/libs/iojobs/iojob.cpp https://commits.kde.org/digikam/d765c17ce96706c5f6a1116ffe3d898100aa47b0
The problem is detected, it is also related to Bug 391115. The icon view takes more and more time, the more elements are displayed. The QListView and QSortFilterProxyModel gets slower as you add or remove items. Moving (renaming) in the database initiates this process. At the moment I do not see that there is a mistake in the implementation or there is a possibility for us to accelerate QListView. I have a patch here that dramatically speeds up the renaming process. I rename the ImageInfo, also in the ImageInfo cache, in the database and in the thumbnail database. Therefore, the item in the QListView does not need to be removed and added with a new name. The patch has to be tested a bit... Maik
Git commit 24ec6a3bdcf7f4b5ed50ecd38bd9bc7682649ded by Maik Qualmann. Committed on 28/05/2018 at 20:26. Pushed by mqualmann into branch 'master'. speed up renaming the image and rename the thumbnail path Related: bug 382474 M +7 -0 core/libs/database/coredb/coredb.cpp M +6 -0 core/libs/database/coredb/coredb.h M +14 -0 core/libs/database/item/imageinfo.cpp M +6 -0 core/libs/database/item/imageinfo.h M +7 -0 core/libs/database/thumbsdb/thumbsdb.cpp M +2 -0 core/libs/database/thumbsdb/thumbsdb.h M +22 -14 core/libs/database/utils/dio.cpp https://commits.kde.org/digikam/24ec6a3bdcf7f4b5ed50ecd38bd9bc7682649ded
I get huge improvements with that new code, thank you! In the time it took me to write this comment, Digikam is at 25% renaming my 13000 image collection. It could still be faster (I don't think I'm ever happy with the performance of anything), but I think it's a good time to call this bug fixed. Just for fun, I've attached another flame graph. About 50% of the time is now being spent loading the thumbnail, another 10% in __tzset, and another 10% in software aes (I really need to get a cpu with aes-ni :)).
Created attachment 112936 [details] new flamegraph after the patch was applied
Comment on attachment 112936 [details] new flamegraph after the patch was applied Which profiling tool did you use to capture cpu idle with digiKam ? Which modifications did you introduce in compilation workflow of digiKam to achieve this profiling ? Gilles Caulier
I leave the bug still open, that rename at override does not update the Icon view correctly. I want to fix that first. Disabling the thumbnail in the dialog box would not make it any better. At the moment, renaming is a single core job, but whether it would really do anything to split it into multiple jobs is questionable. Here digiKam is faster than krename. Maik
Git commit 35e125ac8445e9e77db75e85aa0d4e53ab7a0e36 by Maik Qualmann. Committed on 29/05/2018 at 16:45. Pushed by mqualmann into branch 'master'. fix rename by overwrite Related: bug 382474 FIXED-IN: 6.0.0 M +3 -1 NEWS M +1 -0 core/libs/album/albummanager.cpp M +3 -3 core/libs/database/coredb/coredb.cpp M +1 -1 core/libs/database/coredb/coredb.h M +1 -1 core/libs/database/item/imageinfo.cpp M +1 -0 core/libs/database/models/imagelistmodel.cpp M +11 -1 core/libs/database/utils/dio.cpp M +1 -0 core/libs/models/imagealbummodel.cpp M +1 -0 core/utilities/queuemanager/views/queuelist.cpp https://commits.kde.org/digikam/35e125ac8445e9e77db75e85aa0d4e53ab7a0e36
(In reply to caulier.gilles from comment #19) > Comment on attachment 112936 [details] > new flamegraph after the patch was applied > > Which profiling tool did you use to capture cpu idle with digiKam ? I used perf, but I wouldn't trust the cpu idle info to be accurate. For details on how to use perf, there's Brendan Gregg's excellent guide at http://www.brendangregg.com/perf.html#OneLiners. The tool I used to generate the flame graph can be found at http://www.brendangregg.com/perf.html#FlameGraphs. > Which modifications did you introduce in compilation workflow of digiKam to > achieve this profiling ? You must compile with debug information to get accurate stack traces, but that's it. > Gilles Caulier
Flaviu, Thanks for the feedback. The links given are very instructive/useful. Best Gilles Caulier