Bug 229888 - Speed up "Scan for new items at startup"
Summary: Speed up "Scan for new items at startup"
Status: RESOLVED WORKSFORME
Alias: None
Product: digikam
Classification: Applications
Component: Database-Scan
Version: 1.1.0
Platform: Debian testing Linux
Importance: NOR wishlist
Target Milestone: ---
Assignee: Digikam Developers
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2010-03-07 22:29 UTC by Vlado Plaga
Modified: 2017-07-25 10:50 UTC
CC List: 2 users

See Also:
Latest Commit:
Version Fixed In: 2.5.0


Attachments
Script to make digiKam startup faster (1.62 KB, application/x-sh)
2010-03-12 18:31 UTC, Vlado Plaga

Description Vlado Plaga 2010-03-07 22:29:23 UTC
Version:           1.1.0 (using KDE 4.3.4)
OS:                Linux
Installed from:    Debian testing/unstable Packages

Since I sometimes add pictures to the digiKam directory tree while digiKam is not running, I need the option "Scan for new items at startup (makes startup slower.)" activated. Indeed, this makes startup a lot slower than it normally is. I don't know what digiKam is doing internally, but I'm sure that just checking for files newer than the last digiKam start could be done much faster.

Rough estimates:
- digiKam "Scanning images in individual albums" takes about 20 seconds on my computer (with about 13,000 pictures).
- "find . -anewer ~/.kde/share/digikam/thumbnails-digikam.db" for the digiKam root directory takes about 4 seconds.

This "-anewer" even finds files that have an older modification date, if they were moved from some other place. Anyway the directory modification date also changes in such cases, possibly giving digiKam a hint where to look closer for new pictures.
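The comparison that "find -anewer" performs can be sketched in Python as follows. This is only an illustration of the idea, not what digiKam does internally; the function name `accessed_after` and its parameters are made up for this example. Like "-anewer", it compares each file's access time with the reference file's modification time, which is why a file with an old modification date can still be reported. Unlike "find", this sketch reports files only, not directories.

```python
import os

def accessed_after(root, reference):
    """Yield files under root whose access time is newer than the
    modification time of the reference file (similar to `find -anewer`)."""
    ref_mtime = os.stat(reference).st_mtime
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                # lstat: do not follow symlinks, like find's default mode
                if os.lstat(path).st_atime > ref_mtime:
                    yield path
            except OSError:
                # The file vanished or cannot be examined; skip it.
                continue
```

Calling `list(accessed_after(album_root, database_file))` with the album root and the database file as reference would give roughly the same file list as the "find" command above; both path names here are placeholders.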
Comment 1 Marcel Wiesweg 2010-03-08 19:59:33 UTC
For my 30000 pictures, the scan time is only a few secs. Only good profiling results with cachegrind or OProfile can help us here.
There were also reports about problems with sqlite on ext4, but I am also running ext4, so it cannot be a general problem.
Comment 2 Andi Clemens 2010-03-08 20:03:49 UTC
Marcel,

I suppose you have barriers turned off, because otherwise you would have noticed a huge difference ;-) But the same is true for XFS and btrfs... all filesystems with barriers suffer from the performance loss.
Comment 3 Vlado Plaga 2010-03-08 22:33:18 UTC
But couldn't digiKam just use the system's "find" command to check whether there are any new pictures? In the most primitive implementation digiKam would not even need to use the find results, but would start its own search for new pictures only if "find" finds something (which, at least on my computer, would not be the case most of the time anyway).
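A minimal sketch of this "cheap check first" idea, written in Python instead of calling "find" so that it is self-contained. The names `has_new_files` and `scan_if_needed` and the `full_scan` callback are hypothetical; they stand in for whatever digiKam's real scan would be and are not part of digiKam.

```python
import os

def has_new_files(root, reference):
    """Return True as soon as one file under root has an access time
    newer than the reference file's modification time."""
    ref_mtime = os.stat(reference).st_mtime
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            try:
                if os.lstat(os.path.join(dirpath, name)).st_atime > ref_mtime:
                    return True  # stop at the first hit
            except OSError:
                continue  # file vanished or cannot be examined
    return False

def scan_if_needed(root, reference, full_scan):
    """Run the expensive full_scan(root) only when the cheap check
    finds something. Returns True if the full scan was run."""
    if has_new_files(root, reference):
        full_scan(root)
        return True
    return False
```

The cheap check stops at the first match because, as described above, the results themselves are not needed, only the answer to "is there anything new?".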

I don't know what "barriers" in file systems are, but I've got my pictures on a hfsplus file system, so I can also access them from Mac OS X. My computer is a 2003 model, iMac G4 1GHz (with 1.5 GByte RAM and a relatively new 2.5" internal hard drive).

Here are again some digiKam startup times on my system:

30 seconds (1st start, not scanning for new items)
45 seconds (2nd start, scanning for new items)
20 seconds (3rd start, not scanning for new items)
40 seconds (4th start, scanning for new items)

So it is quite obvious that scanning for new items consumes about 20 seconds, which can be about 50% of the overall startup time, if digiKam had been loaded before. "find", as I wrote before, initially took just about 4 seconds scanning through my 13,000 pictures, and takes less than 1 second now that digiKam scanned through the pictures (files) already...
Comment 4 Marcel Wiesweg 2010-03-09 19:25:56 UTC
Any approach involves reading (stat'ing) the directories of the collections and nothing more, just like calling "find .".
digiKam gets the modification dates of the files in a directory from its database and compares them with the dates on disk.
If "find ." is fast for you, it is more likely a database speed issue on your system and filesystem. But we cannot be sure without profiling.
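To illustrate the kind of comparison described here, the following Python sketch checks one directory against stored modification times. The table `images(directory, name, mtime_ns)` is an assumption invented for this example and is not digiKam's actual schema; the function name `compare_directory` is likewise hypothetical.

```python
import os
import sqlite3

def compare_directory(conn, directory):
    """Compare the files in one directory with the modification times
    stored in a hypothetical table images(directory, name, mtime_ns).
    Returns three sorted lists of names: (new, modified, missing)."""
    stored = dict(conn.execute(
        "SELECT name, mtime_ns FROM images WHERE directory = ?",
        (directory,)))
    on_disk = {}
    with os.scandir(directory) as entries:
        for entry in entries:
            if entry.is_file():
                on_disk[entry.name] = entry.stat().st_mtime_ns
    new = sorted(n for n in on_disk if n not in stored)
    modified = sorted(n for n in on_disk
                      if n in stored and on_disk[n] != stored[n])
    missing = sorted(n for n in stored if n not in on_disk)
    return new, modified, missing
```

In this sketch the work per directory is one database query plus one stat call per file, so a slow database or a slow filesystem would both show up in the total scan time.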
Comment 5 Vlado Plaga 2010-03-12 18:31:16 UTC
Created attachment 41573
Script to make digiKam startup faster
Comment 6 Vlado Plaga 2010-03-12 18:33:59 UTC
I did some additional tests, and even copied my pictures and database to ext3 partitions, but the scan still takes about 20 seconds. I guess this is just because the G4 really is a rather slow CPU (if its "AltiVec" instructions are not used). I also checked the startup speed on my AMD64 notebook computer for comparison, and just as you wrote, Marcel, the scan takes only a few seconds there, about two, actually (that's with digiKam beta-5 in Kubuntu).

DigiKam startup under MacOS (with digiKam 1.0-rc) was even slower than on Linux:
99 seconds (including 29 seconds scanning for new images) on the first start, and 60 seconds (including 22 seconds scanning for new images) when I shut it down and started it again.

But is digiKam checking the file dates against different dates from its database? If yes, why doesn't it just take the last startup date as a reference? It seems like digiKam always writes its config file, so that file date could simply be used as a reference.
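A rough Python sketch of using a single reference date, combined with the directory modification dates mentioned in the description. The function name `directories_changed_since` is made up, and which file serves as the reference (for example the config file) is an assumption of this example. One limitation: a directory's modification time changes when entries are added, removed or renamed, but not when an existing file is edited in place.

```python
import os

def directories_changed_since(root, reference):
    """Return the directories under root (including root itself) whose
    modification time is newer than that of the reference file."""
    ref_mtime = os.stat(reference).st_mtime
    changed = []
    for dirpath, _dirnames, _filenames in os.walk(root):
        try:
            if os.stat(dirpath).st_mtime > ref_mtime:
                changed.append(dirpath)
        except OSError:
            continue  # directory vanished while walking
    return changed
```

Only the directories returned here would need a closer look for new pictures; all others could be skipped under the assumption stated above.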

For my purposes I wrote a shell script that makes digiKam only check for new images if "find" finds new images. I'll attach that. Maybe someone else with an old computer can compare the find command's performance to the normal digiKam scan duration. I'm sorry the attachment was committed to this bug before the comment.
Comment 7 caulier.gilles 2011-12-15 09:42:01 UTC
Vlado,

Is this script still valid with the digiKam 2.x database schema?

Gilles Caulier
Comment 8 Vlado Plaga 2011-12-15 22:58:39 UTC
Sorry, Gilles, but I'm not using that old PowerPC computer any more (at least not with digiKam). My new computer is an AMD Fusion E-350, which is still quite slow (as is my 5000 RPM hard drive), but repeated starts take just about 15 seconds, no matter whether it has to scan for images or not. At most the scan takes about 2 additional seconds. Currently I'm also using the MySQL database, in case that makes a difference.