Bug 374608 - KphotoAlbum is not multithreaded
Status: RESOLVED FIXED
Alias: None
Product: kphotoalbum
Classification: Applications
Component: general
Version: unspecified
Platform: Other Linux
Importance: NOR normal
Target Milestone: ---
Assignee: KPhotoAlbum Bugs
Reported: 2017-01-05 20:30 UTC by Max V
Modified: 2019-01-03 20:31 UTC

Attachments
Screenshot of cpu not used (129.41 KB, image/png)
2017-01-05 20:30 UTC, Max V

Description Max V 2017-01-05 20:30:24 UTC
Created attachment 103219
Screenshot of cpu not used

Hello,
I started using KPhotoAlbum with my large photo collection; after 2 hours it had processed only 20% of the collection on my fast 8-core computer.
I launched the system monitor (see attached image) and noticed that nearly all cores were idle instead of scanning files.
Why don't you use the "parallel" command for scanning? See https://www.gnu.org/software/parallel/ ; it's a simple open source command that speeds up execution.
I use it with all my software that works on many files, and it speeds up processing enormously.
Comment 1 Johannes Zarl-Zierl 2017-01-06 00:17:57 UTC
The screenshot you posted shows that not a single core of your system is near 100%. This means that your real problem is filesystem performance.

Using more cores for the scanning would make things worse in your specific case.

How many image files are we talking about? What kind of storage do you use?

Reading the meta-data from all your files means that kphotoalbum (and any other comparable program) needs to read every single image file (often in a non-sequential way). If you run "iotop" as root, you should see that your disk IO is busy.

For reference: On my (not quite as powerful) computer, deleting my database and rebuilding it from scratch causes a single core to run at 5% of its capacity, while the disk subsystem has an IO load of ~ 60% (6-8Mb/s).
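
To illustrate the diagnosis above, here is a minimal Python sketch (not part of KPhotoAlbum) that reads every file under a directory and compares CPU time with elapsed wall-clock time. The function names `scan_profile` and `report` are hypothetical and exist only for this example. A low CPU share suggests that the scan mostly waits on storage, although a warm page cache can make a second run look far more CPU-bound than the first.

```python
import os
import time


def scan_profile(root):
    """Read every file under root; return (files, bytes_read, wall_s, cpu_s)."""
    files = 0
    bytes_read = 0
    wall_start = time.perf_counter()   # elapsed real time
    cpu_start = time.process_time()    # user + system CPU time of this process
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                with open(path, "rb") as fh:
                    # Read in 1 MiB chunks so large files do not fill memory.
                    while True:
                        chunk = fh.read(1024 * 1024)
                        if not chunk:
                            break
                        bytes_read += len(chunk)
            except OSError:
                continue  # unreadable file: skip it and do not count it
            files += 1
    wall = time.perf_counter() - wall_start
    cpu = time.process_time() - cpu_start
    return files, bytes_read, wall, cpu


def report(root):
    """Return a one-line summary of how much of the scan time was CPU time."""
    files, nbytes, wall, cpu = scan_profile(root)
    share = (cpu / wall * 100.0) if wall > 0 else 0.0
    return (f"{files} files, {nbytes / 1e6:.1f} MB in {wall:.2f} s wall, "
            f"{cpu:.2f} s CPU ({share:.0f}% of wall time)")
```

Calling `print(report("/path/to/photos"))` on a freshly mounted collection gives a rough first impression; the path is a placeholder, and tools such as "iotop" remain the more direct way to observe disk load.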
Comment 2 Tobias Leupold 2017-01-06 11:40:22 UTC
Apart from Johannes' comment, just about multithreading in general:

Multithreading is not as simple as one might think ... I tried to implement multithreading for face recognition back then ™ and came down to the fact that every single link in the chain involved in this process has to be designed to be capable of multithreading.

I never really thought about making KPA multithreaded as a whole, but I think one would have to rewrite big parts of the whole implementation to do that.

After all, scanning a huge collection completely is not a thing one does every day. Most of the time, just a bunch of photos is added, isn't it? And that's a process that, even on my old low-performance machine, takes just a few seconds, although my collection is located on my NAS over NFS.

No hard feelings! Just saying ... ;-)
Comment 3 Max V 2017-01-09 09:28:08 UTC
Hi, 
I suggest trying the Linux "parallel" command in front of the command that scans the images to see what happens. Like here: https://www.biostars.org/p/63816/
In my scripts that modify images, this usually sped up the process 8 times or more.
The KPhotoAlbum scan took more than 6 hours for just 17.7 GB (59'000 pictures). New users will be scared off or simply not use it at all.
Does it also scan hidden folders and images (those starting with a dot ".")?
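
The idea of processing many files concurrently, as proposed with "parallel" above, can be sketched in Python with a thread pool. This is only an illustration under stated assumptions: it is not how KPhotoAlbum scans images, the names `list_files`, `digest`, and `scan` are hypothetical, and a SHA-256 checksum stands in for whatever per-file work a real scanner does.

```python
import hashlib
import os
from concurrent.futures import ThreadPoolExecutor


def list_files(root):
    """Return all file paths under root in a stable (sorted) order."""
    paths = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            paths.append(os.path.join(dirpath, name))
    return sorted(paths)


def digest(path):
    """Stand-in for per-file work: read the file and return (path, sha256)."""
    h = hashlib.sha256()
    with open(path, "rb") as fh:
        # Read in 1 MiB chunks until read() returns an empty bytes object.
        for chunk in iter(lambda: fh.read(1024 * 1024), b""):
            h.update(chunk)
    return path, h.hexdigest()


def scan(root, workers=1):
    """Process every file under root, sequentially or with a thread pool."""
    paths = list_files(root)
    if workers <= 1:
        return dict(digest(p) for p in paths)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # pool.map yields results in input order, whichever thread finishes first.
        return dict(pool.map(digest, paths))
```

Timing `scan(root, workers=1)` against `scan(root, workers=8)` with `time.perf_counter()` shows whether concurrency helps on a given machine. If the storage is already saturated, extra workers may bring little gain or even slow things down on a spinning disk; if the per-file work is CPU-heavy, the gain can be large.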
Comment 4 Johannes Zarl-Zierl 2019-01-03 20:31:55 UTC
I still think that your problem is IO of your system. Either way, we have greatly improved the performance for reading new images in version 5.4 (and yes, we do use multithreading). I think this can be closed...