Summary: | Changing elements order takes near 5 minutes in a folder with 50 images | ||
---|---|---|---|
Product: | [Applications] digikam | Reporter: | Rafael Linux User <rafael.linux.user> |
Component: | Albums-ItemsSort | Assignee: | Digikam Developers <digikam-bugs-null> |
Status: | RESOLVED FIXED | ||
Severity: | normal | CC: | caulier.gilles, jens-bugs.kde.org, metzpinguin, rafael.linux.user |
Priority: | NOR | ||
Version: | 5.9.0 | ||
Target Milestone: | --- | ||
Platform: | openSUSE | ||
OS: | Linux | ||
Latest Commit: | | Version Fixed In: | 7.2.0 |
Sentry Crash Report: | |||
Attachments: | Wasted area in red square; Bash script to create 390k hardlinks; New script (2020) to create 4 folders with 10k distinct images each one | ||
Description
Rafael Linux User
2018-07-02 10:53:43 UTC
No, it's not normal that sorting takes so long for 50 images in the album. In principle, it also has nothing to do with the database. Can you download and test the AppImage from www.digikam.org? The AppImage does not install anything on the system; it only needs to be given execute rights.

Maik

I'll try it tomorrow. Should I expect any change (corruption or similar) when it connects to the current internal MySQL database?

Yes, we make changes to the database in digiKam-6.0.0, so it's better to make a backup of your internal MySQL folder.

Maik

I have a question about the backup. Since this is an "internal MySQL", how do I back up my database? I was thinking of using "mysqldump", but I realized that I don't know the user/password. I didn't find anything about "Internal MySQL" in the wiki. Any help, please?

The internal MySQL databases are all located in the same place, in a hidden directory that you set up from the DK config dialog. Just copy all the contents to a backup drive and that's all.

Gilles Caulier

Thank you Caulier, I found the folder. I'm sorry, but the user was too busy to let me try the AppImage today, because - for reasons I don't know - he lost the remaining files from the camera after he ran "Delete selected" right after importing the selected files with Digikam Import. He needed to recover them with PhotoRec. I'll try sorting tomorrow.

Regards

Bad news. Version 6 didn't change the bad sorting behaviour. No change at all. Same folder (or any other one) with fewer than 50 JPEG files of about 7 MiB each. Any type of reordering took more than 2m30s. :(

Wait a minute, I am currently rebuilding the 6.0.0 pre-release AppImage bundle. It will be ready in about one hour. Please try again with that version.

Gilles Caulier

I currently have no explanation for why it takes so long. For an album with 500 images the time is not measurable; the sorting is done immediately. A virtual album with 20,000 images in the view needs about 1-2 seconds to sort by date and about 5-6 seconds to sort by name. And there are significantly more powerful computers than my dual core...

Maik

Maybe it is related to this being a 370K-photo database. Anyway, the latest bundle doesn't change anything here. It took nearly 4 minutes to reorder 13 photos!!! This time I watched the system load through "htop" to see if the problem was in MySQL, but MySQL is not the culprit. It is Digikam that is eating 99% CPU (I forgot to say that Digikam is unusable while reordering). It's not a lack of CPU power, because this is an Intel Core i5-6500 with 8 GiB RAM. Let me know if I can help you with more info.

I forgot to say that launching Digikam takes about 5 minutes too, and it is Digikam that is eating the CPU (at 100%).

It does not matter how many images are in the database once the 50-image album is already displayed. Only the 50 images in the already loaded image model are sorted. The CPU usage of MySQL is more relevant. Is it possible that digiKam is still creating thumbnails in the background, or has not yet captured all the images and albums? This can take some time, even hours, until digiKam is ready and fully available. Is a running progress bar displayed in the status bar?

Maik

> loaded image model sorted. The CPU usage of MySQL is more relevant. Is it
> possible that digiKam still creates thumbnails in the background? Or not yet
> captured all the images and albums. This can take some time, even hours ...
> until digiKam is ready and fully available.

That is not the case. All the images (we are talking about 13 jpg images) had been processed by Digikam before I changed the ordering.

> Is a running progress bar displayed in the status bar?

No, no running progress bar was shown before reordering the elements.

Created attachment 113800 [details]
Wasted area in red square
Wasted area inside red square.
Great!! Issue solved. Thank you. I noticed (maybe you already know) that in this beta a wasted area appears while showing thumbnails (or not), as in the attached screenshot. Did you notice that?

Yes, I have seen it recently, with the AppImage only. I don't know why; perhaps it is a side effect of Qt 5.9.6 LTS, which is used to compile the whole AppImage contents. In any case, this space is the same size as the thumb-bar, even though the thumb-bar is not shown in icon view mode. A weird bug. Go to this area and press the right mouse button. An empty pop-up menu (with a checkbox inside) appears. Select it and the empty space should be replaced by the thumb-bar. Incredible, no?

Gilles Caulier

I imagined something like that. I hope this issue will disappear in the final release ;)

Git commit 967a93ee109a9e16f2f565d7738e370fbd37ecc1 by Maik Qualmann.
Committed on 06/07/2018 at 17:21.
Pushed by mqualmann into branch 'master'.

this could fix the problem with the thumb-bar

M +1 -1 core/app/views/stackedview.cpp

https://commits.kde.org/digikam/967a93ee109a9e16f2f565d7738e370fbd37ecc1

Maik,

No, passing the parent to the thumb-bar dock is not enough. Try the latest AppImage to see the effect.

Gilles

It is interesting that my compiled version with Qt-5.11 also has this empty CheckBox if you click with the right mouse button on the narrow area where you can move the dock-bar.

Maik

The CheckBox is normal and comes from QDockWidget::toggleViewAction(). Maybe we should set a title so that the QAction has a name.

Maik

So, you suspect that changes in Qt introduced this dysfunction? Remember that KF5 has been updated to the latest stable version in the bundle. Since all main views use the KMainWindow class as parent, perhaps something has changed in how layouts are managed?

Gilles

Git commit 0db966159ee98f7866b6a5ebbdabe2f3b059de75 by Maik Qualmann.
Committed on 07/07/2018 at 12:13.
Pushed by mqualmann into branch 'master'.

try to fix AppImage dock-bar problem

M +1 -1 core/app/views/stackedview.cpp

https://commits.kde.org/digikam/0db966159ee98f7866b6a5ebbdabe2f3b059de75

Please, don't forget the main title (that is what really matters XD). I shouldn't have commented on the beta bug in the same thread, sorry. My fault. ;)

Maik,

It's not fixed yet. The freshly built AppImage bundle still has the problem with the thumb-bar.

Gilles

Maik,

I renamed the ~/.config/digikamrc file to *.old, restarted digiKam from scratch, and the problem disappeared with the latest 6.0.0 AppImage bundle. So it looks again like a problem with GUI state storage in the rc file, or something like that.

Gilles

Git commit a3104006d9047ccc338b7eb3d40a975f7059c0a9 by Maik Qualmann.
Committed on 05/08/2018 at 16:34.
Pushed by mqualmann into branch 'master'.

check if group info is already in the cache
Related: bug 397110

M +20 -3 core/libs/database/item/imageinfo.cpp

https://commits.kde.org/digikam/a3104006d9047ccc338b7eb3d40a975f7059c0a9

Git commit 845a33a522e044949852589bf0c35cb577ae90da by Maik Qualmann.
Committed on 05/08/2018 at 16:48.
Pushed by mqualmann into branch 'master'.

check if tags info is already in the cache
Related: bug 397110

M +18 -3 core/libs/database/item/imageinfo.cpp

https://commits.kde.org/digikam/845a33a522e044949852589bf0c35cb577ae90da

FYI: The wasted space in the thumbnails view still exists in the current AppImage as of yesterday (2018-08-19).

... and it vanishes in thumbnail view when I resize the preview area (height) in the single photo view. Unfortunately, this setting is not kept, but it's a workaround and it might help in tracking this issue down.

And, as I said, the problem for which I opened this bug still exists. Does anyone have a database with more than 300,000 photos to confirm this issue?

No, but to reproduce it you could write a script that hardlinks a single photo 300,000 times, and then point a fresh digiKam installation (e.g. under a separate user account) to this folder structure. If you have such a script, I will be happy to test it on my hardware.

I wrote the script. I'm not sure that such a "plain" and homogeneous source of photos (all are identical, folder names are not complex, there are no nested folders, so it is very simple) will do the trick, but we can try, and I appreciate your help. First, you need a jpg photo named "photo.jpg" in the same folder as the script. Then make the script executable and run it. It will create 390,000 hardlinks. The hardlinks are grouped in sets of 65,000; each set links to one of 6 copies of your original photo, one copy per folder. Each of the 6 folders will end up with 65,000 hardlinks. Now I'll try to attach it. If that doesn't work, I'll copy the script here.

Created attachment 114531 [details]
Bash script to create 390k hardlinks
The script creates 6 folders, each containing one copy of photo.jpg and 65k hardlinks to that copy.
Requirements:
- A jpg file renamed to "photo.jpg"
- The script in the same folder
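The attached Bash script is not reproduced in this thread. As a rough illustration of what it is described as doing, here is a minimal Python sketch: it copies a source photo into each folder and then hardlinks that local copy many times. The function name, the `FolderA`, `FolderB`, ... folder names, and the `photo_NNNNN.jpg` link names are assumptions made for this sketch, not taken from the attachment. Some filesystems cap the number of hard links per file, so the default count may need to be lowered.

```python
import os
import shutil
from pathlib import Path


def build_hardlink_tree(source, root, folders=6, links_per_folder=65000):
    """Create `folders` directories under `root`; each gets one copy of
    `source` plus `links_per_folder` hard links pointing at that copy.

    Returns the total number of hard links created.
    Folder and link names are hypothetical (illustration only).
    """
    source = Path(source)
    root = Path(root)
    width = len(str(links_per_folder))  # zero-pad so names sort naturally
    created = 0
    for index in range(folders):
        folder = root / f"Folder{chr(ord('A') + index)}"
        folder.mkdir(parents=True, exist_ok=True)
        # One real copy per folder, so every link in a folder shares an inode.
        local_copy = folder / source.name
        shutil.copy2(source, local_copy)
        for n in range(1, links_per_folder + 1):
            link = folder / f"photo_{n:0{width}d}.jpg"
            # os.link raises FileExistsError if the link name already exists.
            os.link(local_copy, link)
            created += 1
    return created


if __name__ == "__main__":
    # Expects "photo.jpg" in the current directory, as the attachment does.
    total = build_hardlink_tree("photo.jpg", "hardlink_test")
    print(f"created {total} hard links")
```

Because every link in a folder points at the same inode, the tree costs almost no extra disk space, which is why hardlinks were suggested for building a large test collection quickly.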
Didn't work at first, but after changing {1..65000} to $(seq -w 1 65000), it worked fine.

First startup with the initial scanning of folders was done in ~10 seconds. Scanning of images, after startup, with the progress bar at the bottom of the main Digikam window, progressed at ~100 images per second, as I could see in the console where I started the Digikam AppImage.

Scanning then slowed down until, at "FolderB" (so after roughly 65,000 images), it took 0.5 s per image (!), so I won't finish scanning all 300,000 images today. Will keep you updated.

The problem with this test is that all images are the same. digiKam recognizes identical images in the database, which means that the number of duplicate images is constantly increasing and the database returns a larger search result for each new image. In addition, even with such a number of images, they should be spread over 500-1000 albums to be realistic.

Maik

The scan process in DK is divided into 2 parts, both of which are bottlenecks:

1/ Scan of albums in recursive mode, without the contents inside. This populates a few database tables.

2/ Scan of the album contents listed in the DB. This includes all files whose mimetype is registered as supported (photo and video). Other tables in the database are populated.

Depending on the collection size, both can take a while, but 2/ is always the most significant.

1/ can certainly be parallelized, but in fact everything is serialized in the database. Even if we use multiple cores here to list directories, the gain will be minor.

2/ can be parallelized. The most important components used to populate the database with item properties are Exiv2 for photos and ffmpeg for videos. Even if Exiv2 supports multicore, Exiv2 uses memory and is not optimal. If an improvement must be made, it's here. So it's an UPSTREAM problem, already reported to the Exiv2 team, but I have never seen any improvement in this area.

In all cases, if we parallelize metadata parsing, the database serialization will limit the gain, unless items are registered in the database in chunks rather than one by one. And even if we register items in chunks, I'm not sure the gain will be visible with SQLite. It will certainly be better with MySQL/MariaDB, especially with a remote server.

Voilà, a few explanations of the DK DB scanner. To summarize, as Maik said, one image linked many times is not a valid test. The only part that can be tested like this is 1/.

Gilles Caulier

The 99% CPU is probably the auto-completion of the album filter at the bottom of the tree-view. This problem has already been reported in Bugzilla and some code was fixed by Maik, but internally algorithms from Qt5 are used, and their complexity is far from perfect. Perhaps this problem is fixed in a more recent Qt5 implementation. The AppImage bundle uses Qt 5.9.6 LTS, not the latest Qt 5.11.1.

Maik, if you have more details on this part...

Gilles Caulier

(In reply to Jens from comment #35)
> Didn't work at first, but after changing {1..65000} to $(seq -w 1 65000), it
> worked fine.
>
> First startup with initial scanning of folders was done in ~10 seconds.
> Scanning of images - after startup - with progress bar at the bottom of the
> main Digikam window progressed with ~100 images per second, as I could see
> in the console where I started the Digikam AppImage.
>
> Scanning slowed down until at "FolderB", so after roughly 65000 images, it
> took 0.5s per image (!) - so I won't finish scanning all 300'000 images
> today. Will keep you updated.

For me (in bash on openSUSE) that loop parameter worked. What's your OS?

Anyway, I ran (or rather, I'm still running) the test, but after more than 28 hours it hasn't finished. There is no delay if I close and relaunch Digikam, but so far it has (I guess) completely scanned only "FolderA" and "FolderB", half of "FolderC" (29029 files) and 1580 files of "FolderD". "FolderF" is not shown. It doesn't appear (but it exists in Dolphin). The worst part is that Digikam doesn't show any album content (no thumbnails), even though it shows the image counter at the end of each album. Meanwhile, Digikam is keeping one CPU core at 100%.

What about your experience, Jens?

I can't see the images in the folders (but they do exist in the filesystem), and the scanning process is still at 17%, at about 1 image per half second.

If the duplicate scanning takes so much time because the images are all the same, will it work better if I create a huge number of small random JPEG files that all differ in content? Or does the duplicate scanner only take metadata into account?

Git commit f8d8dc6ebdbbb0f75561a6a4dc6a0a95d728ca42 by Maik Qualmann.
Committed on 24/08/2018 at 20:00.
Pushed by mqualmann into branch 'master'.

store the number of childs in the album
this commit reduces the start of digiKam here with 11000 albums
from 1:30 minutes to 1:15 minutes
Related: bug 368468

M +11 -0 core/libs/album/album.cpp
M +8 -1 core/libs/album/album.h
M +2 -2 core/libs/models/abstractalbummodel.cpp
M +0 -14 core/libs/models/abstractalbummodelpriv.h

https://commits.kde.org/digikam/f8d8dc6ebdbbb0f75561a6a4dc6a0a95d728ca42

Git commit dcb01e39023564ea538eb06f2cf635a451f713e3 by Maik Qualmann.
Committed on 24/08/2018 at 21:44.
Pushed by mqualmann into branch 'master'.

implement a child album cache hash
this commit reduces the start of digiKam here with 11000 albums
from 1:15 minutes to 0:35 minutes
Related: bug 368468

M +16 -5 core/libs/album/album.cpp
M +6 -1 core/libs/album/album.h
M +2 -2 core/libs/models/abstractalbummodel.cpp
M +0 -23 core/libs/models/abstractalbummodelpriv.h

https://commits.kde.org/digikam/dcb01e39023564ea538eb06f2cf635a451f713e3

(In reply to Jens from comment #40)
> I can't see the images in the folders (but they do exist in the filesystem)
> and the scanning process is still at 17% with about 1 image per half second.

Just the same here. The images are there, but thumbnails are not shown in any folder.

> If the duplicate scanning takes so much time because the images are all the
> same, will it work better if I create a huge amount of small random JPEG
> files that are all different in content? Or does the duplicate scanner only
> take metadata into account?

I can't answer that, but Gilles or Maik can. ;)

Git commit 6d16a4f96ac245ed11450326c128cf63ca5a1332 by Maik Qualmann.
Committed on 24/08/2018 at 23:17.
Pushed by mqualmann into branch 'master'.

implement a child album to row cache hash
this commit reduces the start of digiKam here with 11000 albums
from 0:35 minutes to 0:20 minutes
the sorting of the albums and entries is now almost without delay.
Related: bug 368468

M +14 -4 core/libs/album/album.cpp
M +12 -5 core/libs/album/album.h
M +1 -16 core/libs/models/abstractalbummodelpriv.h

https://commits.kde.org/digikam/6d16a4f96ac245ed11450326c128cf63ca5a1332

Git commit f27ab9c1051bd0a0bba6e79bc77899c74a7e6bf8 by Maik Qualmann.
Committed on 07/10/2018 at 13:47.
Pushed by mqualmann into branch 'master'.

add a global cache for grouped images
When we load the images into the Icon view, we ask each time
whether there are grouped images; with 30000 images in the view
that is also 30000 SQL queries. With this patch, the time to load
a view with many images is 3x faster with MySQL and 2x faster
with SQLite.
Related: bug 391840, bug 398921, bug 397901

M +24 -0 core/libs/database/coredb/coredb.cpp
M +5 -0 core/libs/database/coredb/coredb.h
M +1 -10 core/libs/database/item/imageinfo.cpp
M +19 -2 core/libs/database/item/imageinfocache.cpp
M +7 -0 core/libs/database/item/imageinfocache.h
M +0 -3 core/libs/database/item/imageinfodata.h

https://commits.kde.org/digikam/f27ab9c1051bd0a0bba6e79bc77899c74a7e6bf8

digiKam 7.0.0 stable release is now published:

https://www.digikam.org/news/2020-07-19-7.0.0_release_announcement/

We need fresh feedback on this report using this version.

Best Regards

Gilles Caulier

Well, my scenario has changed since I reported this bug:

- MySQL (internal) DB
- 65k pictures in a folder

Curiously, when I sort by size it is nearly instant, but when I sort by name it takes about 30 seconds. I understand this could be because one field is numerical and the other is a string, but if it is indexed, it should be instant too.

The sorting does not take place in the database, but in the Qt item model. Strings are slower to compare. This is normal, especially if the differences only appear at the end, e.g. with long path names. Keep in mind that all 65,000 strings are compared several times until the correct order is established. For me, it takes around 6 seconds to rearrange 60,000 items, on an already much older computer. The QCollator class that carries out the sorting offers the possibility of creating a key beforehand; then the sorting is as fast as with the date. I have already implemented it as a test. We only gain time if the user changes a view with many items more than once. The first time you open a large view, there is no advantage.

In your bug description, you write that a folder of 50 images takes 4 minutes?

Maik

(In reply to Maik Qualmann from comment #48)
> The sorting does not take place in the database, but in the Qt Item model. A
> string is slower when comparing. This is normal, especially if the
> differences only appear at the end, e.g. with long path names. Keep in mind
> that all 65,000 strings are compared several times until the correct order
> is established. For me, it takes around 6 seconds to rearrange for 60,000
> items, for an already much older computer. The QCollator class that carries
> out the sorting offers the possibility to create a key beforehand, then the
> sorting is as fast as with the date. I have already implemented it as a
> test. We only gain time if the user would change the view with many items
> more often. The first time you open a large view, there are no advantages.

Well, the story is longer, but I'll summarize. The "real" user is someone who has +50K photos in thousands of folders and subfolders, and despite having a high-end computer, he was suffering exactly this issue (even with the database stored on an SSD). Since I no longer have access to his computer, I created a (new) script that creates a folder with 3 subfolders, each holding 65k jpg images (with distinct content and resolution). I'll share the script once I have added more subfolders to make the scenario more realistic. In this scenario, it takes nearly 30 seconds to sort an album's elements by name (on a 4-year-old PC). As I said, it should not take much more time than sorting by size.

> In your bug description, do you write that a folder of 50 images takes 4
> minutes?

Yes, on the real PC, when I reported the bug, that was the actual elapsed time.

Created attachment 130679 [details]
New script (2020) to create 4 folders with 10k distinct images each one
This script tries to create an album similar to real-life folders and images. It creates 1 folder with 4 subfolders (level 1), each of which has 4 subfolders (level 2). Inside each level-2 folder, it creates 2500 jpg images with distinct resolution and content. I made it because it is very useful for checking issues like this one, related to sorting by filename.
Git commit d63e171bec0910f036bb3c2b261ab3333af110ee by Maik Qualmann.
Committed on 10/10/2020 at 20:22.
Pushed by mqualmann into branch 'master'.

add experimental QCollatorSortKey cache for fast string sorting
Add quick cache comparison to item and album sorting.
Changing a view with about 30,000 items when sorting by name
or path previously took about 22 seconds. Now about 2-3 seconds.
We will observe how the memory consumption develops.
Related: bug 368468

M +4 -8 core/libs/database/models/itemsortsettings.h
M +3 -6 core/libs/models/albumfiltermodel.cpp
M +96 -12 core/libs/threadimageio/fileio/loadingcache.cpp
M +8 -0 core/libs/threadimageio/fileio/loadingcache.h

https://invent.kde.org/graphics/digikam/commit/d63e171bec0910f036bb3c2b261ab3333af110ee

I am closing the bug now. With the new item sorter cache there are no problems sorting many items by file name or path.

Maik

Which digiKam version will be patched? Thank you
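The sort-key idea behind the closing commit can be illustrated outside Qt. The sketch below is not digiKam's code: it uses Python's standard `locale.strxfrm` as a stand-in for a precomputed collation key (the role `QCollatorSortKey` plays in Qt), and the `SortKeyCache` class and both function names are hypothetical. It contrasts sorting that runs the collation comparison on every pair examined with sorting that computes one key per string and keeps it for later re-sorts.

```python
import functools
import locale


def sort_with_compare(names):
    """Sort by running the locale-aware comparison on every pair examined."""
    return sorted(names, key=functools.cmp_to_key(locale.strcoll))


class SortKeyCache:
    """Hypothetical cache: one precomputed collation key per string.

    Keys stay in memory between sorts, so re-sorting the same items is
    cheap, at the price of holding one extra key per cached string.
    """

    def __init__(self):
        self._keys = {}

    def key(self, name):
        cached = self._keys.get(name)
        if cached is None:
            # strxfrm transforms the string once so that plain comparison
            # of the results matches locale-aware comparison.
            cached = locale.strxfrm(name)
            self._keys[name] = cached
        return cached

    def __len__(self):
        return len(self._keys)


def sort_with_cache(names, cache):
    """Sort using cached keys; only unseen strings are transformed."""
    return sorted(names, key=cache.key)


if __name__ == "__main__":
    # Long common prefixes, as with file paths, make comparisons expensive.
    names = [f"/photos/2018/holiday/IMG_{i:05d}.jpg" for i in range(999, -1, -1)]
    cache = SortKeyCache()
    assert sort_with_cache(names, cache) == sort_with_compare(names)
    print(f"{len(cache)} keys cached")
```

This matches the trade-off noted in the thread: the first sort of a large view still has to build every key, so the gain shows up when the same items are sorted again, and the cache costs memory in exchange.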