Summary: | THUMBDB : rebuild all thumbnails does not get rid of all thumbnails first | ||
---|---|---|---|
Product: | [Applications] digikam | Reporter: | Gerard Dirkse <gerard.dirkse> |
Component: | Database-Thumbs | Assignee: | Digikam Developers <digikam-bugs-null> |
Status: | RESOLVED FIXED | ||
Severity: | normal | CC: | caulier.gilles, marcel.wiesweg, mario.frank, swatilodha27 |
Priority: | NOR | ||
Version: | 3.2.0 | ||
Target Milestone: | --- | ||
Platform: | openSUSE | ||
OS: | Linux | ||
Latest Commit: | https://commits.kde.org/digikam/a1f67531b3269941df5ff531baa3e487cf58f1fc | Version Fixed In: | 5.5.0 |
Sentry Crash Report: |
Description
Gerard Dirkse
2013-08-19 12:51:48 UTC
Probably your DB file has been corrupted. I have never reproduced this problem here. I am currently working on the Maintenance tool to support multicore CPUs, and I run a lot of tests on huge image collections...

Gilles Caulier

1) So why do I have 4 times as many thumbnails as I have images? (This may have been caused by my renaming actions.)
2) Why does rebuild ALL thumbnails not first delete the COMPLETE content of the tables in thumbs? (That would solve it.)

> 1) So why do I have 4 times as many thumbnails as I have images? (This may have been caused by my renaming actions.)

Or it's the versioning feature. If you edit and save as a new version, a new file is created and shown as the current version of the item in icon-view. All other previous versions are hidden from icon-view, unless you turn off the right option in the Setup dialog. For each version file, one thumbnail is created...

> 2) Why does rebuild ALL thumbnails not first delete the COMPLETE content of the tables in thumbs? (That would solve it.)

It must. Item deletion in the DB is in fact performed item by item.

I use versioning very rarely; 9 out of 10 times I choose to overwrite the existing version, so that does not explain why there are more than 4 times as many thumbnails as there are images.

Browsing the thumbs DB (using phpMyAdmin; the DB is in MySQL) before and after the rebuild action, in the tables Filepaths and Customidentifiers (fields path and identifier respectively), I see references to files that don't exist any more, either as a result of moving the NFS mount point or from files that have been renamed using the digiKam image rename option.

I would have expected rebuild ALL thumbnails to start with a 'DELETE FROM ..' on each and every table in the thumbs DB and then repopulate them as a result of the rebuild. That would leave a clean thumbs DB after a rebuild ALL.

You say 'It must. item deletion in DB is performed item by item in fact.' That is then the bug, because in all the cases I mention, i.e. where an image with that name no longer exists, the entry will not get deleted.

Marcel,

Is there a way to clean up the thumbs DB before rebuilding all thumbnails? Currently, I use this method:

https://projects.kde.org/projects/extragear/graphics/digikam/repository/revisions/master/entry/utilities/maintenance/thumbstask.cpp#L80

Gilles Caulier

Marcel,

Did you see my previous comment?

Gilles Caulier

In the meantime I developed some PHP programs to go through the databases of images and thumbnails, and eliminated almost 75% of the number and size of my thumbnails. The greatest gain was achieved by using the Uniquehashes table: I eliminated all entries, and their associated thumbnails, for which there was no uniquehash/filesize combination in the images table. I haven't figured out yet where all these obsolete entries came from.

Gilles, would you simply like to clean out all thumbnails? In SQL, that's simply "DELETE FROM Thumbnails" to delete all thumbnail data. The trigger should clean the rest of the tables.
If we want a sort of garbage collector, it would need to be something along the lines of what Gerard has developed, checking that a uniqueHash/filesize identifier from the albumDB still exists in the main database.

Marcel,

As you can see in the code from thumbnailtask.cpp, line 84:

d->catcher->thread()->deleteThumbnail(d->path);

we only delete the valid previous file registered in the DB. It does not clear all the other dummy entries.

A garbage collector can be a powerful tool to avoid rebuilding all items. But for each album to process, we could clean all items, including all garbage entries... this may be simpler to implement. There is currently no method implemented this way. Right?

Note: the real question here is why garbage entries are present in the DB. When an item is removed or disappears, is the thumbnail in the DB not removed automatically?

Gilles

Thumbnails are primarily loaded via hash/file size. So in principle, whenever file contents or file size change, there can be a leftover entry in the database. These can only be found via "go through thumbnail db -> check if it exists in main db".
Today, when digiKam changes file content, the thumbnail reuse/replacement is often managed, but probably not from all places. It cannot be managed when an external tool makes the change. So this is a classical case for a garbage collector.

Is this report still valid with digiKam 5.1.0? Please test and provide the necessary updates.

Can you reproduce the problem using the digiKam Linux AppImage bundle? The latest bundle is available at this URL:

https://drive.google.com/drive/folders/0BzeiVr-byqt5Y0tIRWVWelRJenM

Gilles Caulier

There is a patch that introduces garbage collection as a maintenance stage before thumbnail rebuild here: https://bugs.kde.org/show_bug.cgi?id=374591. This could solve your problem. But be advised: the patch is still in the testing phase. So back up your databases before you test.

The new 5.5.0 AppImage is built with the database garbage collector patches. Uploading to GDrive is in progress. It will be online in a few minutes at the usual place:

https://drive.google.com/drive/folders/0BzeiVr-byqt5Y0tIRWVWelRJenM

The new database Garbage Collector options are shown here:

https://www.flickr.com/photos/digikam/32549923912/in/dateposted-public/
https://www.flickr.com/photos/digikam/32549923632/in/dateposted-public/

Gilles Caulier

Git commit a1f67531b3269941df5ff531baa3e487cf58f1fc by Mario Frank.
Committed on 08/02/2017 at 14:09.
Pushed by mfrank into branch 'master'.

Merged the garbage collection into master.
The garbage collector is a maintenance stage that runs before the rebuild of thumbnails and must be triggered explicitly. It removes stale image entries in the core DB and, if enabled, also stale thumbnails and face identities from the thumbnails and recognition DBs. If so configured, the core DB part of the garbage collector removes stale image entries in the core DB during the start of digiKam.
Note that cleaning the databases does not necessarily make them smaller, as no auto-vacuum is done on the databases. The vacuuming process differs greatly between the three supported database variants (SQLite, internal MySQL and external MySQL). Thus, there is currently no automatic vacuuming.
Related: bug 374591
FIXED-IN: 5.5.0

M +3 -1 NEWS

https://commits.kde.org/digikam/a1f67531b3269941df5ff531baa3e487cf58f1fc
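
For readers who want to see the "go through thumbnail db -> check if it exists in main db" idea in concrete form, here is a minimal Python sketch using SQLite. It is not digiKam's implementation. The schema is a deliberately simplified assumption (an `Images` table in the core DB, and `Thumbnails` plus `UniqueHashes` tables in the thumbs DB, with made-up column layouts), and the function name `collect_stale_thumbnails` is hypothetical. A real thumbs DB also references thumbnails from other tables, which this sketch ignores.

```python
import sqlite3


def collect_stale_thumbnails(core, thumbs):
    """Remove thumbnails whose (uniqueHash, fileSize) is unknown to the core DB.

    Returns the number of rows deleted from Thumbnails.
    Table and column names are simplified assumptions, not digiKam's schema.
    """
    # Every hash/size pair the core DB still knows about.
    known = set(core.execute("SELECT uniqueHash, fileSize FROM Images"))

    # Hash entries in the thumbs DB with no counterpart in the core DB.
    rows = thumbs.execute(
        "SELECT uniqueHash, fileSize FROM UniqueHashes").fetchall()
    stale = [row for row in rows if row not in known]

    with thumbs:  # one transaction: commit on success, roll back on error
        thumbs.executemany(
            "DELETE FROM UniqueHashes WHERE uniqueHash = ? AND fileSize = ?",
            stale)
        # Drop thumbnails that no remaining hash entry points to.
        cur = thumbs.execute(
            "DELETE FROM Thumbnails "
            "WHERE id NOT IN (SELECT thumbId FROM UniqueHashes)")
    return cur.rowcount


# Tiny demonstration with in-memory databases (all data is invented).
core = sqlite3.connect(":memory:")
core.execute("CREATE TABLE Images (uniqueHash TEXT, fileSize INTEGER)")
core.execute("INSERT INTO Images VALUES ('aaa', 100)")

thumbs = sqlite3.connect(":memory:")
thumbs.executescript("""
    CREATE TABLE Thumbnails (id INTEGER PRIMARY KEY, data BLOB);
    CREATE TABLE UniqueHashes (
        uniqueHash TEXT, fileSize INTEGER, thumbId INTEGER NOT NULL);
    INSERT INTO Thumbnails VALUES (1, x'00'), (2, x'00');
    INSERT INTO UniqueHashes VALUES ('aaa', 100, 1), ('bbb', 200, 2);
""")

removed = collect_stale_thumbnails(core, thumbs)
remaining = [row[0] for row in thumbs.execute("SELECT id FROM Thumbnails")]
print(removed, remaining)
```

In the demonstration, the entry `('bbb', 200)` has no match in `Images`, so its hash row and thumbnail 2 are removed while thumbnail 1 is kept. As the commit message notes, deleting rows does not by itself shrink the database; with SQLite the file would only get smaller after a separate `VACUUM`.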