Summary: | digikam duplicates icons for TIFF files | ||
---|---|---|---|
Product: | [Applications] digikam | Reporter: | Paweł Rumian <gorkypl> |
Component: | Albums-IconView | Assignee: | Digikam Developers <digikam-bugs-null> |
Status: | RESOLVED FIXED | ||
Severity: | normal | CC: | axel.krebs, elle, gorkypl, languitar, marcel.wiesweg |
Priority: | NOR | ||
Version: | 1.0.0 | ||
Target Milestone: | --- | ||
Platform: | Gentoo Packages | ||
OS: | Linux | ||
Latest Commit: | Version Fixed In: | 1.7.0 | |
Sentry Crash Report: |
Description
Paweł Rumian
2009-10-12 19:18:12 UTC
Of course I meant thumbnails, not icons, sorry if that was misleading in any way (but I hope that the screenshot is self-explanatory)...

What happens if you change the modification date of one of the affected images: touch Negative....tif ?

OK, I've got some results, but no resolution yet. After running a simple
$ find . -name *.tif -exec touch {} \;
the thumbnails changed, but are still duplicated. Adding 'sleep 1' between touches hasn't changed much. I will try to examine it further...

It seems that digikam takes the most recent .tif thumbnail and assigns it to many other images, but not always. Sometimes a few thumbnails are unaffected, but only until the next .tif file has its ctime changed. Nevertheless, I can't yet see how the modification of ctime affects the generated (and duplicated) thumbnails.

Remote debugging is difficult. Can you send me sample pictures? I need at least two for this problem. More would be better, but I don't know how large the files are. If necessary you can send them by private mail.

Private mail has been sent - I hope you'll be able to reproduce the problem...

Were you able to reproduce the bug?

Yes, I can reproduce. The problem is that all images
- have exactly the same file size
- contain bit by bit the same metadata
- have the same creation date (none in metadata)
- have the same first 8k of data, bit by bit.
That is enough to make digikam believe it's all the same file... Not sure about a good solution. Whatever list of criteria we create, as above, someone will come along with files that slip through the loopholes.

Why are these criteria needed? Isn't a file uniquely identified by its path on the disk?

You can move, copy or rename files anytime, you can have backup collections and thus have the same picture multiple times in your collection. You can even completely screw up your collection settings, just add a new collection, and no tag is lost. It's pretty reassuring.
For normal photos taken with a digital camera, the criteria are always sufficient. Here, with identical metadata and identical file size (completely uncompressed?? not even lossless compression?), we are hitting a corner case. The additional problem is that the first 8 kB obviously do not contain pixel data.

It is not a problem with photos from an ordinary camera, indeed. But I have hit it several times when batch-scanning photos, and in these cases the severity of this bug is high - digikam becomes quite unusable, because one cannot see and identify photos before opening them... Maybe we should consider identifying the photos by some kind of content-dependent criterion? Like md5sum or something similar?

It's a content-based hash, but not over the whole file, only over parts, more precisely, the first 8 kB. It's assumed that the first 8 kB contain image data. That also fails for your pictures. So your peculiarities here include:
- apparently no compression; normally even lossless compression results in differing file sizes
- no metadata
- identical file content in the first 8 kB.
A possible solution is to extend the 8 kB, or take other small data parts from the middle and end of the file. I must think about the implications of changing the hash creation.

You are not forgotten. Exiv2 author Andreas Huggel has analyzed the files and indeed, the first 8 kB are identical: There is a list of image strip pointers (5600 bytes) and strip sizes (same count, always same value). This takes up the first 12 kB. So the suggestion is: increase the value in dimgloader.cpp from 8192 to 102400 (100 kB) as a workaround. The problem is that there are now a lot of databases around with hashes, so changing this algorithm cannot be done just anytime. If we do that, then it must be well prepared, or optional.

SVN commit 1205197 by mwiesweg:

Implement uniqueHash V2.
The hash now has a very simple specification: First 100 kB, last 100 kB.
All problematic cases known to me are solved.
1) Any new database created with 2.0 will use the new hash. That means you cannot use it with 1.x.
2) Any upgraded database from 1.x will keep the old hash. That means you can use it in parallel with 1.x.
3) There is a button to carry out an explicit update on the Database setup page, for those that want the new hash for an updated database. When upgrading, the thumbnail database will be updated in parallel, so nothing is lost.
4) The HistoryImageId in the history XML in metadata will only use the V2 hash, because it is effectively not possible to specify the generation of the old hash, while the V2 hash is easily specified. If you have an updated DB with V1 hash, the history image id may not always contain the hash (it is optional).

BUG: 210353

M +38 -1 digikam/scancontroller.cpp
M +8 -0 digikam/scancontroller.h
M +32 -1 libs/database/albumdb.cpp
M +10 -0 libs/database/albumdb.h
M +52 -14 libs/database/collectionscanner.cpp
M +2 -0 libs/database/collectionscanner.h
M +3 -0 libs/database/imageinfo.cpp
M +12 -1 libs/database/imagescanner.cpp
M +6 -0 libs/database/imagescanner.h
M +92 -23 libs/database/schemaupdater.cpp
M +8 -2 libs/database/schemaupdater.h
M +5 -0 libs/database/thumbnaildatabaseaccess.cpp
M +1 -0 libs/database/thumbnaildatabaseaccess.h
M +7 -0 libs/database/thumbnaildb.cpp
M +2 -0 libs/database/thumbnaildb.h
M +28 -0 libs/dimg/dimg.cpp
M +15 -0 libs/dimg/dimg.h
M +46 -2 libs/dimg/loaders/dimgloader.cpp
M +1 -0 libs/dimg/loaders/dimgloader.h
M +1 -1 libs/widgets/common/databasewidget.h
M +75 -5 utilities/setup/setupdatabase.cpp
M +5 -0 utilities/setup/setupdatabase.h

WebSVN link: http://websvn.kde.org/?view=rev&revision=1205197

*** Bug 259880 has been marked as a duplicate of this bug. ***

*** Bug 262452 has been marked as a duplicate of this bug. ***
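
For readers who want to see why a prefix-only hash collides on such scans, here is a minimal Python sketch of the idea discussed in this report. It is not digiKam's code: the function names (`partial_hash`, `make_fake_scan`, `demo`), the use of MD5, and the handling of files shorter than the chunk size are all assumptions made for illustration. Only the sizes come from the thread (an 8 kB prefix for the old scheme, the first and last 100 kB for the V2 scheme, and about 12 kB of identical leading data in the scanned TIFFs).

```python
import hashlib
import os
import tempfile

HEAD_V1 = 8 * 1024      # 8 kB prefix, as discussed in the thread
CHUNK_V2 = 100 * 1024   # 100 kB, per the commit message


def partial_hash(path, head, tail=0):
    """Hash the first `head` bytes and, if tail > 0, the last `tail` bytes.

    MD5 is an assumption for this sketch; the report does not name the digest.
    """
    size = os.path.getsize(path)
    digest = hashlib.md5()
    with open(path, "rb") as f:
        digest.update(f.read(head))
        if tail > 0:
            # For files shorter than `tail`, start from the beginning.
            f.seek(max(size - tail, 0))
            digest.update(f.read(tail))
    return digest.hexdigest()


def make_fake_scan(path, fill):
    """Write a stand-in for an uncompressed scan (hypothetical layout):
    12 kB of identical leading data, then 300 kB of 'pixel' bytes."""
    header = b"\x00" * (12 * 1024)
    pixels = bytes([fill]) * (300 * 1024)
    with open(path, "wb") as f:
        f.write(header + pixels)


def demo():
    """Return (prefix_only_collides, head_and_tail_collides)."""
    with tempfile.TemporaryDirectory() as d:
        a = os.path.join(d, "a.tif")
        b = os.path.join(d, "b.tif")
        make_fake_scan(a, 1)
        make_fake_scan(b, 2)  # same size, different pixel data
        v1 = partial_hash(a, HEAD_V1) == partial_hash(b, HEAD_V1)
        v2 = (partial_hash(a, CHUNK_V2, CHUNK_V2)
              == partial_hash(b, CHUNK_V2, CHUNK_V2))
        return v1, v2


if __name__ == "__main__":
    print(demo())
```

In this sketch the two fake files have the same size and the same leading 12 kB, so the 8 kB prefix hash treats them as one file, while hashing the first and last 100 kB reaches the differing pixel data and tells them apart.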