Bug 222774 - many (but not all) tags have been lost after multiple albums facility added
Status: RESOLVED WORKSFORME
Alias: None
Product: digikam
Classification: Applications
Component: Tags-Engine
Version: 1.1.0
Platform: Compiled Sources Linux
Importance: NOR normal
Target Milestone: ---
Assignee: Digikam Developers
Reported: 2010-01-14 23:03 UTC by Jonathan Marten
Modified: 2012-06-27 10:52 UTC
Version Fixed In: 1.2.0


Attachments
Screen shot showing missing images in tag view (255.73 KB, image/png)
2010-01-14 23:05 UTC, Jonathan Marten

Description Jonathan Marten 2010-01-14 23:03:57 UTC
Version:            (using Devel)
OS:                Linux
Installed from:    Compiled sources

I've been running digikam from KDE4 trunk for some time, and had about 60000 images tagged with about 250 tags in total.

Late last year, IIRC, the multiple albums feature was added.  When this first appeared I set the collection path appropriately, a single "Local Collection" with the same path as before.  After doing that, all of the image files seemed to be still present but many of them (but not all) had lost their tags and do not show up in a tag view or search.  The tag totals seem to be correct.

See the attached screenshot (some details redacted for privacy) for an illustration of this happening.  If the numbers in brackets are correct there should be 71 files with the tag "Mountain Passes", but only 7 are shown.

Browsing to an image, in the album or calendar view, that was previously tagged with this tag shows that it now has no tags at all.  This seems to be the case for all affected files - either all previous tags remain intact, or they all disappear.

Please don't tell me I have to go through and tag all of my images again... ;-(
Comment 1 Jonathan Marten 2010-01-14 23:05:16 UTC
Created attachment 39907
Screen shot showing missing images in tag view
Comment 2 caulier.gilles 2010-01-15 10:49:45 UTC
Which digiKam version do you use?

Gilles Caulier
Comment 3 Jonathan Marten 2010-01-15 11:15:59 UTC
Currently:

Version 1.1.0 (rev.: 1073781)
Using KDE Development Platform 4.4.59 (KDE 4.4.59 (KDE 4.5 >= 20100107))
Comment 4 Marcel Wiesweg 2010-01-15 18:22:22 UTC
Which version did you use to convert from your old database (digikam3.db) to the new format (digikam4.db)?
Comment 5 Jonathan Marten 2010-01-15 19:52:41 UTC
Not sure when the digikam3 -> digikam4 conversion happened.  Would it have been on the first run of digikam from KDE4 trunk, about May 2009 (when I ran into bug 193522)?

That was using "0.11.0-svn (rev.: 968562)".

Some tags would have been set before this point (in the digikam3 database), more have been added since.
Comment 6 Marcel Wiesweg 2010-01-21 18:32:10 UTC
No idea currently. You could provide your digikam4.db file for inspection.
Comment 8 Marcel Wiesweg 2010-01-23 16:44:29 UTC
You have two collections defined.
One is on the removable storage with partition UUID bfab1e03-ba88-49a6-8c20-6d70537bc07c and on that partition with the relative path /share/photos.
The other is manually added to the fixed mount path /share/photos on a hard-wired storage device.

You see only the images on the second collection, because the first one is probably not found currently (which is a normal situation for removable storage).

Now you should figure out what is the correct collection layout.
Comment 9 Jonathan Marten 2010-01-24 17:28:54 UTC
There were indeed two collections defined in digikam settings (but it wasn't obvious at first glance, because the second was simply labelled "Col0").  Not sure where this name came from.

Have now changed the settings for a single local collection rooted at /share/photos (I'm assuming that this is the correct one to use for an automounter path).  However, after a collection rescan the same tags are still missing.

I've done a bit of looking round in the database (with sqlitebrowser) for one of the images that has lost its tags.  These are what hopefully are the relevant parts of the tables:

Table "Images", rows with "name" = the problem image name:

row      id       album   status   uniqueHash
6        6                3        b7b8a975...
36168    36462            3        b7b8a975...
66449    66743            3        47473fec...
97020    97314            3        47473fec...
127756   128050           3        47473fec...
158502   158796           3        47473fec...
189242   189536   95      1        47473fec...

Table "ImageTags", rows with "imageid" = "id" as above:

row      imageid    tagid
2466     6          1
2467     6          3
2468     6          71
25033    36462      1
25034    36462      3
25035    36462      71

These match the tags as would have been originally set.  There are no entries in this table matching any of the other "id"s.
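
The same inspection can be scripted rather than done by hand in sqlitebrowser. This is only a minimal sketch: it assumes nothing beyond the table and column names quoted above ("Images" with id, album, name, status, uniqueHash; "ImageTags" with imageid, tagid), and the function name is made up for illustration.

```python
import sqlite3

def tag_rows_for_name(conn, image_name):
    """List every Images row with this name, plus how many tags each row has.

    Table and column names are the ones quoted in this comment; the function
    itself is hypothetical and not part of digikam.
    """
    query = """
        SELECT i.id, i.album, i.status, i.uniqueHash, COUNT(t.tagid)
        FROM Images AS i
        LEFT JOIN ImageTags AS t ON t.imageid = i.id
        WHERE i.name = ?
        GROUP BY i.id, i.album, i.status, i.uniqueHash
        ORDER BY i.id
    """
    return conn.execute(query, (image_name,)).fetchall()
```

Run against a copy of digikam4.db (for example with conn = sqlite3.connect("digikam4.db") and the problem image's file name), rows with a tag count of 0 are the ones that lost their tags.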

The current uniqueHash for the image in question (as calculated in  DImgLoader::uniqueHash) is indeed 47473fec...

Could the uniqueHash calculation for the image have changed between the time that it was tagged and now?  The file modtime on disc has not changed, and is the same as for unaffected images in the same directory.
Comment 10 Marcel Wiesweg 2010-01-24 20:05:47 UTC
Yes indeed, the hash is different for the obviously identical image in the two collections. I do not remember the hash calculation changing; maybe the file changed, for example after writing metadata. When rescanning, digikam can of course only copy tags from images which have the same hash.

Difficult situation now: you probably have tag changes in both versions of each file?
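
To illustrate why the rescan could not carry the tags over, here is a rough sketch of hash-based matching. It is not digikam's actual code: it only assumes the "Images" and "ImageTags" columns quoted in comment 9, and the function name is hypothetical.

```python
import sqlite3

def tag_donors_by_hash(conn, image_id):
    """Return ids of other Images rows that share this row's uniqueHash
    and have at least one entry in ImageTags.

    Illustration only; digikam's real rescan logic is not shown in this report.
    """
    query = """
        SELECT DISTINCT donor.id
        FROM Images AS target
        JOIN Images AS donor
          ON donor.uniqueHash = target.uniqueHash AND donor.id <> target.id
        JOIN ImageTags AS t ON t.imageid = donor.id
        WHERE target.id = ?
        ORDER BY donor.id
    """
    return [row[0] for row in conn.execute(query, (image_id,))]
```

With the rows listed in comment 9, the current row (id 189536, hash 47473fec...) has no donor, because the only tagged rows (6 and 36462) carry the hash b7b8a975...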
Comment 11 Jonathan Marten 2010-01-26 11:57:18 UTC
Have checked the problem image file against backups going back to 2008 (long before I started using the KDE4 digikam).  The size, modtime and MD5 are identical to the current version, so nothing has changed in the source file.

The uniqueHash is based on the complete image Exif data, plus the first 8 KB of the file, plus the file size.  Assuming that the second and third of these have not changed, is there any possibility that the format of the extracted Exif data could have changed, possibly only in some data-dependent way, which would explain why not all files are affected?
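
A small sketch of how that could happen. The three inputs are the ones described above; the choice of MD5, the way the size is mixed in, and the sample byte strings are assumptions made purely for illustration, not taken from DImgLoader::uniqueHash.

```python
import hashlib

def unique_hash_sketch(exif_bytes, file_head, file_size):
    """Digest over the extracted Exif data, the first 8 KB of the file, and the file size.

    MD5 and the encoding of the size are assumptions for this illustration.
    """
    digest = hashlib.md5()
    digest.update(exif_bytes)            # Exif data as serialised by the metadata library
    digest.update(file_head[:8192])      # first 8 KB of the file contents
    digest.update(str(file_size).encode("ascii"))
    return digest.hexdigest()

# Same file on disc (same leading bytes, same size), but the Exif data
# is handed over in a different serialised form (made-up placeholders).
head = b"\xff\xd8" + b"\x00" * 100
old = unique_hash_sketch(b"exif-as-formatted-by-older-library", head, len(head))
new = unique_hash_sketch(b"exif-as-formatted-by-newer-library", head, len(head))
print(old != new)
```

Under these assumptions the two digests differ even though the file itself is byte-for-byte unchanged, which would be consistent with identical size, modtime and MD5 on disc but a different uniqueHash in the database.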

As you say, a difficult situation.  Currently I'm resigned to tagging all of the affected images again, but I'd like to be sure that the work involved will not all be wiped out by some possible problem in future.  Will raise a separate bug for that.

Since nothing is reproducible, and current digikam creates and saves new tags with no problem, the best option would be to close this bug as WORKSFORME?
Comment 12 Marcel Wiesweg 2010-01-26 19:04:11 UTC
You are right, the Exif formatting could indeed have changed. It is done by libexiv2.

You needed that hash only because you had this problem with broken collections. Normally, if files just stay where they are, you don't need the feature to identify identical files by hash.

If it's all right for you we can close as WORKSFORME. Probably this is difficult to solve by SQL.