Bug 183171 - Decode and encode metadata once only
Summary: Decode and encode metadata once only
Status: RESOLVED FIXED
Alias: None
Product: digikam
Classification: Applications
Component: Metadata-Engine (show other bugs)
Version: 0.10.0
Platform: Compiled Sources Unspecified
: NOR wishlist
Target Milestone: ---
Assignee: Digikam Developers
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2009-02-04 13:52 UTC by Andreas Huggel
Modified: 2017-08-13 15:40 UTC (History)
2 users (show)

See Also:
Latest Commit:
Version Fixed In: 1.3.0


Attachments

Note You need to log in before you can comment on or make changes to this bug.
Description Andreas Huggel 2009-02-04 13:52:29 UTC
Version:            (using Devel)
Installed from:    Compiled sources

In the current design, metadata is decoded from the binary form in the image to an Exiv2 metadata container when an image is loaded. Immediately after this, the metadata is encoded again (for JPEG) and is internally stored in a binary container.

Since decoding and encoding of metadata is an expensive operation and is potentially lossy, the proposed design is that image metadata is decoded only once - when the image is loaded - and encoded only when the image is written. Internally digiKam should keep the decoded metadata in a container (which it doesn't copy).
Comment 1 Andreas Huggel 2009-02-04 14:10:34 UTC
Comments from Marcel (from bug# 182738, comment 17):

> This means that our current getExif implementation is not
> lossless and not reversible.

Yes, that's correct. The first problem, which was always there,
is that unknown TIFF structures potentially get corrupted on
writing. The ability to write to TIFF images exposed another
problem: In a TIFF image, the metadata is the image and
vice-versa. There is no isolated block of data somewhere to store
the Exif data in. This means that the metadata must potentially
be changed to ensure that the image remains intact.

Therefore, "Encoding ExifData" should now always be thought of
as "encoding ExifData for a given image format" and is
potentially lossy. How lossy depends on the input data, i.e., the
source image format. (This problem doesn't apply to IptcData and
XmpData, these are reversible).

> - what you suggest is storing ExifData, IptcData, XmpData
> directly? This is your intended workflow?

Yes, to keep it in the Exiv2::*Data containers and only encode
it again, using the appropriate Parser class for the target image
format.

> - these classes use std::vector internally, which means copying
> is always a deep copy? (We would have to write a wrapper for
> Qt-style shallow copies)

Currently yes. What's the use-case that requires such a shallow copy?
I agree that copying the containers is also not desired, the simplest
way to avoid this is using references.

> - should ExifParser::encode be used for JPEGs only when you say
> it removes certain tags?

Yes, ExifParser::encode creates a binary block for a JPEG APP segment.

> - there are two ExifParser::encode methods. One has docs and says
> it does not do unintrusive writing, suggesting the other
> method. The other has no docs and requires a byte* buffer. Should
> this method be of any interest to us?

The second method should be used if original binary metadata is
still available. This is the case, eg, if the metadata is read
from a JPEG, modified and written back. (in this case
Exiv2::Image does this internally, you don't need to call the
Parser directly.) I'll add to the documentation.

-ahu.
Comment 2 Marcel Wiesweg 2009-02-04 17:18:10 UTC
> > - these classes use std::vector internally, which means copying
> > is always a deep copy? (We would have to write a wrapper for
> > Qt-style shallow copies)
> 
> Currently yes. What's the use-case that requires such a shallow copy?
> I agree that copying the containers is also not desired, the simplest
> way to avoid this is using references.

If you program Qt you get used to pass around objects (string, byte array, image) by value and internally this is implemented very cheap and clever so that copies are only done when data is changed. Of course references is still cheaper, but using them as return value is very limited. So nothing strictly requires this. And it's fine for me of course that libexiv2 uses std::vector, I just want to know about this.
---

I would suggest to solve this for 0.11 because it is a code change that, although not too difficult, I would like to have beta-tested.
Comment 3 caulier.gilles 2009-06-12 16:01:39 UTC
Marcel,

What's news here ?

Gilles Caulier
Comment 4 Marcel Wiesweg 2009-06-12 19:24:53 UTC
Yes this is somewhere on my list and I would like to solve that
Comment 5 Andreas Huggel 2009-06-14 16:46:58 UTC
Maybe a candidate for the coding-sprint in autumn. Is it on?

Andreas
Comment 6 caulier.gilles 2009-06-14 17:48:48 UTC
Agree, if you come to the event at Genoa (:=)))

Gilles
Comment 7 Marcel Wiesweg 2009-11-14 20:05:26 UTC
SVN commit 1049205 by mwiesweg:

Use a shared private data class to host the Exiv2::*Data objects.
Provide convenient access methods in KExiv2Priv.

CCBUG: 183171


 M  +1 -1      kexiv2_p.cpp  
 M  +28 -9     kexiv2_p.h  


WebSVN link: http://websvn.kde.org/?view=rev&revision=1049205
Comment 8 Marcel Wiesweg 2009-11-14 20:08:29 UTC
SVN commit 1049206 by mwiesweg:

Provide a public class KExiv2Data which can carry the Exiv2::*Data objects 
without exposing them. 

CCBUG: 183171


 M  +50 -30    kexiv2.cpp  
 M  +27 -2     kexiv2.h  


WebSVN link: http://websvn.kde.org/?view=rev&revision=1049206
Comment 9 Marcel Wiesweg 2009-11-15 11:57:15 UTC
SVN commit 1049493 by mwiesweg:

KDE4.4 libkexiv2 will provide KExiv2Data container.
Add compatibility code for older kexiv2 versions to offer the same
API (passing around KExiv2Data objects, but then still containing the encoded
Exif data instead of the Exiv2::*Data objects)

CCBUG: 183171

 M  +38 -0     dmetadata.cpp  
 M  +12 -0     dmetadata.h  
 A             dmetadatadata.h   [License: GPL (v2+)]


WebSVN link: http://websvn.kde.org/?view=rev&revision=1049493
Comment 10 Marcel Wiesweg 2009-11-15 11:57:23 UTC
SVN commit 1049494 by mwiesweg:

Adapt DImg API to store a KExiv2Data object instead of metadata byte arrays.

CCBUG: 183171

 M  +8 -66     dimg.cpp  
 M  +4 -20     dimg.h  
 M  +2 -1      dimg_p.h  


WebSVN link: http://websvn.kde.org/?view=rev&revision=1049494
Comment 11 Marcel Wiesweg 2010-04-11 17:04:09 UTC
SVN commit 1113705 by mwiesweg:

These patches were lost in between all the branching. Recommitting.
The bug is fixed when using a recent (KDE 4.4) libkexiv2.

BUG: 183171

 M  +2 -1      NEWS  
 M  +2 -27     digikam/metadatahub.cpp  
 M  +1 -4      imageplugins/enhance/lensautofixtool.cpp  
 M  +1 -10     libs/database/imagescanner.cpp  
 M  +8 -70     libs/dimg/dimg.cpp  
 M  +4 -20     libs/dimg/dimg.h  
 M  +2 -1      libs/dimg/dimg_p.h  
 M  +12 -33    libs/dimg/loaders/dimgloader.cpp  
 M  +1 -1      libs/dimg/loaders/dimgloader.h  
 M  +4 -2      libs/dimg/loaders/pngloader.cpp  
 M  +1 -4      libs/dimg/loaders/tiffloader.cpp  
 M  +40 -2     libs/dmetadata/dmetadata.cpp  
 M  +14 -2     libs/dmetadata/dmetadata.h  
 A             libs/dmetadata/dmetadatadata.h   [License: GPL (v2+)]
 M  +2 -10     libs/imageproperties/imagepropertiessidebardb.cpp  
 M  +2 -11     libs/jpegutils/jpegutils.cpp  
 M  +3 -30     utilities/imageeditor/canvas/dimginterface.cpp  
 M  +1 -3      utilities/imageeditor/canvas/dimginterface.h  
 M  +3 -16     utilities/imageeditor/editor/imageiface.cpp  
 M  +1 -3      utilities/imageeditor/editor/imageiface.h  
 M  +2 -11     utilities/queuemanager/basetools/metadata/assigntemplate.cpp  


WebSVN link: http://websvn.kde.org/?view=rev&revision=1113705