Bug 151719

Summary: patch: non latin1 in iptc keywords
Product: [Applications] digikam
Reporter: Piotr Tarnowski <piotr_tarnowski>
Component: Metadata-Iptc
Assignee: Digikam Developers <digikam-bugs-null>
Status: RESOLVED FIXED    
Severity: normal    
Priority: NOR    
Version: unspecified   
Target Milestone: ---   
Platform: Compiled Sources   
OS: Linux   
Latest Commit:
Version Fixed In: 0.10.0
Attachments: enable non-latin1 in iptc keyword names

Description Piotr Tarnowski 2007-11-01 22:01:36 UTC
Version:           0.9.2-final (using KDE 3.5.5)
Installed from:    Compiled From Sources
OS:                Linux

I know that in digiKam 0.10.0 XMP will be used for keywords, but for the impatient (like me) I provide a workaround for entering non-latin1 keyword names. Everything is implemented in metadatahub.cpp:

1) if you want to store keywords in another (8-bit) encoding, add a tag whose path ends with 'tags.encoding/<encoding>', for example "extra/my.tags.encoding/ISO-8859-2"

2) the load method checks for such a tag and, if it finds one, corrects the encoding of the other tags

3) the write method does the same

I know this is quick and dirty, but it works and, importantly, it stores the encoding information within the image itself (so you can have images with different tag encodings without having to remember which image uses which encoding).
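
The marker-tag lookup from step 1 can be sketched in plain C++ as follows. This is only an illustration of the idea, not the code from the attached patch: the names findKeywordEncoding and endsWith are hypothetical, and the real implementation in metadatahub.cpp works on digiKam's own tag and string types rather than std::string.

```cpp
#include <optional>
#include <string>
#include <vector>

// Hypothetical helper (not from the attached patch): true when `text`
// ends with `suffix`.
static bool endsWith(const std::string& text, const std::string& suffix)
{
    return text.size() >= suffix.size() &&
           text.compare(text.size() - suffix.size(), suffix.size(), suffix) == 0;
}

// Scans tag paths for a marker whose path ends with
// "tags.encoding/<encoding>" and returns the <encoding> part of the
// first match, or std::nullopt when no marker tag is present.
std::optional<std::string> findKeywordEncoding(const std::vector<std::string>& tagPaths)
{
    static const std::string marker = "tags.encoding";
    for (const std::string& path : tagPaths) {
        const std::string::size_type slash = path.rfind('/');
        if (slash == std::string::npos || slash + 1 == path.size())
            continue; // no parent component, or empty encoding name
        if (endsWith(path.substr(0, slash), marker))
            return path.substr(slash + 1);
    }
    return std::nullopt;
}
```

With the example tag from step 1, the lookup yields "ISO-8859-2". The load and write methods would then convert the remaining keyword names from or to that encoding, which is left out of this sketch.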
Comment 1 Piotr Tarnowski 2007-11-01 22:02:51 UTC
Created attachment 21976 [details]
enable non-latin1 in iptc keyword names
Comment 2 caulier.gilles 2007-11-01 22:24:35 UTC
Piotr,

This is the way. You need to set Iptc.Envelope.CharacterSet properly to describe which encoding is used in _all_ IPTC text tags.

A method needs to check whether this tag is already set and, if so, use it accordingly.

If this tag is not set, interpret the IPTC text tag content using a default text codec (set in the metadata config page) and re-encode all text tags using the new codec, _in accordance_ with the new character encoding settings.

This is the only way to preserve interoperability with other photo tools!
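
The decision described above can be sketched in plain C++ as follows. This is a minimal illustration under stated assumptions, not libkexiv2 code: the function names are hypothetical, the default codec is hard-wired to ISO-8859-1 (in the real application it would come from the config page), and only the UTF-8 announcement (the escape sequence ESC % G in Iptc.Envelope.CharacterSet) is recognized.

```cpp
#include <string>

// ISO 2022 escape sequence ESC % G, used in Iptc.Envelope.CharacterSet
// to announce that the IPTC text tags are UTF-8.
static const std::string kUtf8Escape = "\x1b%G";

// Converts ISO-8859-1 bytes to UTF-8. Latin-1 maps 1:1 onto Unicode
// U+0000..U+00FF, so each byte becomes one or two UTF-8 bytes.
std::string latin1ToUtf8(const std::string& latin1)
{
    std::string utf8;
    utf8.reserve(latin1.size() * 2);
    for (unsigned char c : latin1) {
        if (c < 0x80) {
            utf8.push_back(static_cast<char>(c));
        } else {
            utf8.push_back(static_cast<char>(0xC0 | (c >> 6)));
            utf8.push_back(static_cast<char>(0x80 | (c & 0x3F)));
        }
    }
    return utf8;
}

// Returns an IPTC text value as UTF-8.
// `characterSet` is the raw CharacterSet value, empty when the tag is absent.
std::string iptcTextToUtf8(const std::string& characterSet, const std::string& rawValue)
{
    if (characterSet == kUtf8Escape)
        return rawValue; // tag is set and announces UTF-8: use the bytes as-is
    // Assumption of this sketch: anything else is decoded with the
    // default codec, hard-wired here to Latin-1.
    return latin1ToUtf8(rawValue);
}
```

After re-encoding the text tags this way, a writer would also store ESC % G in Iptc.Envelope.CharacterSet so that other tools can tell which encoding is in use.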

Note: all this work needs to be done in libkexiv2, not in digiKam core. Don't forget that kipi-plugins needs to do it too... There is also a report in B.K.O about this subject:

http://bugs.kde.org/show_bug.cgi?id=132244

The patch provided in that Bugzilla entry is a first approach, but not _all_ of it is right. In fact, Iptc.Envelope.CharacterSet is never respected at all... So be careful with this code... This is why I have not yet integrated the patch in svn.

Read the IPTC/IIM spec for details of the possible character encodings in IPTC:

http://www.iptc.org/std/IIM/4.1/specification/IIMV4.1.pdf

Also take a look at this interesting thread about how to use IPTC UTF-8 character encoding with ExifTool:

http://www.cpanforum.com/threads/2114#2115

Gilles Caulier
Comment 3 caulier.gilles 2007-11-01 22:28:02 UTC
Piotr,

"This is the way" => "This is not the right way" (:=)))...

Gilles
Comment 4 Piotr Tarnowski 2007-11-02 22:42:24 UTC
Gilles,

if most other applications do not care about Iptc.Envelope.CharacterSet and XMP in digiKam is on the way (which, as I understand, will replace IPTC), does it make sense to re-implement everything with all the complexity required by the standard?

Personally I will not be able to do that, and time is not the most important reason. I am a Java programmer now, and I wrote my last C++ application 10 years ago, so what I'm doing now is programming by finding similar constructions rather than fully understanding what is really happening (especially in the area of memory leaks and object destruction). What I did is useful for me and I thought it could also be useful (temporarily) for others.

/Piotr
Comment 5 caulier.gilles 2008-12-08 08:39:43 UTC
This entry is obsolete now. XMP is supported everywhere in digiKam and kipi-plugins for KDE4. 

I am closing this report now.

Gilles Caulier