Summary: | Special Chars in Keywords decode wrong in IPTC | ||
---|---|---|---|
Product: | [Applications] digikam | Reporter: | Johann-Nikolaus Andreae <johann-nikolaus> |
Component: | Metadata-Iptc | Assignee: | Digikam Developers <digikam-bugs-null> |
Status: | RESOLVED FIXED | ||
Severity: | normal | CC: | ahuggel, caulier.gilles, Johan.Eneland, kde-2011.08, lz |
Priority: | NOR | ||
Version: | unspecified | ||
Target Milestone: | --- | ||
Platform: | unspecified | ||
OS: | Linux | ||
Latest Commit: | https://invent.kde.org/graphics/digikam/commit/ad0ab9efeba6e2fe3bb86207a91499e4e8eb170f | Version Fixed In: | 7.1.0 |
Sentry Crash Report: | |||
Attachments: |
screenshot metadata sidebar
JPG image with Hebrew IPTC info. In all fields is the word שלום
JPG image with English IPTC data.
IPTC and UTF8 from BrillantPhoto displayed in digiKam and Photoshop
IPTC Encoding patch for libkexiv2
IPTC Encoding patch for digikam
IPTC tag in hebrew (iso-8859-8) from PhotoStation |
Description
Johann-Nikolaus Andreae
2006-08-11 09:00:57 UTC
This is not a problem with digiKam. In fact, IPTC metadata is limited to ASCII characters! This problem will be fixed when the Exiv2 library supports XMP metadata, which supports UTF8.

Gilles Caulier

SVN commit 592268 by cgilles:

digikam from trunk : strings from Exiv2 to render metadata content are ASCII, not local 8-bit formatted. If we use a Linux distribution using UTF8 encoding (like Suse 10.1, for example), some characters can be wrongly decoded.
CCBUGS: 132244

 M  +7 -7  exifwidget.cpp
 M  +7 -7  gpswidget.cpp
 M  +7 -7  iptcwidget.cpp
 M  +7 -7  makernotewidget.cpp

--- trunk/extragear/graphics/digikam/libs/widgets/metadata/exifwidget.cpp #592267:592268
@@ -149,7 +149,7 @@
 for (Exiv2::ExifData::iterator md = exifData.begin(); md != exifData.end(); ++md)
 {
- QString key = QString::fromLocal8Bit(md->key().c_str());
+ QString key = QString::fromAscii(md->key().c_str());
 // Decode the tag value with a user friendly output.
 QString tagValue;
@@ -161,7 +161,7 @@
 {
 std::ostringstream os;
 os << *md;
- tagValue = QString::fromLocal8Bit(os.str().c_str());
+ tagValue = QString::fromAscii(os.str().c_str());
 }
 tagValue.replace("\n", " ");
@@ -178,7 +178,7 @@
 catch (Exiv2::Error& e)
 {
 kdDebug() << "Cannot parse EXIF metadata using Exiv2 ("
- << QString::fromLocal8Bit(e.what().c_str())
+ << QString::fromAscii(e.what().c_str())
 << ")" << endl;
 return false;
 }
@@ -203,12 +203,12 @@
 {
 std::string exifkey(key.ascii());
 Exiv2::ExifKey ek(exifkey);
- return QString::fromLocal8Bit( Exiv2::ExifTags::tagTitle(ek.tag(), ek.ifdId()) );
+ return QString::fromAscii( Exiv2::ExifTags::tagTitle(ek.tag(), ek.ifdId()) );
 }
 catch (Exiv2::Error& e)
 {
 kdDebug() << "Cannot get metadata tag title using Exiv2 ("
- << QString::fromLocal8Bit(e.what().c_str())
+ << QString::fromAscii(e.what().c_str())
 << ")" << endl;
 return i18n("Unknow");
 }
@@ -220,12 +220,12 @@
 {
 std::string exifkey(key.ascii());
 Exiv2::ExifKey ek(exifkey);
- return QString::fromLocal8Bit( Exiv2::ExifTags::tagDesc(ek.tag(), ek.ifdId()) );
+ return QString::fromAscii( Exiv2::ExifTags::tagDesc(ek.tag(), ek.ifdId()) );
 }
 catch (Exiv2::Error& e)
 {
 kdDebug() << "Cannot get metadata tag description using Exiv2 ("
- << QString::fromLocal8Bit(e.what().c_str())
+ << QString::fromAscii(e.what().c_str())
 << ")" << endl;
 return i18n("No description available");
 }
--- trunk/extragear/graphics/digikam/libs/widgets/metadata/gpswidget.cpp #592267:592268
@@ -275,12 +275,12 @@
 for (Exiv2::ExifData::iterator md = exifData.begin(); md != exifData.end(); ++md)
 {
- QString key = QString::fromLocal8Bit(md->key().c_str());
+ QString key = QString::fromAscii(md->key().c_str());
 // Decode the tag value with a user friendly output.
 std::ostringstream os;
 os << *md;
- QString tagValue = QString::fromLocal8Bit(os.str().c_str());
+ QString tagValue = QString::fromAscii(os.str().c_str());
 // We apply a filter to get only standard Exif tags, not maker notes.
 if (d->keysFilter.contains(key.section(".", 1, 1)))
@@ -309,7 +309,7 @@
 d->detailsButton->setEnabled(false);
 d->detailsCombo->setEnabled(false);
 kdDebug() << "Cannot parse EXIF metadata using Exiv2 ("
- << QString::fromLocal8Bit(e.what().c_str())
+ << QString::fromAscii(e.what().c_str())
 << ")" << endl;
 return false;
 }
@@ -334,12 +334,12 @@
 {
 std::string exifkey(key.ascii());
 Exiv2::ExifKey ek(exifkey);
- return QString::fromLocal8Bit( Exiv2::ExifTags::tagTitle(ek.tag(), ek.ifdId()) );
+ return QString::fromAscii( Exiv2::ExifTags::tagTitle(ek.tag(), ek.ifdId()) );
 }
 catch (Exiv2::Error& e)
 {
 kdDebug() << "Cannot get metadata tag title using Exiv2 ("
- << QString::fromLocal8Bit(e.what().c_str())
+ << QString::fromAscii(e.what().c_str())
 << ")" << endl;
 return i18n("Unknow");
 }
@@ -351,12 +351,12 @@
 {
 std::string exifkey(key.ascii());
 Exiv2::ExifKey ek(exifkey);
- return QString::fromLocal8Bit( Exiv2::ExifTags::tagDesc(ek.tag(), ek.ifdId()) );
+ return QString::fromAscii( Exiv2::ExifTags::tagDesc(ek.tag(), ek.ifdId()) );
 }
 catch (Exiv2::Error& e)
 {
 kdDebug() << "Cannot get metadata tag description using Exiv2 ("
- << QString::fromLocal8Bit(e.what().c_str())
+ << QString::fromAscii(e.what().c_str())
 << ")" << endl;
 return i18n("No description available");
 }
--- trunk/extragear/graphics/digikam/libs/widgets/metadata/iptcwidget.cpp #592267:592268
@@ -126,12 +126,12 @@
 for (Exiv2::IptcData::iterator md = iptcData.begin(); md != iptcData.end(); ++md)
 {
- QString key = QString::fromLocal8Bit(md->key().c_str());
+ QString key = QString::fromAscii(md->key().c_str());
 // Decode the tag value with a user friendly output.
 std::ostringstream os;
 os << *md;
- QString value = QString::fromLocal8Bit(os.str().c_str());
+ QString value = QString::fromAscii(os.str().c_str());
 // To make a string just on one line.
 value.replace("\n", " ");
@@ -157,7 +157,7 @@
 catch (Exiv2::Error& e)
 {
 kdDebug() << "Cannot parse IPTC metadata using Exiv2 ("
- << QString::fromLocal8Bit(e.what().c_str())
+ << QString::fromAscii(e.what().c_str())
 << ")" << endl;
 return false;
 }
@@ -181,12 +181,12 @@
 {
 std::string iptckey(key.ascii());
 Exiv2::IptcKey ik(iptckey);
- return QString::fromLocal8Bit( Exiv2::IptcDataSets::dataSetTitle(ik.tag(), ik.record()) );
+ return QString::fromAscii( Exiv2::IptcDataSets::dataSetTitle(ik.tag(), ik.record()) );
 }
 catch (Exiv2::Error& e)
 {
 kdDebug() << "Cannot get metadata tag title using Exiv2 ("
- << QString::fromLocal8Bit(e.what().c_str())
+ << QString::fromAscii(e.what().c_str())
 << ")" << endl;
 return i18n("Unknow");
 }
@@ -198,12 +198,12 @@
 {
 std::string iptckey(key.ascii());
 Exiv2::IptcKey ik(iptckey);
- return QString::fromLocal8Bit( Exiv2::IptcDataSets::dataSetDesc(ik.tag(), ik.record()) );
+ return QString::fromAscii( Exiv2::IptcDataSets::dataSetDesc(ik.tag(), ik.record()) );
 }
 catch (Exiv2::Error& e)
 {
 kdDebug() << "Cannot get metadata tag description using Exiv2 ("
- << QString::fromLocal8Bit(e.what().c_str())
+ << QString::fromAscii(e.what().c_str())
 << ")" << endl;
 return i18n("No description available");
 }
--- trunk/extragear/graphics/digikam/libs/widgets/metadata/makernotewidget.cpp #592267:592268
@@ -175,12 +175,12 @@
 for (Exiv2::ExifData::iterator md = exifData.begin(); md != exifData.end(); ++md)
 {
- QString key = QString::fromLocal8Bit(md->key().c_str());
+ QString key = QString::fromAscii(md->key().c_str());
 // Decode the tag value with a user friendly output.
 std::ostringstream os;
 os << *md;
- QString value = QString::fromLocal8Bit(os.str().c_str());
+ QString value = QString::fromAscii(os.str().c_str());
 value.replace("\n", " ");
 // We apply a filter to get only standard Exif tags, not maker notes.
@@ -196,7 +196,7 @@
 catch (Exiv2::Error& e)
 {
 kdDebug() << "Cannot parse MAKERNOTE metadata using Exiv2 ("
- << QString::fromLocal8Bit(e.what().c_str())
+ << QString::fromAscii(e.what().c_str())
 << ")" << endl;
 return false;
 }
@@ -220,12 +220,12 @@
 {
 std::string exifkey(key.ascii());
 Exiv2::ExifKey ek(exifkey);
- return QString::fromLocal8Bit( Exiv2::ExifTags::tagTitle(ek.tag(), ek.ifdId()) );
+ return QString::fromAscii( Exiv2::ExifTags::tagTitle(ek.tag(), ek.ifdId()) );
 }
 catch (Exiv2::Error& e)
 {
 kdDebug() << "Cannot get metadata tag title using Exiv2 ("
- << QString::fromLocal8Bit(e.what().c_str())
+ << QString::fromAscii(e.what().c_str())
 << ")" << endl;
 return i18n("Unknow");
 }
@@ -237,12 +237,12 @@
 {
 std::string exifkey(key.ascii());
 Exiv2::ExifKey ek(exifkey);
- return QString::fromLocal8Bit( Exiv2::ExifTags::tagDesc(ek.tag(), ek.ifdId()) );
+ return QString::fromAscii( Exiv2::ExifTags::tagDesc(ek.tag(), ek.ifdId()) );
 }
 catch (Exiv2::Error& e)
 {
 kdDebug() << "Cannot get metadata tag description using Exiv2 ("
- << QString::fromLocal8Bit(e.what().c_str())
+ << QString::fromAscii(e.what().c_str())
 << ")" << endl;
 return i18n("No description available");
 }

Johann, please check out the current implementation from svn (not 0.9.0-beta2), and let me know whether this commit has solved your problem.

Note: my comment #1 is still right. UTF8 is not supported by IPTC. If an application tries to embed a UTF8 string in an IPTC tag, well, the IPTC specification is not respected. Look here: http://www.iptc.org/std/IIM/4.1/specification/IIMV4.1.pdf

The alternative is to use XMP metadata instead.

Gilles Caulier

2006/10/4, Gilles Caulier <caulier.gilles@free.fr>:
> Note: my comment #1 is still right. UTF8 is not supported by IPTC. If an application tries to embed a UTF8 string in an IPTC tag, well, the IPTC specification is not respected. Look here:
>
> http://www.iptc.org/std/IIM/4.1/specification/IIMV4.1.pdf
>

For me, it's not that clear in the specification. The character set can be defined in the envelope record (dataset 1:90), which is normally not used (as I understand the specs, the whole spec was made to encapsulate pictures in IIMV files, not to encapsulate IPTC info in picture files). Other sections of the specification make me think UTF8 is possible:

"Section 1.12 DataSet octet sizes do not imply character sizing. The number of characters will depend on the encoding method specified. The number of octets specified within a DataSet Data Field Octet Count will always be equal to or greater than the number of characters of data represented."

There is also the definition of UTF8 in Section 1.75. The more standard way should probably be to use a record 1 with a 1:90 dataset to define UTF8, but I think most programs just use UTF8 directly in the text fields.

After some googling, I found the following page (http://bugs.php.net/bug.php?id=27238) with links to files with Record 1 charset info, but unfortunately the links are broken. I also found some links to IPTC software advertising UTF8 support. I'll try to do some tests with different IPTC writing software.

Loic

Thanks for this report, Loic.

Andreas, you have more experience with IPTC than I do. Can you confirm that we can use UTF8 encoding in IPTC text tags using the Exiv2 library?

Thanks in advance

Gilles

Created attachment 18756 [details]
screenshot metadata sidebar
As I understand the above discussion, the screenshot I added is all about that
problem. I noticed this behaviour before, and decided to change the
copyright notice into (C)... But in the IPTC documents I read that in Europe it
is probably best, for legal reasons, to use the copyright sign. So I changed
it back for all my photographs using the exiv2 command-line tool.
Within digiKam this leads to the accompanying result. What I mean to say is
that apparently IPTC accepts (or needs to accept) this kind of character.
Caspar.
Gilles,

Regarding the patch above, digikam code needs to distinguish between metadata (data stored as tag values) and text that comes from exiv2 (tag titles, descriptions, error messages, etc). Metadata is encoded according to whatever the relevant (Exif or IPTC) standard defines, possibly different for different tags (Exif user comment has its own charset setting, for example). Text from exiv2 is currently in ASCII only, but when we support gettext, that will change. What character set do the translation files use?

-ahu.

To the question in comment #5: You can store any data in the tags, exiv2 usually doesn't care. But I don't know whether storing UTF8 encoded text in IPTC fields is ok and how it should be done to comply with the standard. Forwarded the question to the exiv2 list.

-ahu.

Andreas,

Since we have implemented NLS support in Exiv2, the code patched in #2 is obsolete. In the current implementation, I use QString::fromLocal8Bit() where it is required.

There is a digiKam screenshot with non-ASCII characters (French) at this URL: http://digikam3rdparty.free.fr/Screenshots/dgikam_metadata_tags_i18n.png

Gilles

Andreas,

Look at this page: http://peccatte.karefil.com/Software/Metadata.htm#IPTC

Sorry, it's in French... but it's very instructive. I have never seen an English page about extended characters with IPTC (UTF-8 like). In particular, there is a section which says (translated from the French):

"The IPTC-NAA model allows fields to be encoded in various extended character sets. Current software should therefore be able to handle accents, diacritical marks, etc. correctly. This is not the case: if extended characters are used when entering information in Photoshop, for example, that information is not displayed correctly on another platform. Adobe recommends using only 7-bit ASCII [which is unacceptable for many languages!] because the IPTC standard only allows that character set [which is false!]"

To sum up, IPTC can support extended character sets, but because Photoshop only supports 7-bit ASCII characters (with IPTC, not XMP), all other applications must support only this mode. If you look at page 20 of the IPTC spec, the tag Iptc.Envelope.CharacterSet is designed to customize the character encoding.

Gilles

On Thursday 28 December 2006 23:37, Marco Piovanelli wrote:
> yes, the IPTC standard does allow for non-ASCII
> character sets, although it's by no means obvious how these
> are specified. See for instance Stefano Bettelli's excellent
> description of JPEG metadata on CPAN for a brief discussion of
> this:
>
><http://www.annocpan.org/~BETTELLI/Image-MetaData-JPEG-0.15/lib/Image/MetaData/JPEG/TagLists.pod>
>
> In particular, you can safely assume IPTC strings are
> UTF-8-encoded if the "Iptc.Envelope.CharacterSet" dataset contains
> the three-byte escape sequence "\x1B%G".

Thanks for the URL, Andreas. I will try to use it and check the interoperability with Photoshop...

Gilles

Created attachment 19315 [details]
JPG image with Hebrew IPTC info. In all fields is the word שלום
This image was tagged with IPTC data on BrilliantPhoto on Windows. In the
following fields is the following info:
Caption:שלום
Keywords:מפתח, שלום
People:אתי, שלום
Event:שלום
Place:שלום
The data is in UTF-8. Note that multiple Keywords and People are separated by a
comma and a space.
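The check Marco describes above (treat IPTC strings as UTF-8 when Iptc.Envelope.CharacterSet holds the three-byte escape sequence "\x1B%G") can be sketched in a few lines of plain C++. This is only an illustration: `declaresUtf8` is a hypothetical helper name, not an Exiv2 or digiKam function, and how the raw dataset value is read from the image is deliberately left out.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper (not part of Exiv2 or digiKam): returns true when the
// raw value of IPTC dataset 1:90 (Iptc.Envelope.CharacterSet) is exactly the
// three-byte escape sequence ESC '%' 'G' that announces UTF-8.
bool declaresUtf8(const std::string& characterSet)
{
    // Build the reference value with an explicit length of 3 bytes.
    static const std::string utf8Escape("\x1B%G", 3);
    return characterSet == utf8Escape;
}
```

An image like the attachment above, where the envelope tag is not set at all, would hand an empty value to this helper and therefore not be recognised as UTF-8, even though its text fields contain UTF-8 bytes.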
Created attachment 19316 [details]
JPG image with English IPTC data.
This image was tagged with IPTC data on BrilliantPhoto on Windows. In the
following fields is the following info:
Caption:Caption
Keywords:Keyword1, Keyword2
People:Person1, Person2
Event:Event
Place:Place
The data is UTF-8. This and the previous attachment were added at the request
of Gilles on the Digikam mailing list.
Dotan,

Can BrilliantPhoto configure the character-set encoding used with IPTC? Do you have a screenshot of the setup?

Gilles

BrilliantPhoto has absolutely no setup screen. There are no configurable options, and therefore no Preferences nor Options dialogs. That's actually one of the things that I _don't_ like about it, but otherwise it was a great program.

According to the BrilliantPhoto forums, which have since been taken down, the IPTC spec specifically requires the use of UTF-8 for the data. No other charset is acceptable. I read the spec a long time ago and in my opinion that 'fact' is debatable. However, the BrilliantPhoto author was very certain that only UTF-8 is allowed.

If you have a Windows virtual machine, I'd very much recommend downloading and trying BrilliantPhoto: http://www.download.com/BrilliantPhoto/3000-2204_4-10326351.html

Digikam could learn quite a few things from BP, such as the wonderful "fill flash" feature, which brightens underexposed photos better than any other program I've yet seen. The red-eye reduction selector is ROUND, like EYES, so they affect less skin. Why does no other program do that? Should I continue to list BP's other great features?

>Digikam could learn quite a few things from BP, such as the wonderful "fill
>flash" feature, which brightens underexposed photos better than any other
>program I've yet seen.

Already implemented in the current implementation:

http://digikam3rdparty.free.fr/Screenshots/exposureindicatorsfromimageplugins.png
http://digikam3rdparty.free.fr/Screenshots/underexposureindicator.png
http://digikam3rdparty.free.fr/Screenshots/overerexposureindicator.png
http://digikam3rdparty.free.fr/Screenshots/exposureindicatorSetup.png

>The red-eye reduction selector is ROUND, like EYES, so they affect less skin.

The red-eye corrector needs to be improved in digiKam ==> in my TODO list.

>Why does no other program do that? Should I continue to list BP's other great
>features?

Yes, on the devel ML, not in this room.

Gilles

I don't have those options in my 0.9.0 built from the tarball. I'll build from SVN and try it out. As for the BP features, I'll subscribe to the Digikam DevML. Thanks.

Dotan,

With #14, are you sure that your attached picture is in UTF8? If digiKam failed to show UTF8 characters from IPTC, why can I show it without problem in digiKam...

Also, the "envelope" IPTC tag is not set in this picture to tell applications about the character encoding...

Gilles

Dotan,

With the image from #13 all characters are broken. Sure, this one is certainly encoded using UTF-8... but the "envelope" IPTC tag is not set. There is no way to find out which encoding is used in IPTC to decode text from this picture.

I have tried to show all IPTC information from this picture using Photoshop 7.x, and all text strings are broken, as in digiKam!

If you read the IPTC spec, this "Envelope" IPTC tag must be set properly, otherwise all text tags are unusable. I suspect a bug in BrillantPhoto.

Gilles

Created attachment 19374 [details]
IPTC and UTF8 from BrillantPhoto displayed in digiKam and Photoshop
It would not surprise me to learn of a bug of the sort in BrilliantPhoto. In any case, the program is abandonware (not being developed by those who purchased the rights to it), so the issue is moot. I'm certain that a simple shell script could add the appropriate fields, should anybody need it. I'm not the guy to write it, though. (Trivia: What movie was this from? "If the milk's sour, I ain't the kind of pussy to drink it.")

Hi all,

Per previous discussion with Gilles on the digikam-devel mailing list, I am posting patches that allow the user to specify which encoding to use for IPTC comments. There are two patches, for libkexiv2 and digikam itself. These are made against the current SVN. The part of the digikam patch that modifies iptcwidget.cpp is to be considered temporary and will not be needed once that widget is converted to use libkexiv2.

Gilles, please take a look. Thanks!

Created attachment 19710 [details]
IPTC Encoding patch for libkexiv2
Created attachment 19711 [details]
IPTC Encoding patch for digikam
OK, I will take a look at your patch Monday morning.

Gilles

lz,

I have taken a look at your patch. I have a question: why do you store the "IPTC Encoding" setting in KDE global? This value must be stored in the application settings and passed to a virtual KExiv2 method by the derived class DMetadata.

This way libkexiv2 does not depend on KDE core anymore (in the future, I will certainly remove the KDElib dependency and keep only the Qt dependency, to have a pure Qt interface).

Gilles

Hi Gilles,

I chose to store the settings in kdeglobal because it is needed not only by Digikam itself, but also by Digikam kioslaves and, in theory, any application that uses libkexiv2. To be more precise, I've found that DMetadata is used in the kioslaves, but class AlbumSettings is not, and thus I could not read this setting from there. Do Digikam kioslaves have access to the Digikam app config?

Hi Gilles,

Just wanted to ask if you had any time to look at my patch further.

Thanks,
Leonid

Any progress on this?

Johannes

I have a patch on my computer to support UTF-8 with IPTC, but I'm not yet fully satisfied with it. I will work on it between the digiKam 0.9.2-beta1 and beta2 releases...

Gilles

I tried those patches sent here earlier but got errors when patching. Which version should they be applied to?

Johannes

The patch is on my computer, not in B.K.O.

Gilles

Just for completeness, a brief quote from Gilles on the IRC wrt this bug:

The patch is not fine as it does not respect the IPTC norm. IPTC provides a tag to specify the char encoding. The patch tries to detect the char encoding to parse the string, especially the first char, which can include a specific substring to declare the encoding. This is not how IPTC works. The patch must be rewritten.

In connection to the previous comment, just want to mention that not every program that writes IPTC caption tags sets the IPTC encoding tag. For example, Picasa and IrfanView do not. Therefore the world must be full of IPTC-tagged images with the encoding tag unset.

Leonid,

Well, I think that if the IPTC char encoding tag is not set, content must be interpreted as ASCII... But it's just my first impression.

Otherwise, the problem is that digiKam must respect the IPTC norm, especially when we want to update or add tags:

1/ Detect the original encoding of tags.
2/ Set the encoding tag for all other tags (UTF8 is the universal encoding. ExifTool uses only this one to add/update IPTC)
3/ Convert and update all existing IPTC tags to UTF8 if the original encoding is different.

Gilles

> Well, I think that if the IPTC char encoding tag is not set, content must be
> interpreted as ASCII... But it's just my first impression.
Please consider this: first try utf-8, then try locale of
system, ASCII should be last resort.
m.
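The "try UTF-8 first" idea above can be illustrated with a structural validity check on the raw bytes. The following is a minimal sketch in plain C++, not code from digiKam or libkexiv2: `looksLikeUtf8`, `guessEncoding` and `TextGuess` are hypothetical names, the check is simplified (it does not reject overlong forms or surrogates), and the actual conversion from a system locale or a user-selected legacy encoding is not shown.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical helper: true if every byte sequence in `s` has the shape of a
// UTF-8 encoded character (valid lead byte followed by the right number of
// 10xxxxxx continuation bytes). Simplified: overlong forms are not rejected.
bool looksLikeUtf8(const std::string& s)
{
    std::size_t i = 0;
    while (i < s.size()) {
        const unsigned char lead = static_cast<unsigned char>(s[i]);
        std::size_t extra = 0;
        if (lead < 0x80)                        extra = 0; // plain ASCII
        else if (lead >= 0xC2 && lead <= 0xDF)  extra = 1; // 2-byte sequence
        else if (lead >= 0xE0 && lead <= 0xEF)  extra = 2; // 3-byte sequence
        else if (lead >= 0xF0 && lead <= 0xF4)  extra = 3; // 4-byte sequence
        else return false;                                 // invalid lead byte

        if (s.size() - i <= extra) return false;           // truncated sequence
        for (std::size_t k = 1; k <= extra; ++k) {
            const unsigned char c = static_cast<unsigned char>(s[i + k]);
            if ((c & 0xC0) != 0x80) return false;          // not a continuation
        }
        i += extra + 1;
    }
    return true;
}

enum class TextGuess { Ascii, Utf8, LegacyEightBit };

// Hypothetical classifier for a text field whose encoding tag is unset:
// pure ASCII needs no decision, well-formed multi-byte text is taken as
// UTF-8, anything else is left to a locale or user-configured 8-bit charset.
TextGuess guessEncoding(const std::string& s)
{
    bool pureAscii = true;
    for (unsigned char c : s) {
        if (c >= 0x80) { pureAscii = false; break; }
    }
    if (pureAscii) return TextGuess::Ascii;
    return looksLikeUtf8(s) ? TextGuess::Utf8 : TextGuess::LegacyEightBit;
}
```

A heuristic like this cannot tell one 8-bit charset from another, so bytes that fail the UTF-8 test would still need a locale or a user setting to be decoded.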
Hi Gilles,

I completely agree that DigiKam should be strict in setting the tag correctly. However, I think it also should be liberal in reading IPTC comments with the encoding tag unset. I don't think one can assume that the comments are in ASCII; for non-English comments they won't be. Even trying UTF-8 and the system locale, as Mikolay suggests above, will not achieve full interoperability with Windows applications (something I am very keen on). Indeed, for a Russian-speaking user, any Windows application will write the comments in Windows CP1251 encoding; at the same time, under a Unix system, the locale is likely to be either Russian KOI8-R or UTF-8. This is why I advocated adding a configuration option for the encoding to use in IPTC comments (which could default to UTF-8, of course).

I also wouldn't want digiKam to convert and update all existing IPTC tags to UTF8 if the original encoding is different. The issue is, as I mentioned before, that many other applications will not recognize UTF-8 in IPTC comments and will display raw Unicode characters, i.e. garbage.

Is it possible to at least put a warning on the identity panel in order to tell people that their info won't be taken in, until this is fully fixed? When I type my name "Ménard", it simply rejects the é altogether and ends up with Mnard. The first time I did this I thought I had mistyped, and then realized that the field just ate my accented character.

Sorry for the trouble. The warning is already there. At the bottom.

*** This bug has been confirmed by popular vote. ***

i use digikam 0.9.2-final (deb unstable) with kexiv2 0.1.5 + exiv2 0.14.0

when i get pictures from PhotoStation (Windows application for Image management) with Hebrew IPTC tags i can not see them, i get...(���)

when i opened Dotan's file from Comment #14 i got to see the correct hebrew word in the IPTC tag, but i'm unable to write UTF8 inside the tags (as you probably know). and the EXIF comment is unreadable.

it would be nice if i could have a combo box to choose the encoding for the tags i see in the editor and to be able to enter new text according to what i choose. it's a workaround and i know it's not the standard way to do this but it seems that different applications implement the IPTC standard in different ways. or maybe, an option to encode the IPTC tags before i send them away to some other people who use a different application.

Created attachment 21322 [details]
IPTC tag in hebrew (iso-8859-8) from PhotoStation
here is an image i got from PhotoStation (a Windows application) with hebrew
IPTC tags (i think it's ISO-8859-8 or windows-1255) that is displaying the tags
in unreadable squares.
Nadav,

Just want to mention that I have an unofficial patch that does approximately what you wanted. It adds a combo box with IPTC encoding selection to Digikam's configuration dialog (Metadata page). Once the encoding is set, you can read existing IPTC comments in this encoding and write new ones, and they are saved in this encoding. The patch for Digikam 0.9.2 is here: http://www.csltd.com.ua/~lz/digikam/digikam-0.9.2-iptc-encoding-lz.patch, but you also need to patch libkexiv2, the patch is here: http://www.csltd.com.ua/~lz/digikam/libkexiv2-0.1.5-iptc-encoding-lz.patch. If you try the patches, please let me know (lz@europe.com).

wow :-) that's exactly what i meant! will this be added to the main Digikam tree ? (i'll probably give it a try in a few days :-) Question to Gilles :-)

Please give it a try and let me know.

i've tested Leonid Zeitlin's patch and it works fine. when i set the right encoding the IPTC fields are readable :-) though, i can't write anything which is not pure ASCII :-(

i think the language encoding combo should be in the IPTC editor and not in the main Setting, because i get different pictures from around the world with different encodings and it is much easier to change for each picture view. or maybe there should be a default fallback encoding inside the Setting dialog and the IPTC editor should get the right encoding from the beginning of the text string ? (if it's in the specs at all ?)

Thank you Leonid :-) for this beautiful patch !
Nadav :-)

Hello

Is there any progress on that bug? It would be great to be able to use UTF-8 for IPTC like libiptcdata!

*** Bug 155733 has been marked as a duplicate of this bug. ***

Same here. As digiKam and kipi-plugins for KDE4 support XMP everywhere and use it by default instead of IPTC, the problem has become obsolete.

I close this file now.

Gilles Caulier

> Same here. As digiKam and kipi-plugins for KDE4 support
> XMP everywhere and use it by default instead of IPTC, the
> problem has become obsolete.
Gilles, I am not certain that the problem is obsolete just because of proper XMP support. The world is full of photos tagged with Irfanview, Photoshop < CS1, and other applications that have written and still do write IPTC. These photos follow a very popular specification, and Digikam does not yet follow that specification. This would be like abandoning HTML 3 support in a web browser just because the browser now supports HTML 5. The old documents are still in use.

Dotan,

It's obsolete because digiKam and kipi-plugins now use XMP instead of IPTC to manage keywords and other UTF8 strings. The IPTC => XMP transition has been under way for a while now in the photography world.

For me it's a waste of time trying to fix this problem with IPTC, which has _never_ supported UTF-8 as a standard. XMP is really, really better than IPTC.

Gilles Caulier

Dotan,

To be more clear, I close this file as WONTFIX...

Gilles Caulier

It's your call, Gilles! I trust you to do what you feel is best for Digikam.

Git commit ad0ab9efeba6e2fe3bb86207a91499e4e8eb170f by Gilles Caulier.
Committed on 28/08/2020 at 05:19.
Pushed by cgilles into branch 'master'.

IPTC and Utf8 support: If a tag is a string, check whether the global IPTC characterset is null to convert as latin1, else we expect to interpret the string as utf8. We use the std::string accessor from Exiv2 to get a Utf8 conversion of the string. If it does not work, well, this problem needs to be reported UPSTREAM to Exiv2, as pre-conversion of the string is not done in the background by the library. This patch prevents displaying a latin1 string with a wrong Utf8 conversion, which can break some characters.
BUGS: 379581
BUGS: 379050
FIXED-IN: 7.1.0

M  +27   -3    core/libs/metadataengine/engine/metaengine_iptc.cpp

https://invent.kde.org/graphics/digikam/commit/ad0ab9efeba6e2fe3bb86207a91499e4e8eb170f |
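The rule stated in the commit message (no global character set means Latin-1, otherwise interpret the bytes as UTF-8) can be sketched in plain C++ as follows. This is an illustration under stated assumptions, not the code from metaengine_iptc.cpp: `latin1ToUtf8` and `iptcTextToUtf8` are hypothetical names, and the character-set value is assumed to arrive as a raw string that is empty when the tag is unset.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper: re-encode ISO-8859-1 (Latin-1) bytes as UTF-8.
// Every Latin-1 byte maps to the Unicode code point with the same value,
// so bytes >= 0x80 become a two-byte UTF-8 sequence.
std::string latin1ToUtf8(const std::string& in)
{
    std::string out;
    out.reserve(in.size() * 2);
    for (unsigned char c : in) {
        if (c < 0x80) {
            out.push_back(static_cast<char>(c));
        } else {
            out.push_back(static_cast<char>(0xC0 | (c >> 6)));
            out.push_back(static_cast<char>(0x80 | (c & 0x3F)));
        }
    }
    return out;
}

// Hypothetical decision function following the commit message: an unset
// (empty) character set means the text is treated as Latin-1 and converted;
// any declared character set means the bytes are passed through as UTF-8.
std::string iptcTextToUtf8(const std::string& raw, const std::string& characterSet)
{
    return characterSet.empty() ? latin1ToUtf8(raw) : raw;
}
```

With this rule, the Latin-1 bytes of "Ménard" (é stored as the single byte 0xE9) come out as the two-byte UTF-8 sequence 0xC3 0xA9 when no character set is declared, while text from a file that declares a character set is left untouched.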