Bug 324329 - IPTC encoding problem with non English language
Summary: IPTC encoding problem with non English language
Status: RESOLVED FIXED
Alias: None
Product: digikam
Classification: Applications
Component: Metadata-Iptc
Version: 3.3.0
Platform: Arch Linux Linux
Importance: NOR normal
Target Milestone: ---
Assignee: Digikam Developers
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2013-08-31 21:22 UTC by Alexis Ntounas
Modified: 2020-08-28 07:48 UTC

See Also:
Latest Commit:
Version Fixed In: 7.1.0


Description Alexis Ntounas 2013-08-31 21:22:50 UTC
If I add a Title and a Caption in Greek to a JPG image, the IPTC Object Name and Caption-Abstract fields appear as ???????. Any English characters I add to the Title or the Caption appear fine. I have checked the IPTC metadata with both exiftool and gwenview.

Reproducible: Always

Steps to Reproduce:
1. Add Title and Captions in a language other than English (Greek for sure) to an image
2. Apply changes so that metadata is written to the image
3. Open the image with exiftool or gwenview
Actual Results:  
Object Name and Caption-Abstract appear as ???????

Expected Results:  
Object Name and Caption-Abstract should be readable
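For illustration only: the symptom described above (one ? per Greek letter, while English letters survive) is what a lossy conversion to a character set without Greek letters would produce. The sketch below reproduces that effect in Python. It is an assumption about the mechanism, not digiKam code, and the sample title is made up.

```python
# Hypothetical illustration of the reported symptom; not digiKam code.
title = "Τίτλος"           # made-up Greek sample ("Title")
mixed = "Title " + title   # English and Greek together

# Encoding to a charset that has no Greek letters, with replacement
# enabled, substitutes '?' for every character that cannot be mapped.
as_latin1 = title.encode("latin-1", errors="replace")
mixed_latin1 = mixed.encode("latin-1", errors="replace")

# UTF-8 can represent every character, so the text round-trips intact.
as_utf8 = title.encode("utf-8")

print(as_latin1)      # only '?' bytes
print(mixed_latin1)   # the English part survives, the Greek part does not
print(as_utf8.decode("utf-8"))
```

If the written file really contains literal ? bytes, the original text is gone and no reader can recover it, which would be consistent with both exiftool and gwenview showing the same result.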
Comment 1 Alexis Ntounas 2013-08-31 21:30:13 UTC
Forgot to mention: I have some images whose tags I wrote in Greek using digiKam 3.1.0. The Greek metadata appears fine for those images in gwenview.
Comment 2 caulier.gilles 2013-08-31 21:31:25 UTC
IPTC is a very old standard that only supports ASCII characters and imposes string size limits.

XMP is the replacement, with no such limits. It supports UTF-8 encoding.

To summarize: use XMP instead...

Gilles Caulier
Comment 3 Alexis Ntounas 2013-08-31 21:41:45 UTC
I have also tried XMP. When I view the metadata in gwenview's Meta Information panel I get

Title: lang="x-default" My title here
Image Description: lang="x-default" My description here

Is the lang="x-default" appearing there a bug in gwenview, or is this the way it is supposed to appear?
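For context: in XMP, the title and description are language alternatives, where each entry carries a language qualifier and "x-default" marks the default entry. The output quoted above looks like such an entry printed together with its qualifier. As a rough sketch, a viewer that only wants the text could split the qualifier off as follows. The helper name split_lang_alt is hypothetical and does not come from gwenview, digiKam, or Exiv2.

```python
import re

# Matches a leading lang="..." qualifier in front of a text value,
# in the form quoted in the comment above.
_LANG_PREFIX = re.compile(r'^lang="([^"]*)"\s+(.*)$', re.DOTALL)


def split_lang_alt(value):
    """Return (language, text); language is None when no qualifier is present.

    Hypothetical helper for illustration only.
    """
    match = _LANG_PREFIX.match(value)
    if match is None:
        return None, value
    return match.group(1), match.group(2)


language, text = split_lang_alt('lang="x-default" My title here')
print(language)
print(text)
```

Whether the qualifier should be shown to the user is a display decision for the viewer; this sketch only shows that the two parts are easy to separate.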
Comment 4 caulier.gilles 2020-08-28 07:48:15 UTC
Git commit ad0ab9efeba6e2fe3bb86207a91499e4e8eb170f by Gilles Caulier.
Committed on 28/08/2020 at 05:19.
Pushed by cgilles into branch 'master'.

IPTC and UTF-8 support: if a tag is a string, check whether the global IPTC character set is null; if so, convert it as Latin-1, otherwise interpret the string as UTF-8.
We use the std::string accessor from Exiv2 to get a UTF-8 conversion of the string. If this does not work, the problem needs to be reported UPSTREAM
to Exiv2, as the library does not pre-convert the string in the background.
This patch prevents displaying a Latin-1 string with a wrong UTF-8 conversion, which can break some characters.
BUGS: 379581
BUGS: 379050
FIXED-IN: 7.1.0

M  +27   -3    core/libs/metadataengine/engine/metaengine_iptc.cpp

https://invent.kde.org/graphics/digikam/commit/ad0ab9efeba6e2fe3bb86207a91499e4e8eb170f
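The decoding rule stated in the commit message above can be sketched roughly as follows. This is a Python approximation under stated assumptions, not the C++ code from metaengine_iptc.cpp: the function name decode_iptc_string and its parameters are hypothetical, and the character set is passed in as raw bytes. The escape sequence ESC % G is the value IPTC uses in its coded character set record to declare UTF-8.

```python
# Escape sequence (ESC % G) that declares UTF-8 in the IPTC
# coded character set record.
UTF8_MARKER = b"\x1b%G"


def decode_iptc_string(raw, coded_charset=None):
    """Decode an IPTC string value.

    Hypothetical helper mirroring the rule in the commit message:
    no declared character set means Latin-1, otherwise UTF-8 is expected.
    """
    if not coded_charset:
        # No character set declared: every byte maps to one Latin-1 character.
        return raw.decode("latin-1")
    # A character set is declared: interpret the bytes as UTF-8 and
    # substitute a replacement character for any invalid sequence.
    return raw.decode("utf-8", errors="replace")


# Latin-1 bytes without a declared character set stay readable.
print(decode_iptc_string("café".encode("latin-1")))

# UTF-8 bytes with the UTF-8 marker decode to the original Greek text.
print(decode_iptc_string("Τίτλος".encode("utf-8"), UTF8_MARKER))
```

The point of the branch is to avoid running Latin-1 bytes through a UTF-8 decoder, which is the mismatch the commit message says can break characters.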