Bug 199148 - Iptc ascii versus unicode problem
Status: RESOLVED FIXED
Alias: None
Product: digikam
Classification: Applications
Component: Metadata-Iptc
Version: 1.0.0
Platform: unspecified Linux
Importance: NOR wishlist
Target Milestone: ---
Assignee: Digikam Developers
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2009-07-06 12:54 UTC by Philippe ROUBACH
Modified: 2020-08-28 07:41 UTC

See Also:
Latest Commit:
Version Fixed In: 7.1.0


Description Philippe ROUBACH 2009-07-06 12:54:53 UTC
Version:           1.0.0-beta2 (using 4.2.95 (KDE 4.2.95 (KDE 4.3 RC1)), Mandriva Linux release 2010.0 (Cooker) for i586)
Compiler:          gcc
OS:                Linux (i686) release 2.6.30.1-desktop-1mnb

iptc comments must be in ascii. ok

but when i use the right panel to add comments,
i type in unicode, which is the default setting on linux today

result: the iptc comment is not in ascii

same problem if i type the comment with the "iptc modify" tool: the iptc comment is not in ascii

please provide an automatic conversion for users whose language, like french, contains special characters such as é

you also need it to synchronize iptc and xmp

is there a batch tool to convert to ascii all my iptc comments ?

thanks
Comment 1 Mikolaj Machowski 2009-07-06 13:04:20 UTC
No. IPTC doesn't have to be in ASCII. According to the Metadata Working Group, strings should be re-encoded to UTF-8, with the encoding declared in the proper field (1:90, Coded Character Set).
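
To make the mechanism in this comment concrete, here is a minimal sketch of the IIM binary layout. Each standard dataset starts with the marker byte 0x1C, followed by the record number, the dataset number and a two-byte big-endian length. The Coded Character Set dataset, which the IIM specification numbers 1:90, carries the ISO 2022 escape sequence for UTF-8 (ESC % G), and the caption is stored in 2:120. The helper name `iim_dataset` is hypothetical; this is not digiKam or Exiv2 code.

```python
import struct

IIM_TAG_MARKER = 0x1C
UTF8_ESCAPE = b"\x1b%G"  # ISO 2022 escape sequence announcing UTF-8


def iim_dataset(record: int, dataset: int, value: bytes) -> bytes:
    """Serialize one standard (short-length) IIM dataset.

    Hypothetical helper for illustration only.
    """
    if len(value) > 0x7FFF:
        raise ValueError("extended-length datasets are not handled in this sketch")
    header = struct.pack(">BBBH", IIM_TAG_MARKER, record, dataset, len(value))
    return header + value


# 1:90 declares the encoding; 2:120 holds the caption bytes themselves.
charset = iim_dataset(1, 90, UTF8_ESCAPE)
caption = iim_dataset(2, 120, "été".encode("utf-8"))
```

A reader that honors 1:90 can then decode the caption bytes as UTF-8 instead of guessing.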
Comment 2 Philippe ROUBACH 2009-07-06 13:27:02 UTC
ok, but why is there a warning in the iptc modify tool saying that comments must be in ascii?

why does "ascii" precede the comments in the iptc block of the photo?

why do jalbum, photoweb and picasaweb display bad symbols instead of special characters such as é?
 
i asked the jalbum developer.
he says the comment is announced as ascii in the iptc block, so the comment is converted to unicode, and then é is displayed badly.
fortunately, in jalbum i can choose whether or not to convert an already-unicode comment to unicode.
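
The effect described by the jalbum developer can be reproduced in a few lines. This sketch assumes the reading application treats the stored bytes as Latin-1 (a common single-byte fallback; the thread only says "ascii"), and the function name is hypothetical.

```python
def misread_as_latin1(comment: str) -> str:
    """Simulate a reader that decodes UTF-8 bytes with a single-byte charset."""
    stored = comment.encode("utf-8")   # what the writing application puts in the file
    return stored.decode("latin-1")    # what a reader assuming Latin-1 displays


garbled = misread_as_latin1("é")
```

Each non-ASCII character becomes two wrong characters, because UTF-8 stores it as two bytes and the reader maps every byte to its own character.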
Comment 3 Mikolaj Machowski 2009-07-06 13:39:28 UTC
Because this is a recent agreement between companies: Adobe, Nokia, Sony, Canon, Microsoft.

In the past this issue wasn't clear, and many programs (most significantly Adobe products) treated everything as ASCII. But IPTC-IIM has tools for supporting non-ASCII encodings, and digiKam should use them.
Comment 4 caulier.gilles 2009-07-06 14:24:35 UTC
But but but :

XMP replaces IPTC and does not have the IPTC limitations: ASCII char encoding + string size!

I recommend to forget IPTC and use XMP instead.

Gilles
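
The two limitations mentioned here interact: a size limit counted in bytes is reached sooner when accented characters take two bytes each in UTF-8. The sketch below assumes a 2000-byte limit for the caption dataset (the figure given for 2:120 in the IIM specification, stated here as an assumption) and uses a hypothetical helper name.

```python
IPTC_CAPTION_MAX_BYTES = 2000  # assumption: byte limit of the IIM caption dataset


def truncate_utf8(text: str, max_bytes: int = IPTC_CAPTION_MAX_BYTES) -> bytes:
    """Encode text as UTF-8 and cut it to max_bytes without splitting a character."""
    data = text.encode("utf-8")
    if len(data) <= max_bytes:
        return data
    # Cutting may leave an incomplete multi-byte sequence at the end;
    # decoding with errors="ignore" drops that dangling tail.
    return data[:max_bytes].decode("utf-8", errors="ignore").encode("utf-8")


short = truncate_utf8("ééé", max_bytes=5)
```

XMP text properties have no such fixed limit, which is the point made in this comment.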
Comment 5 Philippe ROUBACH 2009-07-06 14:33:50 UTC
(In reply to comment #3)

so standards change.

what i understand :

in digikam 0.9.5, which i currently use, i have an ascii-tagged comment in the iptc block
in digikam 1.0.0, which i will use in the future, i have no ascii-tagged comment in
the iptc comment

how does the user manage this change when switching to 1.0.0?

and more: apps are not synchronized. i use digikam, jalbum, picasa
Comment 6 caulier.gilles 2009-07-06 14:41:48 UTC
Another argument in favor of XMP: multiple-language comment support. This is done in 1.0.0-beta2.

IPTC does not support that.

For me, trying to work around all the IPTC limitations to improve IPTC support in digiKam is a waste of time. XMP support is fully implemented...

Gilles Caulier
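
As an illustration of multiple-language comments: XMP stores such text as a list of alternatives, each tagged with a language code, plus an `x-default` entry. The sketch below models that as a plain dictionary; the function name, the fallback order and the sample texts are assumptions for illustration, not digiKam's behavior.

```python
def pick_comment(alternatives: dict, preferred: str) -> str:
    """Return the comment for `preferred`, falling back to the default entry."""
    if preferred in alternatives:
        return alternatives[preferred]
    # Fall back to an entry in the same base language, e.g. "fr-FR" for "fr-CA".
    language = preferred.split("-")[0]
    for tag, text in alternatives.items():
        if tag.split("-")[0] == language:
            return text
    return alternatives["x-default"]


comments = {"x-default": "Summer on the coast", "fr-FR": "Été sur la côte"}
```

A single IPTC caption field has no equivalent of the per-language tags.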
Comment 7 Philippe ROUBACH 2009-07-06 14:43:53 UTC
(In reply to comment #4)

i agree with you, the future is xmp
your solution is good: to work around the problem, use the xmp comment
but

jalbum uses the xmp comment, but for picasa i am not sure
Comment 8 caulier.gilles 2009-07-06 14:49:22 UTC
Recent Picasa versions support XMP, I think... I receive some test images from users to test.

Gilles Caulier
Comment 9 Mikolaj Machowski 2009-07-06 18:20:12 UTC
I think all "major" software (Lightroom, Aperture, FotoStation, Bridge) mainly uses XMP now, sometimes with syncing to IPTC-IIM[1]. But there are problems with "minor" software which, being cheap or free, is still immensely popular; there is also a problem with legacy photos which were commented with IPTC tags (including local encodings!). The latter should be fully readable in digiKam (no ?s); the former, like JAlbum here, should also be supported in some way. Background syncing with IPTC-IIM isn't a bad solution.

[1] FotoStation claims syncing with IPTC-IIM but does it in such a way that many programs aren't able to read those blocks...
Comment 10 Philippe ROUBACH 2009-07-06 18:43:42 UTC
(In reply to comment #8)
> recent Picasa version, support XMP i think... I receive some test images from
> users to test.
> 
> Gilles Caulier

i made a test with picasaweb and digikam 1.0.0b2

i uploaded a photo with a comment containing an é, and the é was replaced by Ã©

i have a doubt about picasa supporting xmp

i assume it supports previous iptc practice : ascii
Comment 11 Philippe ROUBACH 2009-07-10 10:58:24 UTC
(In reply to comment #6)
> For me, trying to wrap around all IPTC limitation to improve IPTC support in
> digiKam is a waste of time. XMP support is fully implemented...

with digikam 0.9.6 i have no problem with special characters (such as é) when uploading to picasaweb
with digikam 1.0.0 i have a problem with special characters (such as é) when uploading to picasaweb photos previously managed with 0.9.6 (yes, photos with a synchronized xmp comment)

when i switch to digikam 1.0.0 with mandriva 2010.0 in october,
how do i manage the problem of special characters in my photos?

i understand it's good for the workflow to switch from ascii to unicode, but
where is the tool for the user's transition?
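
No transition tool is named in this thread. Purely as a sketch of what one could do for a single string, the function below tries to undo the typical damage (UTF-8 bytes that were once decoded as Latin-1) and leaves the text untouched when that interpretation does not fit. The name `repair_mojibake` is hypothetical, and a real batch tool would also need to read and write the image metadata.

```python
def repair_mojibake(text: str) -> str:
    """Undo a UTF-8-read-as-Latin-1 mix-up; return the text unchanged if it does not apply."""
    try:
        # Map each character back to the byte it came from, then decode properly.
        return text.encode("latin-1").decode("utf-8")
    except UnicodeError:
        # Not representable in Latin-1, or not valid UTF-8: assume the text was fine.
        return text


fixed = repair_mojibake("Ã©tÃ©")
```

The check is a heuristic: correctly stored text such as "é" fails the UTF-8 decoding step and is returned as-is.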
Comment 12 caulier.gilles 2009-07-20 12:45:15 UTC
Philippe,

For me, working on IPTC UTF-8 support is a waste of time. Picasa/Flickr and other photo management programs MUST support XMP as well... Please report this at the right place in the closed-source world...

Gilles Caulier
Comment 13 caulier.gilles 2020-08-28 07:36:30 UTC
Git commit ad0ab9efeba6e2fe3bb86207a91499e4e8eb170f by Gilles Caulier.
Committed on 28/08/2020 at 05:19.
Pushed by cgilles into branch 'master'.

IPTC and UTF-8 support: if a tag is a string, check whether the global IPTC character set is null to convert it as Latin-1; otherwise we expect to interpret the string as UTF-8.
We use the std::string accessor from Exiv2 to get a UTF-8 conversion of the string. If it does not work, this problem needs to be reported UPSTREAM
to Exiv2, as pre-conversion of the string is not done in the background by the library.
This patch prevents displaying a Latin-1 string with a wrong UTF-8 conversion, which can break some characters.
BUGS: 379581
BUGS: 379050
FIXED-IN: 7.1.0

M  +27   -3    core/libs/metadataengine/engine/metaengine_iptc.cpp

https://invent.kde.org/graphics/digikam/commit/ad0ab9efeba6e2fe3bb86207a91499e4e8eb170f
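
The decision rule in the commit message can be sketched as follows. This is a Python illustration of the described logic only, not the C++ code from `metaengine_iptc.cpp`; the function name and the choice to replace undecodable bytes are assumptions.

```python
from typing import Optional


def decode_iptc_string(raw: bytes, coded_character_set: Optional[bytes]) -> str:
    """Decode an IPTC string according to the global character set declaration."""
    if not coded_character_set:
        # No declaration: treat the bytes as Latin-1, which never fails to decode.
        return raw.decode("latin-1")
    # A declaration is present: the commit message expects UTF-8 in this case.
    return raw.decode("utf-8", errors="replace")


UTF8_ESCAPE = b"\x1b%G"  # ISO 2022 escape sequence announcing UTF-8
legacy = decode_iptc_string(b"\xe9t\xe9", None)
modern = decode_iptc_string("été".encode("utf-8"), UTF8_ESCAPE)
```

Both calls yield the same readable text, which is what the patch aims for: legacy Latin-1 captions are no longer pushed through a UTF-8 conversion that would break their accented characters.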