Bug 438888

Summary: Exiftool - non ASCII characters
Product: [Applications] digikam Reporter: herb <herb.k>
Component: Metadata-ExifTool Assignee: Digikam Developers <digikam-bugs-null>
Status: RESOLVED FIXED    
Severity: normal CC: caulier.gilles, metzpinguin
Priority: NOR    
Version: 7.3.0   
Target Milestone: ---   
Platform: Microsoft Windows   
OS: Microsoft Windows   
Latest Commit: Version Fixed In: 7.3.0
Sentry Crash Report:
Attachments: attachment-19675-0.html

Description herb 2021-06-18 20:01:23 UTC
SUMMARY
I worked with digiKam 7.3.0 snapshot 17.6.2021 on a Windows 10 system.

I opened a test.jpg with Showfoto:
- the image is stored inside a directory with non-ASCII characters
  (German umlauts)
- the image contains Exif, IPTC and XMP metadata, and all of them contain strings
  with non-ASCII characters (Chinese characters)

In Showfoto I clicked on Metadata (in the right panel) and the Exiftool tab.
There I see that the file name/directory name is not displayed properly.
The IPTC metadata (with Chinese characters) are also not displayed properly.

In the Exiftool tab, Exif and XMP metadata are displayed properly.

By the way, please allow another question:
Why are the metadata displayed by Exiftool in "unconverted" format?


Comment 1 caulier.gilles 2021-06-19 06:16:08 UTC
Hi,

The Exif, Makernotes, IPTC, and XMP metadata views are based on the Exiv2 library, not ExifTool. Exiv2 has limitations compared to ExifTool. Exiv2 supports UTF-8.

The ExifTool metadata view renders all metadata supported by ExifTool. The way this view is populated is completely different, even if you find Exif, IPTC, and XMP information there.

ExifTool reports information to the metadata view through a JSON container. Data are encoded as UTF-8 by default, as the documentation states. It is strange to see badly encoded strings here.

Gilles Caulier
Comment 2 caulier.gilles 2021-06-19 06:17:11 UTC
Can you share a file with non-ASCII characters in its metadata so that we can double-check here?

Thanks in advance

Gilles Caulier
Comment 3 Maik Qualmann 2021-06-19 06:44:00 UTC
Git commit d6d333679747fffa63d22644244432d84def8753 by Maik Qualmann.
Committed on 19/06/2021 at 06:43.
Pushed by mqualmann into branch 'master'.

try with the charset command for ExifTool

M  +3    -0    core/libs/metadataengine/exiftool/exiftoolprocess.cpp

https://invent.kde.org/graphics/digikam/commit/d6d333679747fffa63d22644244432d84def8753
Comment 4 Maik Qualmann 2021-06-19 07:17:32 UTC
@Gilles
It's probably just about the file properties that are displayed in ASCII. We use local 8-bit encoding so that, for example, file paths with German special characters work. In the ExifTool list view, the special characters in these file names and paths are replaced with a "?".
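
One plausible mechanism for the "?" characters, sketched in Python. This is an assumption for illustration, not something confirmed in this report: the path is encoded with a local 8-bit code page (cp1252 is assumed here), and the resulting bytes are later interpreted as UTF-8. The function name and the sample directory name are hypothetical.

```python
def local8bit_read_as_utf8(name: str, codepage: str = "cp1252") -> str:
    """Encode a name with a local 8-bit code page, then decode it as UTF-8.

    Bytes that are not valid UTF-8 become U+FFFD, which many viewers
    render as "?" or a replacement glyph.
    """
    local_bytes = name.encode(codepage)          # e.g. "ü" becomes the single byte 0xFC
    return local_bytes.decode("utf-8", errors="replace")


# Hypothetical directory name with a German umlaut.
print(local8bit_read_as_utf8("Bilder_Müller"))

# Pure ASCII names are unaffected, because ASCII is a subset of both encodings.
print(local8bit_read_as_utf8("test.jpg"))
```

If this is what happens, passing the path as UTF-8 end to end would avoid the mismatch.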

Maik
Comment 5 caulier.gilles 2021-06-19 07:24:38 UTC
We use local 8 bits... hum where exactly ?

Gilles
Comment 6 Maik Qualmann 2021-06-19 07:26:04 UTC
> By the way, please allow another question:
> Why are the metadata displayed by Exiftool in "unconverted" format?

Do you mean that, for example, a lens name is not displayed with its real name? I see exactly this for my lens. At the moment, however, I cannot test whether it is displayed on the ExifTool command line.

Maik
Comment 7 Maik Qualmann 2021-06-19 07:27:47 UTC
(In reply to caulier.gilles from comment #5)
> We use local 8 bits... hum where exactly ?

We use it on Windows when we pass the file path. Tests with UTF8 and UTF-16 did not work.

Maik
Comment 8 herb 2021-06-19 07:58:46 UTC
Hello,

thanks to all for your investigations.

Please let me answer your question of comment 6 first
with an example of Exiftool output (in German)
Tag                    unconverted             converted
LensID                 0 14 10                 Olympus M.Zuiko Digital ...
ExposureMode           1                       Manuelle Belichtung
ExposureProgram        4                       Blendenautomatik
Orientation            1                       Horizontal (Normal)
GPSLongitude           11.8962133              11 53.7728

For me the "converted (by Exiftool)" output is much better to read.
Comment 9 herb 2021-06-19 08:08:04 UTC
Hello,

here a comment to Comment 1:
You wrote:
ExifTool report information by a JSON container to the metadata view. Typically, data are encoded to UTF8 by default, as we can read in the documentation.

As all Exif tag-values with unicode-characters are displayed properly in tab Exiftool, I guess you started Exiftool with option -charset exif=utf8.

As all IPTC tag-values with unicode-characters are NOT displayed properly in tab Exiftool, I guess you did not start Exiftool with option -charset iptc=utf8.
But this is just a guess.


You also wrote: "...report information by a JSON container..."
I hope you are aware that this container does not allow duplicate entries.
So if a tag occurs more than once in the metadata, only one occurrence is contained in the JSON output.
Comment 10 caulier.gilles 2021-06-19 09:30:44 UTC
> You also wrote: "...report information by a JSON container..."
> I hope you are aware that this container does not allow double entries.
> So in case of a tag is > 1 available inside the metadata, only one of them is
> contained in json-output.

The JSON container is generated by ExifTool. So I guess this point is already handled by ExifTool as well...

Gilles Caulier
Comment 11 caulier.gilles 2021-06-19 09:44:32 UTC
> As all Exif tag-values with unicode-characters are displayed properly in tab
> Exiftool, I guess you started Exiftool with option -charset exif=utf8.

No, but this must be done by default by ExifTool. There is no risk in adding this option.

> As all IPTC tag-values with unicode-characters are NOT displayed properly in tab Exiftool, I guess you did not start Exiftool with option -charset iptc=utf8.
> But this is just a guess.

Idem.

Pointer to code : 

https://invent.kde.org/graphics/digikam/-/blob/master/core/libs/metadataengine/exiftool/exiftoolprocess.cpp#L132

Gilles Caulier
Comment 12 caulier.gilles 2021-06-19 09:47:16 UTC
From ExifTool documentation:

"-charset [[TYPE=]CHARSET]

    If TYPE is ExifTool or not specified, this option sets the ExifTool character encoding for output tag values when reading and input values when writing, with a default of UTF8. If no CHARSET is given, a list of available character sets is returned."

So we must append "-charset UTF8" to the ExifTool CLI arguments. This will turn on UTF8 for all metadata parsed by ExifTool.
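
As a rough illustration of how such an argument list could be assembled, here is a minimal Python sketch. It is not digiKam's actual code (which lives in exiftoolprocess.cpp); the function name and the optional `iptc_charset` parameter are hypothetical. Only the ExifTool options `-json` and `-charset` are used, in the forms quoted in this report.

```python
from typing import List, Optional


def build_exiftool_args(image_path: str,
                        iptc_charset: Optional[str] = None) -> List[str]:
    """Build an ExifTool command line that requests JSON output in UTF-8.

    iptc_charset is a hypothetical knob: when given, it adds
    "-charset iptc=CHARSET" to override the assumed IPTC encoding.
    """
    args = ["exiftool", "-json", "-charset", "utf8"]
    if iptc_charset is not None:
        args += ["-charset", f"iptc={iptc_charset}"]
    args.append(image_path)
    return args


# Default: only the output encoding is set.
print(build_exiftool_args("test.jpg"))

# Forcing the IPTC encoding, as suggested by the reporter.
print(build_exiftool_args("test.jpg", iptc_charset="utf8"))
```

The list can be handed to a process launcher as is, which avoids shell quoting problems with non-ASCII paths.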

Gilles Caulier
Comment 13 caulier.gilles 2021-06-19 09:54:23 UTC
In the test image you shared by private email, the IPTC block is missing the character-encoding tag, something like this:

IPTC:CodedCharacterSet=utf8

https://exiftool.org/forum/index.php?topic=4052.0

Gilles Caulier
Comment 14 Maik Qualmann 2021-06-19 10:58:35 UTC
Which program did you use to add the tags? Windows Explorer? Then it should be UTF-16, which contradicts the metadata standard.

Maik
Comment 15 herb 2021-06-19 11:20:39 UTC
Hello,

@ Gilles:
You wrote: In your test image shared by private email, IPTC miss char encoding tag as something like that: IPTC:CodedCharacterSet=utf8

As far as I know this tag is not mandatory.

@ Maik:
To write tags into files I always used Exiftool.

Best regards
herb
Comment 16 herb 2021-06-19 11:27:42 UTC
Hello,

a comment to your comment 10:
The JSON container is generated by ExifTool. So I guess this point is already handled by ExifTool as well...

That is not always true. Depending on the options/parameters you send to Exiftool asking for tags and their values, Exiftool will build a json file. In case of double tags Exiftool does not include all, but only one, because of the json standard. 

For one detail please see (in Exiftool forum):
https://exiftool.org/forum/index.php?topic=11604.msg62230#msg62230
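
The consumer side of this limitation can be sketched in Python. The JSON text below is invented for illustration and is not real ExifTool output; it only shows that a JSON object parsed into a dictionary keeps a single value per key, and that a parser hook (here a hypothetical `collect_pairs`) is needed to notice repeated keys at all.

```python
import json

# Invented example text with a repeated key; not real ExifTool output.
raw = '{"Keywords": "first", "Keywords": "second", "Title": "x"}'

# Default parsing: the later value silently replaces the earlier one.
plain = json.loads(raw)
print(plain)


def collect_pairs(pairs):
    """Keep every value of a repeated key by gathering values into lists."""
    merged = {}
    for key, value in pairs:
        merged.setdefault(key, []).append(value)
    return merged


# With the hook, both "Keywords" values survive.
all_values = json.loads(raw, object_pairs_hook=collect_pairs)
print(all_values)
```

A hook like this only helps if the producer actually emits the repeated keys; if they are dropped before the JSON is written, the extra values are already gone.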

Best regards
herb
Comment 17 caulier.gilles 2021-06-19 11:31:29 UTC
This tag is _highly_ recommended for interoperability. If it's not present, the tag encoding is interpreted as... ASCII by default.
Comment 18 Maik Qualmann 2021-06-19 11:44:03 UTC
Here under Linux, ExifTool on the command line shows all entries of your test image that contain special characters incorrectly, regardless of whether they are XMP, EXIF or IPTC. Was it possibly tagged with an older, faulty version of ExifTool?

ExifTool Version Number         : 12.25
File Name                       : Photo_Metadata_IPTC_unicode.JPG
Directory                       : .
File Size                       : 4.6 MiB
File Modification Date/Time     : 2020:03:18 14:19:02+01:00
File Access Date/Time           : 2021:06:19 12:27:41+02:00
File Inode Change Date/Time     : 2021:06:19 12:27:40+02:00
File Permissions                : -rw-r--r--
File Type                       : JPEG
File Type Extension             : jpg
MIME Type                       : image/jpeg
Exif Byte Order                 : Little-endian (Intel, II)
Image Description               : All三山gemeine Beschreibung des Bildinhaltes
Make                            : OLYMPUS IMAGING CORP.
Camera Model Name               : E-PL6
Orientation                     : Horizontal (normal)
X Resolution                    : 350
Y Resolution                    : 350
Resolution Unit                 : inches
Software                        : 三山
Modify Date                     : 2019:01:19 11:51:16
Artist                          : artist1三山
Y Cb Cr Positioning             : Co-sited
Copyright                       : Copyright 三山 2020 Ego Michse
Exposure Time                   : 1/640
F Number                        : 9.0
Exposure Program                : Shutter speed priority AE
ISO                             : 200
Sensitivity Type                : Standard Output Sensitivity
Exif Version                    : 0230
Date/Time Original              : 2019:01:19 11:51:16
Create Date                     : 2020:04:16 08:09:10
Offset Time                     : +02:00
Offset Time Original            : +02:00
Offset Time Digitized           : +02:00
Components Configuration        : Y, Cb, Cr, -
Exposure Compensation           : -0.3
Max Aperture Value              : 4.0
Light Source                    : Unknown
Flash                           : On, Did not fire
Focal Length                    : 150.0 mm
Special Mode                    : Normal, Sequence: 0, Panorama: (none)
Camera ID                       : OLYMPUS DIGITAL CAMERA
Equipment Version               : 0100
Camera Type 2                   : E-PL6
Serial Number                   : V3PF01641
Internal Serial Number          : 4146312000064101
Focal Plane Diagonal            : 21.6 mm
Body Firmware Version           : 1.301
Lens Type                       : Olympus M.Zuiko Digital ED 40-150mm F4.0-5.6 R
Lens Serial Number              : ABJA20101
Lens Firmware Version           : 1.004
Max Aperture At Min Focal       : 4.0
Max Aperture At Max Focal       : 5.6
Min Focal Length                : 40
Max Focal Length                : 150
Max Aperture                    : 5.8
Lens Properties                 : 0xc140
Extender                        : None
Extender Serial Number          : 
Extender Model                  : 
Extender Firmware Version       : 0
Conversion Lens                 : 
Flash Type                      : None
Flash Model                     : None
Flash Firmware Version          : 0
Flash Serial Number             : 
Camera Settings Version         : 0100
Preview Image Valid             : Yes
Preview Image Start             : 4370283
Preview Image Length            : 486590
AE Lock                         : Off
Metering Mode                   : ESP
Exposure Shift                  : 0
Macro Mode                      : Off
Focus Mode                      : Single AF; S-AF, Imager AF
Focus Process                   : AF Used; 64
AF Search                       : Ready
AF Areas                        : (113,107)-(142,145)
AF Point Selected               : (49%,49%) (49%,49%)
AF Fine Tune                    : Off
AF Fine Tune Adj                : 0 0 0
Flash Mode                      : Fill-in, 2nd Curtain
Flash Exposure Comp             : 0
Flash Remote Control            : Off
Flash Control Mode              : Off; 0; 0; 0
Flash Intensity                 : n/a (x4)
Manual Flash Strength           : n/a (x4)
White Balance 2                 : Auto
White Balance Temperature       : Auto
White Balance Bracket           : 0 0
Custom Saturation               : 0 (min -5, max 5)
Modified Saturation             : Off
Contrast Setting                : 0 (min -5, max 5)
Sharpness Setting               : 0 (min -5, max 5)
Scene Mode                      : Standard
Noise Reduction                 : (none)
Distortion Correction           : Off
Shading Compensation            : Off
Compression Factor              : 4
Gradation                       : Normal; User-Selected
Picture Mode                    : Natural; 2
Picture Mode Saturation         : 0 (min -2, max 2)
Picture Mode Contrast           : 0 (min -2, max 2)
Picture Mode Sharpness          : 0 (min -2, max 2)
Picture Mode BW Filter          : n/a
Picture Mode Tone               : n/a
Noise Filter                    : Standard
Art Filter                      : Off; 0; 0; 0
Picture Mode Effect             : Standard
Tone Level                      : Highlights; 0; -7; 7; Shadows; 0; -7; 7; 0; 0; 0; 0
Art Filter Effect               : Off; 0; 0; Partial Color 0; No Effect; 0; No Color Filter; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0; 0
Drive Mode                      : Single Shot
Panorama Mode                   : Off
Image Stabilization             : On, Mode 1
Manometer Pressure              : 0 kPa
Manometer Reading               : 0 m, 0 ft
Extended WB Detect              : Off
Raw Dev Version                 : 0100
Raw Dev Exposure Bias Value     : 0
Raw Dev White Balance Value     : 0
Raw Dev WB Fine Adjustment      : 0
Raw Dev Gray Point              : 0 0 0
Raw Dev Saturation Emphasis     : 0 0 0
Raw Dev Memory Color Emphasis   : 0
Raw Dev Contrast Value          : 0 0 0
Raw Dev Sharpness Value         : 0 0 0
Raw Dev Color Space             : sRGB
Raw Dev Engine                  : High Speed
Raw Dev Noise Reduction         : (none)
Raw Dev Edit Status             : Original
Raw Dev Settings                : (none)
Image Processing Version        : 0112
WB RB Levels                    : 500 512 256 256
Color Matrix                    : 404 -116 -32 -62 368 -50 8 -116 364
Black Level 2                   : 255 255 255 255
Gain Base                       : 256
Crop Left                       : 8 0
Crop Top                        : 8 0
Crop Width                      : 4608
Crop Height                     : 3456
Sensor Calibration              : 4038 0
Noise Reduction 2               : (none)
Distortion Correction 2         : Off
Shading Compensation 2          : Off
Multiple Exposure Mode          : Off; 1
Aspect Ratio                    : 3:2
Aspect Frame                    : 0 0 4607 3071
Faces Detected                  : 0 0 0
Face Detect Area                : (Binary data 383 bytes, use -b option to extract)
Max Faces                       : 8 8 0
Face Detect Frame Size          : 0 0 0 0 0 0
Face Detect Frame Crop          : 0 0 0 0 0 0 0 0 0 0 0 0
Focus Info Version              : 0100
Scene Detect                    : 0
Zoom Step Count                 : 31
Focus Step Count                : 1130
Focus Step Infinity             : 0
Focus Step Near                 : 0
Focus Distance                  : 1.255 m
AF Point                        : Left (or n/a)
External Flash                  : Off
External Flash Bounce           : Bounce or Off
External Flash Zoom             : 0
Internal Flash                  : Off
Manual Flash                    : Off
Macro LED                       : Off
Sensor Temperature              : 36.8 C
User Comment                    : nocomment
Sub Sec Time                    : 123
Sub Sec Time Original           : 456
Sub Sec Time Digitized          : 789
Flashpix Version                : 0100
Color Space                     : sRGB
Exif Image Width                : 4608
Exif Image Height               : 3072
Interoperability Index          : R98 - DCF basic file (sRGB)
Interoperability Version        : 0100
File Source                     : Digital Camera
Custom Rendered                 : Normal
Exposure Mode                   : Manual
White Balance                   : Auto
Digital Zoom Ratio              : 1
Scene Capture Type              : Standard
Gain Control                    : Low gain up
Contrast                        : Normal
Saturation                      : Normal
Sharpness                       : Normal
Owner Name                      : 三山三山
Lens Info                       : 40-150mm f/4-5.6
Lens Model                      : OLYMPUS M.40-150mm F4.0-5.6 R
GPS Version ID                  : 2.3.0.0
GPS Altitude Ref                : Above Sea Level
GPS Time Stamp                  : 13:35:00
GPS Map Datum                   : CRS
GPS Date Stamp                  : 2020:04:23
Time Zone Offset                : -3 2
PrintIM Version                 : 0300
Compression                     : JPEG (old-style)
Thumbnail Offset                : 12326
Thumbnail Length                : 4175
Current IPTC Digest             : bc5c28da76274ca56a564a16bb61bb2b
Keywords                        : Schlüssel三山wort1, Schlüssel三山wort2
By-line                         : its me 三山, its 三山 you
By-line Title                   : Dev三山eloper
Contact                         : Ego三山 Michse
Writer-Editor                   : Ego三山 Michse
Application Record Version      : 4
Object Name                     : Be三山zeichnung des Objektes
Subject Reference               : subject:reference
Category                        : XXX
Special Instructions            : None三山
Time Created                    : 11:12:13+02:00
Sub-location                    : Motiv三山 in dieser Stadt
Province-State                  : Stadt三山 im Bundesland
Country-Primary Location Code   : GER
Country-Primary Location Name   : Deut三山schland
Copyright Notice                : Copy三山right 2020 by ME
Caption-Abstract                : Zu三山sammenfassung des Bildinhaltes
Copyright Flag                  : True
URL                             : http:byme.com
IPTC Digest                     : fe2570685b0e165bf2af73afaead142b
XMP Toolkit                     : Image::ExifTool 12.22
Country Code                    : de-DE
Creator City                    : Mü三山nchen
Creator Country                 : Bayern
Creator Address                 : mini三山 street 21
Creator Postal Code             : 81379
Creator Region                  : Oberbayern
Creator Work Email              : ego@web.de
Creator Work Telephone          : 123456
Creator Work URL                : http://hier.web.org
Intellectual Genre              : SUM
Location                        : genau三山 hier
Scene                           : 011900 Aktionsaufnahme
Subject Code                    : 14006000 Familie
Additional Model Information    : non三山e
Artwork Circa Date Created      : 2019
Artwork Content Description     : 三山content description
Artwork Contribution Description: co三山ntribution
Artwork Copyright Notice        : copy三山right by me 2019
Artwork Creator                 : Ego 三山 Michse
Artwork Creator ID              : 111三山
Artwork Copyright Owner ID      : 111
Artwork Copyright Owner Name    : Ego三山 Michse
Artwork Licensor ID             : 111
Artwork Licensor Name           : Ego Michse
Artwork Date Created            : 2020:05:25 10:10:00
Artwork Physical Description    : the m三山aterial
Artwork Source                  : made 三山 by me
Artwork Source Inventory No     : 123456
Artwork Source Inv URL          : http://my.domain.de
Artwork Style Period            : modern art三山
Artwork Title                   : a 三山short title
Digital Image GUID              : guid
Digital Source Type             : http://cv.iptc.org/newscodes/digitalsourcetype/digitalCapture
Event                           : Er三山eignis
Genre Cv Id                     : CvId
Genre Cv Term Id                : CvTermId
Genre Cv Term Name              : CvTerm三山Name
Genre Cv Term Refined About     : Cv三山About
Location Created City           : St三山adt
Location Created Country Code   : GER
Location Created Country Name   : Ba三山yern
Location Created Location Id    : my-id
Location Created Location Name  : Ho三山frunde
Location Created Province State : Bayern
Location Created Sublocation    : Hinterhof
Location Created World Region   : Europa
Location Created Identifier     : abc三山de
Location Shown City             : Stadt三山
Location Shown Country Code     : GER
Location Shown Country Name     : Ba三山yern
Location Shown Location Id      : my-id
Location Shown Location Name    : Hof三山runde
Location Shown Province State   : Bayern
Location Shown Sublocation      : Hin三山terhof
Location Shown World Region     : Euro三山pa
Location Shown Identifier       : ab三山cde
Max Avail Height                : 1000
Max Avail Width                 : 999
Model Age                       : 99
Organisation In Image Code      : image三山三山code
Organisation In Image Name      : name
Person In Image                 : Max三山imillian
Person In Image Cv Term Cv Id   : cvid
Person In Image Cv Term Id      : termid
Person In Image Cv Term Name    : term三山name
Person In Image Cv Term Refined About: about
Person In Image Description     : desc三山ription
Person In Image Id              : id
Person In Image Name            : na三山me
Product In Image Description    : Pro三山duct
Product In Image GTIN           : P-GTIN
Product In Image Name           : Product 三山 Name
Registry Item ID                : itemid
Registry Organisation ID        : orgi三山d
Contributor                     : Non三山e
Creator                         : Eg三山o Michse
Description                     : Beschreibung三山 wie Bildbeschreibung
Format                          : format
Rights                          : Copy三山right Ego Michse
Rights (de)                     : Alle三山 Rechte 2020 Ego Michse
Rights (en)                     : copyright三山 only
Subject                         : Schlüssel三山wort1, Schlüssel三山wort2
Title                           : Tite三山l wie Überschrift
GPS Differential                : No Correction
GPS Img Direction               : 13.1233998975934
GPS Img Direction Ref           : Magnetic North
Authors Position                : authors三山position:teacher
Caption Writer                  : caption三山writer
City                            : München三山
Country                         : Oberland三山
Credit                          : no credit
Date Created                    : 2020:01:02 11:10:09
Document Ancestors              : an三山cestors
Headline                        : Üb三山erschrift
Instructions                    : no instructions
Source                          : digital
State                           : Bayern三山
Transmission Reference          : ref
Urgency                         : 2
Adult Content Warning           : Not Required
Copyright Owner ID              : CO-ID
Copyright Owner Name            : CO-NA三山ME
Copyright Owner Image ID        : owne 三山-image-id
Copyright Registration Number   : re三山gistration-number
Copyright Status                : Protected
Credit Line Required            : Credit on Image
End User ID                     : userID
End User Name                   : userName
File Name As Delivered          : Name as deliverd
First Publication Date          : 2020:02:02
Image Alteration Constraints    : No Cropping
Image Creator ID                : icid
Image Creator Name              : ic三山name
Image Creator Image ID          : creat三山or-imageid
Image Duplication Constraints   : No Duplication Constraints
Image File Constraints          : Maintain File Name
Image File Format As Delivered  : Other
Image File Size As Delivered    : Up to 1 MB
Image Supplier ID               : IS-ID
Image Supplier Name             : IS-N三山AME
Image Supplier Image ID         : IS-IID
Image Type                      : Other
License End Date                : 2030:12:31
License ID                      : license id
License Start Date              : 2020:01:01
License Transaction Date        : 2020:01:01
Licensee ID                     : licensee-id
Licensee Name                   : licensee三山-name
Licensee Image ID               : licensee三山-image-id
Licensee Image Notes            : l-notes.l-notierung.l-notes
Licensee Project Reference      : referenc三山e1, reference2
Licensee Transaction ID         : trans三山action-id1, transaction-id2
Licensor City                   : l-city三山
Licensor Country                : l-country三山
Licensor Email                  : l-mail
Licensor Extended Address       : l-address三山
Licensor ID                     : l-id
Licensor Name                   : l-name三山
Licensor Postal Code            : l-pcode
Licensor Region                 : l-region
Licensor Street Address         : l-street三山
Licensor Telephone 1            : l-phone
Licensor URL                    : l-url
Licensor Image ID               : l-image-id
Licensor Notes                  : l-no三山te1.l-not三山ierung2
Minor Model Age Disclosure      : Age 25 or Over
Other Conditions                : other conditions
Other Image Info                : image info1.image-information2
Other License Info              : license info1.license information2
Other License Requirements      : req1.anforderung2
Reuse                           : Not Applicable
Terms And Conditions Text       : termsconditions
Terms And Conditions URL        : https://here.com
Creator Tool                    : creatortool
Label                           : Farb三山markierung
Metadata Date                   : 2020:05:24 00:00:00
Rating                          : -1
Document ID                     : 123三山abc
History Action                  : noaction
History Changed                 : nochange
History Instance ID             : id
History Parameters              : no params
History Software Agent          : agent
History When                    : 2020:05:24 00:00:00
Instance ID                     : instI三山DD
Original Document ID            : doc三山IDD
Certificate                     : 123abc456
Marked                          : True
Owner                           : ich三山
Usage Terms                     : usageterms
Web Statement                   : xxx
Image Width                     : 4608
Image Height                    : 3072
Encoding Process                : Baseline DCT, Huffman coding
Bits Per Sample                 : 8
Color Components                : 3
Y Cb Cr Sub Sampling            : YCbCr4:2:2 (2 1)
Aperture                        : 9.0
Blue Balance                    : 2
Image Size                      : 4608x3072
Megapixels                      : 14.2
Preview Image                   : (Binary data 486590 bytes, use -b option to extract)
Red Balance                     : 1.953125
Scale Factor To 35 mm Equivalent: 2.0
Shutter Speed                   : 1/640
Create Date                     : 2020:04:16 08:09:10.789+02:00
Date/Time Original              : 2019:01:19 11:51:16.456+02:00
Modify Date                     : 2019:01:19 11:51:16.123+02:00
Thumbnail Image                 : (Binary data 4175 bytes, use -b option to extract)
GPS Altitude                    : 578 m Above Sea Level
GPS Date/Time                   : 2020:04:23 13:35:00Z
GPS Latitude                    : 48 deg 12' 7.38" N
GPS Longitude                   : 11 deg 53' 46.37" E
Date/Time Created               : 2020:01:02 11:12:13+02:00
Extender Status                 : Not attached
GPS Latitude Ref                : North
GPS Longitude Ref               : East
Circle Of Confusion             : 0.015 mm
Depth Of Field                  : 0.017 m (1.247 - 1.263 m)
Field Of View                   : 6.0 deg (0.13 m)
Focal Length                    : 150.0 mm (35 mm equivalent: 300.5 mm)
GPS Position                    : 48 deg 12' 7.38" N, 11 deg 53' 46.37" E
Hyperfocal Distance             : 166.67 m
Lens ID                         : Olympus M.Zuiko Digital ED 40-150mm F4.0-5.6 R
Light Value                     : 14.7

Maik
Comment 19 herb 2021-06-19 12:49:17 UTC
Hello Maik,

Thanks to all for the investigations.

Of course it was tagged with an older version of Exiftool. It was done about 1 year ago. But I never heard that I used a faulty Exiftool.

In your output I see only IPTC-tags that are not displayed properly:
e.g.: Keywords, By-line, By-line Title or Contact     
As said in earlier comments: I guess option -charset iptc=utf8 is missing in the command you used.

As Gilles stated: IPTC:CodedCharacterSet is missing
But this tag is only recommended and not mandatory.
So in this case -charset for iptc is necessary.

But I can only guess, because I do not know the Exiftool command you used.

Thanks again
herb
Comment 20 caulier.gilles 2021-06-19 13:17:29 UTC
>I guess option -charset iptc=utf8 is missing in the command you used.

This will work in your case, as you know that your IPTC is UTF8 encoded.

But it is not universal. IPTC can use other encodings. This is why there is a tag dedicated to identifying the encoding for this container.

What happens if the IPTC of another image is not encoded as UTF8? The decoding will be broken if we force UTF8 with an ExifTool argument.

This is why the IPTC encoding tag is highly recommended. It identifies which encoding is used in an image. This is what we call interoperability.

Gilles Caulier
Comment 21 herb 2021-06-20 07:43:30 UTC
(In reply to caulier.gilles from comment #20)
> >I guess option -charset iptc=utf8 is missing in the command you used.
> 
> This will work in your case as you know that your IPTC is UTF8 encoded.
> 
> But it's not so far universal. IPTC can be encoded with other encoding. This
> is why there is a tag dedicated to identify the encoding for this container.
> 
> What happens if another image is not encoded as UTF8 in IPTC. The decoding
> will be broken if we force UTF8 with ExifTool argument.
> 
> This is why IPTC encoding tag is highly recommended. It identifies which
> encoding is used in one image. This is what we call interoperability.


Yes you are right.
But:
(1) Exiftool uses the following coding rule: (see FAQ 10):
The value of the IPTC:CodedCharacterSet tag determines how the internal IPTC string values are interpreted. If CodedCharacterSet exists and has a value of "UTF8" (or "ESC % G") then string values are assumed to be stored as UTF‑8. Otherwise the internal IPTC encoding is assumed to be Windows Latin1 (cp1252), but this can be changed with "-charset iptc=CHARSET". 

For me it is better to start Exiftool with -charset iptc=utf8 than using the above mentioned rule.

(2) In the "IPTC" tab of the metadata panel all tag values are displayed properly, so Exiv2 must know that they are UTF8 encoded. Where does this information come from?
Does Exiv2 really support all possible encodings given in CodedCharacterSet?
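
The FAQ 10 rule quoted in (1) can be sketched in Python as follows. This is a simplified illustration of the rule as quoted, not ExifTool's implementation: the function name and parameters are hypothetical, the escape sequence "ESC % G" is assumed to arrive as the three characters ESC, "%", "G", and the charset names are Python codec names rather than ExifTool's.

```python
from typing import Optional


def decode_iptc_string(raw: bytes,
                       coded_character_set: Optional[str],
                       iptc_charset: str = "cp1252") -> str:
    """Decode an IPTC string value following the rule quoted from FAQ 10.

    coded_character_set: value of IPTC:CodedCharacterSet, or None if absent.
    iptc_charset: fallback encoding, standing in for "-charset iptc=CHARSET"
                  (default is Windows Latin1, cp1252).
    """
    if coded_character_set in ("UTF8", "\x1b%G"):
        encoding = "utf-8"
    else:
        encoding = iptc_charset
    return raw.decode(encoding, errors="replace")


stored = "Schlüssel三山wort1".encode("utf-8")   # UTF-8 bytes, as in the test image

# Tag present: decoded correctly.
print(decode_iptc_string(stored, "UTF8"))

# Tag missing, default fallback: UTF-8 bytes misread as cp1252.
print(decode_iptc_string(stored, None))

# Tag missing, but fallback forced to UTF-8: decoded correctly.
print(decode_iptc_string(stored, None, iptc_charset="utf-8"))
```

The second call shows why the missing tag matters: the same bytes are valid in both encodings, so nothing fails, the text is just wrong.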

Best regards
herb
Comment 22 caulier.gilles 2021-06-20 08:01:09 UTC
Created attachment 139534 [details]
attachment-19675-0.html

> (2) In tab "IPTC" of metadata panel all tag values are displayed properly;
> so Exiv2 must know that they are UTF8 encoded. Who gives this information?
> Does Exiv2 really support all possible encoding given in CodedCharacterSet?

I think the string encoding is determined by a content parser, if possible.

But it's not guaranteed to work all the time...

Gilles Caulier
Comment 24 caulier.gilles 2021-06-20 08:09:39 UTC
The Windows build will be compiled and published tonight at the usual place. It will include this latest change.

Gilles Caulier
Comment 25 caulier.gilles 2021-06-21 05:13:05 UTC
Screenshot under Windows 7 of the ExifTool view with a directory using German umlauts. Note also that the IPTC character encoding is displayed properly.

https://i.imgur.com/Ehrk5gL.png

Gilles Caulier
Comment 26 Maik Qualmann 2021-06-21 05:56:50 UTC
Git commit e8e843c74bd7443a17a642a4972da08bc2c48ee3 by Maik Qualmann.
Committed on 21/06/2021 at 05:55.
Pushed by mqualmann into branch 'master'.

ExifTool use now UTF-8 file paths under Windows
FIXED-IN: 7.3.0

M  +0    -10   core/libs/metadataengine/exiftool/exiftoolparser_p.cpp

https://invent.kde.org/graphics/digikam/commit/e8e843c74bd7443a17a642a4972da08bc2c48ee3