Hello,
I am French, so my tags may contain accented letters. But accented letters are displayed as a “?” in a diamond. See <Screenshot 1 - Tags with accented letters replaced by a diamond "?"> in the attached PDF.
When I look at the metadata, accented letters are sometimes displayed correctly, but not always:
- IPTC: incorrect display (see <Screenshot 1 - Tags with accented letters replaced by a diamond "?">)
- XMP: correct display (see <Screenshot 2 – XMP displays correctly accented letters>)
But when I use another tool (XnView MP) to view the metadata of the same image, I don’t have this issue. See <Screenshot 3 & 4 – XnViewMP displays correctly accented letters>.
Remarks:
- I asked whether I was missing some configuration here: https://discuss.kde.org/t/tags-with-accent-are-not-displayed-correctly/35307/1 . But I didn't receive any answer.
- I attached a zip containing a sample image with accents in many metadata fields (Caption, Keywords, Object name and Supplemental category).
- I can also provide a DNG with the same issue.
Thanks for your help
Created attachment 182312: Archive with screenshots + a sample JPG
Hi Ludovic,
There is no reason for non-ASCII IPTC characters to be displayed incorrectly in digiKam. We have long supported the IPTC encoding tag that indicates the character set used in the IPTC metadata. XMP does not have this problem, as all of its metadata is always encoded in UTF-8.
How was the JPG file generated? From a DNG image? Which software was used?
Best regards
Gilles Caulier
Using digiKam 8.7.0 pre-release under macOS, the wrong IPTC character encoding is reproducible. But in fact the file contains no IPTC tag indicating which encoding to use, so digiKam uses ASCII by default.
From ExifTool: https://exiftool.org/TagNames/IPTC.html
"CodedCharacterSet string[0,32]! (values are entered in the form "ESC X Y[, ...]". The escape sequence for UTF-8 character coding is "ESC % G", but this is displayed as "UTF8" for convenience. Either string may be used when writing. The value of this tag affects the decoding of string values in the Application and NewsPhoto records. This tag is marked as "unsafe" to prevent it from being copied by default in a group operation because existing tags in the destination image may use a different encoding. When creating a new IPTC record from scratch, it is suggested that this be set to "UTF8" if special characters are a possibility)"
The last sentence of this tag description is clear: the tag must be included in the IPTC chunk, but it isn't...
Best regards
Gilles Caulier
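To make the "ESC % G" convention quoted above concrete, here is a minimal Python sketch of how a reader could interpret the raw CodedCharacterSet value. The function name is hypothetical and this is not digiKam, Exiv2 or ExifTool code. It recognises only the UTF-8 escape sequence and treats everything else, including a missing tag, as "encoding not declared".

```python
# Illustrative sketch only: interpret a raw IPTC CodedCharacterSet value.
# The function name is hypothetical; it is not part of any real library.

ESC = b"\x1b"
UTF8_ESCAPE = ESC + b"%G"  # "ESC % G", the escape sequence announcing UTF-8


def codec_from_coded_character_set(value):
    """Return 'utf-8' if the tag announces UTF-8, else None.

    `value` is the raw tag content as bytes, or None when the tag is absent.
    """
    if value == UTF8_ESCAPE:
        return "utf-8"
    # Tag absent, empty, or some other escape sequence: encoding not declared.
    return None


print(codec_from_coded_character_set(b"\x1b%G"))  # tag present, UTF-8 declared
print(codec_from_coded_character_set(None))       # tag missing, as in the sample file
```

When the function returns None, the reader has to pick a fallback encoding on its own, which is exactly the situation with the sample JPG.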
I can easily write accented characters to IPTC with digiKam (Exiv2 and ExifTool). I suspect that your IPTC characters were not encoded correctly by another program and that they are a Windows code page or something similar. That would explain why XnView displays them correctly under Windows. I'll investigate this further tonight. Maik
Hello,
Thanks all for your replies.
I use https://geosetter.de/en/main-en/ for tagging my photos, in the latest available version. Unfortunately this application is no longer maintained, and the beta version I use is about 2 years old.
I understand that there may be an issue in the IPTC tags. I tried to find it with the command:
> exiftool.exe -validate -warning -error -a "C:\Temp\Digikam_test_accents\20250530_crop.jpg"
But ExifTool doesn't really complain:
> Validate : 1 Warning (minor)
> Warning : [minor] MakerNotes:PreviewImageStart is past end of file
I have also tried several other tools to view image metadata:
- ExifToolGui: https://github.com/FrankBijnen/ExifToolGui/releases/
- Metadata++: https://www.logipole.com/download.htm
All of them managed to display non-ASCII letters.
I don't know if it helps, but I noticed that ExifToolGui seems to pass the argument "-CHARSET FILENAME=UTF8" when it retrieves IPTC metadata. Just for information, the full argument list is:
"-echo4 {ready16} -CHARSET FILENAME=UTF8 -v0 -overwrite_original -sep * -c %.6f° -API WindowsWideFile=1 -API WindowsLongPath=1 -API GeoDir=C:\Multimedia\ExifToolGUI\GeoLocation500 -g0:1 -a -S -Iptc:All 20250530_crop.jpg -execute16"
I have just found a workaround to my main issue (the fact that tags with "?" are generated): In Settings / Configure Digikam / Metadata / Advanced / Tags, I disabled Iptc.Application2.Keywords
Here's the output from ExifTool in the Linux console (generally UTF-8). As you can see, the same result. Image Description : C�est au tournant des 14 et 15� si�cles que Louis, duc d�Orl�ans (1372-1407) entreprend la construction du ch�teau de Pierrefonds. Il est l�un des �difices les plus imposants et imprenables de son �poque. Partiellement d�truit au 17� si�cle, il est restaur� au 19� si�cle � la demande de Napol�on III par Viollet-le-Duc. The problem is that it's not pure ASCII or UTF-8, but Windows Code Page encoding. Windows Code Page encoding has no place in metadata, though. Your previous program made a mistake here. Maik
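The replacement characters in the output above can be reproduced with a short Python sketch. It assumes the Windows code page in question is 1252 (the thread only says "Windows Code Page"), and the helper name is made up for illustration. Accented letters stored as single code page bytes are not valid UTF-8 sequences, so a UTF-8 reader substitutes U+FFFD for each of them.

```python
# Illustrative sketch: why code page bytes show up as U+FFFD in a UTF-8 reader.
# Assumption: the file was written with Windows code page 1252.

def as_read_by_utf8_viewer(text, stored_encoding="cp1252"):
    """Encode `text` the way the tagging tool is assumed to have stored it,
    then decode the bytes the way a UTF-8 terminal or viewer would."""
    raw = text.encode(stored_encoding)
    return raw.decode("utf-8", errors="replace")


print(as_read_by_utf8_viewer("château de Pierrefonds"))
print(as_read_by_utf8_viewer("17è siècle"))
```

Each accented letter comes back as a single replacement character, matching the pattern in the console output.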
Ludovic,
Following Maik's last comment, the idea would be to re-encode all tags with ExifTool, as your files are badly encoded with a Windows code page, not UTF-8 or ASCII. We cannot do anything about this in digiKam, I fear...
Best
Gilles Caulier
Sorry for the very late answer. It took me some time to understand what was happening, because the Windows world can be very strange...
Following this ExifTool Q&A https://exiftool.org/faq.html#Q18, I switched my command console to UTF-8 (chcp 65001), because without this even IPTC encoded in UTF-8 was not displayed correctly in the console.
Example with DSC02003.jpg, which is encoded in UTF-8 and is displayed correctly in digiKam:
>exiftool -iptc:all -charset filename=utf8 DSC02003.jpg
>Coded Character Set : UTF8
>Date Created : 2013:08:04
>Time Created : 12:27:11+00:00
>Country-Primary Location Name : France
>Country-Primary Location Code : FRA
>City : Sainte-Gemme
>Sub-location : La Ferme de Magn├®
>Province-State : Nouvelle-Aquitaine
>Keywords : France, La Ferme de Magn├®, Sainte-Gemme, Nouvelle-Aquitaine
With this prerequisite in place, I am now able to display correctly my pictures that were encoded in Latin-1.
Example with 20250530_P7387.jpg, which is not displayed correctly in digiKam:
>exiftool -iptc:all -charset iptc=latin1 20250530_P7387.jpg
>Keywords : France, Hauts-de-France, Pierrefonds, Ethan MARTIN, Frédérique MARTIN, Thais MARTIN
>By-line : Ludovic Martin
>Sub-location : Pierrefonds
>Province-State : Hauts-de-France
>Country-Primary Location Code : FRA
>Country-Primary Location Name : France
>Caption-Abstract : C’est au tournant des 14 et 15è siècles que Louis, duc d’Orléans (1372-1407) entreprend la construction du château de Pierrefonds. Il est l’un des édifices les plus imposants et imprenables de son époque. Partiellement détruit au 17è siècle, il est restauré au 19é siècle à la demande de Napoléon III par Viollet-le-Duc.
>Application Record Version : 4
>Time Created : 12:52:33+02:00
>Object Name : Château de Pierrefonds
After some more tests, I discovered that I don't need to specify the charset:
>exiftool -iptc:all 20250530_P7387.jpg
This command gives the same result as the previous one (with "-charset iptc=latin1").
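The "Magn├®" seen in the console output is the typical result of UTF-8 bytes being interpreted with a legacy console code page. A minimal Python sketch follows, under two assumptions that are not stated in the thread: that the stored value is "Magné", and that the console was using code page 850. The helper name is hypothetical.

```python
# Illustrative sketch: UTF-8 text rendered by a console using a legacy code page.
# Assumptions: the stored value is "Magné" and the console code page is 850.

def as_seen_in_console(text, console_codepage="cp850"):
    """Encode `text` as UTF-8 (as stored in the file) and decode the bytes
    the way a console set to `console_codepage` would display them."""
    return text.encode("utf-8").decode(console_codepage)


print(as_seen_in_console("La Ferme de Magné"))
```

The two UTF-8 bytes of "é" (0xC3 0xA9) are shown as two separate characters, which is why switching the console with chcp 65001 matters before judging how ExifTool decodes the file.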
In fact, I finally saw in the ExifTool documentation https://exiftool.org/exiftool_pod.html#Input-output-text-formatting, in the description of "-charset [[TYPE=]CHARSET]", that 'Latin' is the default value for IPTC when IPTC:CodedCharacterSet is not defined:
>Other values of TYPE listed below are used to specify the internal encoding of various meta information formats.
>TYPE       Description                                  Default
>---------  -------------------------------------------  -------
>EXIF       Internal encoding of EXIF "ASCII" strings    (none)
>ID3        Internal encoding of ID3v1 information       Latin
>IPTC       Internal IPTC encoding to assume when        Latin
>           IPTC:CodedCharacterSet is not defined
>Photoshop  Internal encoding of Photoshop IRB strings   Latin
>QuickTime  Internal encoding of QuickTime strings       MacRoman
>RIFF       Internal encoding of RIFF strings            0
Would it be possible for digiKam to use the same default as ExifTool?
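The defaulting behaviour being requested can be sketched in a few lines of Python. This is only an illustration of the idea, not digiKam, Exiv2 or ExifTool code. The function and parameter names are hypothetical, and it assumes that ExifTool's "Latin" corresponds to Windows code page 1252.

```python
# Illustrative sketch of "use the declared encoding, else fall back to Latin".
# Assumption: ExifTool's "Latin" default corresponds to cp1252.

def decode_iptc_string(raw, declared_codec=None, default_codec="cp1252"):
    """Decode raw IPTC string bytes.

    `declared_codec` is the codec announced by IPTC:CodedCharacterSet
    (for example "utf-8"), or None when the tag is not defined.
    """
    codec = declared_codec if declared_codec else default_codec
    return raw.decode(codec, errors="replace")


# No CodedCharacterSet and Latin bytes: the fallback restores the accents.
print(decode_iptc_string("Château de Pierrefonds".encode("cp1252")))
# CodedCharacterSet announces UTF-8: the declared codec wins.
print(decode_iptc_string("Château de Pierrefonds".encode("utf-8"), "utf-8"))
```

The trade-off of such a fallback is that files which contain undeclared UTF-8 would then be shown with Latin mojibake instead of replacement characters.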
To complete my previous answer: in the ExifTool FAQ https://exiftool.org/faq.html#Q10, I found the command to convert an image's IPTC from Latin to UTF-8:
>exiftool -tagsfromfile @ -iptc:all -codedcharacterset=utf8 20250530_P7387.jpg
But to be honest, I am a bit afraid of doing a mass conversion of all my photos... So my previous question still stands: would it be possible for digiKam to use the same default as ExifTool when IPTC:CodedCharacterSet is not defined?
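For anyone hesitant about a mass conversion, one cautious approach is to generate the commands first and review them before running anything. The Python sketch below only builds and prints the command from the FAQ for each file and never executes ExifTool. The helper names are hypothetical, and it assumes `exiftool` is the executable name on the search path.

```python
# Illustrative dry-run sketch: build the FAQ conversion command per file
# and print it for review. Nothing is executed here.
import shlex


def build_conversion_command(image_path):
    """Return the argument list of the Latin -> UTF-8 IPTC conversion."""
    return [
        "exiftool",
        "-tagsfromfile", "@",        # copy tags from the file itself
        "-iptc:all",                 # rewrite all IPTC tags
        "-codedcharacterset=utf8",   # and declare UTF-8 as the IPTC encoding
        image_path,
    ]


def dry_run(image_paths):
    """Return one printable command line per image, without running it."""
    return [shlex.join(build_conversion_command(p)) for p in image_paths]


for line in dry_run(["20250530_P7387.jpg"]):
    print(line)
```

Unless -overwrite_original is given, ExifTool keeps the unmodified file next to the converted one with an "_original" suffix, so a test run on a copy of a few photos can be checked in digiKam before touching the whole collection.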
Maik,
Could you consider implementing in digiKam the same default as ExifTool for IPTC metadata without a charset (https://exiftool.org/exiftool_pod.html#Input-output-text-formatting)?
I use Luminar Neo to process my photos and I cannot force it to write IPTC in UTF-8. It would be really painful if I had to "fix"(1) all pictures exported by Luminar.
(1) It doesn't seem to be a bug, at least from ExifTool's point of view. And, in fact, all the tools I tested for displaying IPTC behaved like ExifTool.
Many thanks for your help.
Changing the character set in ExifTool would only affect the ExifTool metadata viewer. The actual IPTC metadata internally and in the IPTC metadata viewer would not change, as we use Exiv2 internally. Maik