385015 – Okular does not show XMP metadata for PDFs

Status:	CONFIRMED

Alias:	None

Product:	okular
Classification:	Applications
Component:	general
Version:	unspecified
Platform:	Other Linux

Importance:	NOR normal
Target Milestone:	---
Assignee:	Okular developers

Reported:	2017-09-24 08:44 UTC by Timo Kalliomäki
Modified:	2022-12-04 21:49 UTC
CC List:	3 users

Attachments
A PDF file with title “An example of metadata in PDF” in both XMP and PDF info, and author “John Doe” in only the former (21.38 KB, application/pdf) 2017-09-24 17:59 UTC, Timo Kalliomäki

Description Timo Kalliomäki 2017-09-24 08:44:13 UTC

For a PDF file with the XMP

...
<dc:title><rdf:Alt><rdf:li xml:lang="x-default">A document</rdf:li></rdf:Alt></dc:title>
<dc:creator><rdf:Seq><rdf:li>John Doe</rdf:li></rdf:Seq></dc:creator>
...

And the PDF info dictionary (with no /Author)

... /Title(A\040document) ...

Okular's “Properties” view displays the title, but not the author. It appears that the properties view reads only the PDF info dictionary, not the XMP metadata. This seems to contradict the PDF/A standard, which states that when XMP metadata is present, the corresponding PDF info entries are optional and, if present, must match the XMP metadata.
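
To illustrate the behaviour being requested, here is a minimal Python sketch, not Okular's actual code. It reads dc:title and dc:creator from an XMP packet and uses them to fill in fields the info dictionary lacks. The rdf:RDF / rdf:Description wrapper around the fragment quoted above, the function names, and the choice to let the info dictionary win when both sources have a value are all assumptions made for the example.

```python
import xml.etree.ElementTree as ET

NS = {
    "rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
    "dc": "http://purl.org/dc/elements/1.1/",
}

# Hypothetical minimal packet: the two dc: elements are from the report,
# the surrounding rdf:RDF / rdf:Description wrapper is assumed.
SAMPLE_XMP = """<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:dc="http://purl.org/dc/elements/1.1/">
  <rdf:Description rdf:about="">
    <dc:title><rdf:Alt><rdf:li xml:lang="x-default">A document</rdf:li></rdf:Alt></dc:title>
    <dc:creator><rdf:Seq><rdf:li>John Doe</rdf:li></rdf:Seq></dc:creator>
  </rdf:Description>
</rdf:RDF>"""


def dublin_core_fields(xmp_xml):
    """Return the first dc:title alternative and all dc:creator entries."""
    root = ET.fromstring(xmp_xml)
    title = root.find(".//dc:title/rdf:Alt/rdf:li", NS)
    creators = root.findall(".//dc:creator/rdf:Seq/rdf:li", NS)
    return {
        "title": title.text if title is not None else None,
        "creators": [c.text for c in creators],
    }


def merged_properties(info_dict, xmp_xml):
    """Copy the info dictionary, then fill missing Title/Author from XMP."""
    props = dict(info_dict)
    dc = dublin_core_fields(xmp_xml)
    if "Title" not in props and dc["title"]:
        props["Title"] = dc["title"]
    if "Author" not in props and dc["creators"]:
        props["Author"] = ", ".join(dc["creators"])
    return props


# Info dictionary from the report: /Title present, no /Author.
print(merged_properties({"Title": "A document"}, SAMPLE_XMP))
```

With the info dictionary from this report, the merged result would carry both the title and the author “John Doe”.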

Comment 1 Albert Astals Cid 2017-09-24 17:46:03 UTC

Please attach a PDF where this problem happens.

Comment 2 Timo Kalliomäki 2017-09-24 17:59:03 UTC

Created attachment 107991
A PDF file with title “An example of metadata in PDF” in both XMP and PDF info, and author “John Doe” in only the former

exiftool shows:
Title                           : An example of metadata in PDF
Creator                         : John Doe

Okular shows only the title, not the author.

Comment 3 Timo Kalliomäki 2017-09-24 18:00:13 UTC

Supplied the requested sample PDF.

Comment 4 Albert Astals Cid 2017-09-26 21:00:03 UTC

This needs work

Comment 5 Paul Millar 2022-12-04 21:49:39 UTC

At the risk of pointing out the obvious, XMP is a standard way of expressing metadata that may be embedded in more than just PDF. 

https://en.wikipedia.org/wiki/Extensible_Metadata_Platform

Of the file formats Okular supports, TIFF, JPEG, PNG, GIF and WebP support embedding XMP metadata.  In addition, XMP metadata might be provided as a sidecar metadata file; e.g., the file "my-file.xmp" for the main file "my-file.cbr".
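
As a small illustration of the sidecar naming in the example above, the following Python sketch derives the sidecar path by swapping the file extension for ".xmp". The function name is hypothetical, and other naming schemes may exist; this only covers the "my-file.cbr" to "my-file.xmp" case mentioned here.

```python
from pathlib import Path


def sidecar_path(main_file):
    """Return the assumed sidecar location: same name, '.xmp' extension."""
    return Path(main_file).with_suffix(".xmp")


print(sidecar_path("my-file.cbr"))
```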

To me, this suggests that a generic XMP viewer might be useful, as it could support multiple file formats.  The PDF plugin would support extracting the XMP metadata, and other file-format-specific plugins would do likewise.

It might even make sense for this support to be library code, as other software needs XMP support.  As an example, the DigiKam project has (or claims to have) support for XMP:

https://userbase.kde.org/Digikam/Metadata

My own interest is in viewing the complete XMP metadata information, rather than a limited subsection of it.  (I think Timo is interested only in the Dublin Core terms.)  Viewing the XMP metadata simply as (pretty-printed) "raw" XML would probably be sufficient for me.
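
A rough Python sketch of that "raw XML" view, offered only as an illustration: it scans a file's bytes for the <?xpacket ...?> wrapper and pretty-prints whatever lies between the markers. It assumes the packet is stored uncompressed and UTF-8 encoded, which is not guaranteed for every file; the function names are hypothetical.

```python
import re
from xml.dom import minidom

# Matches the packet body between the xpacket begin and end markers.
XPACKET = re.compile(
    rb"<\?xpacket begin=.*?\?>(.*?)<\?xpacket end=.*?\?>", re.DOTALL
)


def find_xmp_packet(data):
    """Return the first XMP packet body in data as text, or None."""
    match = XPACKET.search(data)
    if match is None:
        return None
    # Assumption: the packet is UTF-8; XMP also permits UTF-16 and UTF-32.
    return match.group(1).decode("utf-8").strip()


def pretty_xmp(xmp_xml):
    """Re-indent the XMP XML for display."""
    return minidom.parseString(xmp_xml).toprettyxml(indent="  ")


# Hypothetical input standing in for the bytes of a real file.
data = (
    b"%PDF-1.4 ... <?xpacket begin='' id='W5M0MpCehiHzreSzNTczkc9d'?>"
    b"<x:xmpmeta xmlns:x='adobe:ns:meta/'><a>1</a></x:xmpmeta>"
    b"<?xpacket end='w'?> ..."
)
packet = find_xmp_packet(data)
if packet is not None:
    print(pretty_xmp(packet))
```

A byte scan like this is format-agnostic, so the same helper could in principle be pointed at a JPEG, a TIFF, or a sidecar file, provided the packet is stored uncompressed.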