Bug 385015 - Okular does not show XMP metadata for PDFs
Summary: Okular does not show XMP metadata for PDFs
Status: CONFIRMED
Alias: None
Product: okular
Classification: Applications
Component: general
Version: unspecified
Platform: Other Linux
Importance: NOR normal
Target Milestone: ---
Assignee: Okular developers
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2017-09-24 08:44 UTC by Timo Kalliomäki
Modified: 2022-12-04 21:49 UTC

See Also:
Latest Commit:
Version Fixed In:
Sentry Crash Report:


Attachments
A PDF file with title “An example of metadata in PDF” in both XMP and PDF info, and author “John Doe” in only the former (21.38 KB, application/pdf)
2017-09-24 17:59 UTC, Timo Kalliomäki

Description Timo Kalliomäki 2017-09-24 08:44:13 UTC
For a PDF file with the XMP

...
<dc:title><rdf:Alt><rdf:li xml:lang="x-default">A document</rdf:li></rdf:Alt></dc:title>
<dc:creator><rdf:Seq><rdf:li>John Doe</rdf:li></rdf:Seq></dc:creator>
...

And the PDF info dictionary (with no /Author)

... /Title(A\040document) ...

Okular's “Properties” dialog displays the title, but not the author. It appears that the dialog reads only the PDF info dictionary, not the XMP metadata. This seems to contradict the PDF/A standard, which states that when XMP metadata is present, the corresponding PDF info entries are optional and, if present, must match the XMP metadata.
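
To illustrate the expected behaviour, here is a minimal sketch (not Okular's actual code) that reads the Dublin Core title and creator from an XMP packet and uses them to fill in fields missing from the PDF info dictionary. The function names, the `Title`/`Author` keys, and the rule that info entries win when both are present are assumptions made for this example; the XMP string is a completed version of the elided snippet above.

```python
import xml.etree.ElementTree as ET

# Standard namespace URIs for RDF and Dublin Core.
NS = {
    "rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
    "dc": "http://purl.org/dc/elements/1.1/",
}

# Hypothetical, minimal XMP body wrapping the two elements from the report.
XMP = """<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:dc="http://purl.org/dc/elements/1.1/">
  <rdf:Description rdf:about="">
    <dc:title><rdf:Alt><rdf:li xml:lang="x-default">A document</rdf:li></rdf:Alt></dc:title>
    <dc:creator><rdf:Seq><rdf:li>John Doe</rdf:li></rdf:Seq></dc:creator>
  </rdf:Description>
</rdf:RDF>"""


def read_dublin_core(xmp_xml):
    """Return the dc:title and dc:creator values found in an XMP string."""
    root = ET.fromstring(xmp_xml)
    result = {}
    title = root.find(".//dc:title/rdf:Alt/rdf:li", NS)
    if title is not None and title.text:
        result["Title"] = title.text
    creators = [li.text for li in root.findall(".//dc:creator/rdf:Seq/rdf:li", NS) if li.text]
    if creators:
        result["Author"] = ", ".join(creators)
    return result


def merged_properties(info_dict, xmp_xml):
    """Start from the PDF info entries and fill gaps from XMP (assumed policy)."""
    props = dict(info_dict)
    for key, value in read_dublin_core(xmp_xml).items():
        props.setdefault(key, value)
    return props


# Info dictionary from the report: a /Title but no /Author.
print(merged_properties({"Title": "A document"}, XMP))
```

With this policy, the author “John Doe” would appear in the properties even though the info dictionary has no /Author entry.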
Comment 1 Albert Astals Cid 2017-09-24 17:46:03 UTC
Please attach a PDF where this problem happens.
Comment 2 Timo Kalliomäki 2017-09-24 17:59:03 UTC
Created attachment 107991
A PDF file with title “An example of metadata in PDF” in both XMP and PDF info, and author “John Doe” in only the former

exiftool shows:
Title                           : An example of metadata in PDF
Creator                         : John Doe

Okular shows only the title, not the author.
Comment 3 Timo Kalliomäki 2017-09-24 18:00:13 UTC
Supplied the requested sample PDF.
Comment 4 Albert Astals Cid 2017-09-26 21:00:03 UTC
This needs work
Comment 5 Paul Millar 2022-12-04 21:49:39 UTC
At the risk of pointing out the obvious, XMP is a standard way of expressing metadata that may be embedded in more than just PDF. 

https://en.wikipedia.org/wiki/Extensible_Metadata_Platform

Of the file formats Okular supports, TIFF, JPEG, PNG, GIF and WebP support embedding XMP metadata.  In addition, XMP metadata might be provided as a sidecar metadata file; e.g., the file "my-file.xmp" for the main file "my-file.cbr".
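
As a rough sketch of the sidecar case, the lookup could simply swap the main file's extension for ".xmp", matching the "my-file.cbr" example. The helper names below are hypothetical, and other naming conventions for sidecar files may exist that this does not cover.

```python
from pathlib import Path


def sidecar_path(main_file):
    """Return the sidecar name for a file, e.g. my-file.cbr -> my-file.xmp."""
    return Path(main_file).with_suffix(".xmp")


def find_sidecar(main_file):
    """Return the sidecar path if that file exists on disk, otherwise None."""
    candidate = sidecar_path(main_file)
    return candidate if candidate.is_file() else None


print(sidecar_path("my-file.cbr"))
```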

To me, this suggests that a generic XMP viewer might be useful, as it could support multiple file formats.  The PDF plugin would support extracting the XMP metadata, and other file-format-specific plugins would do likewise.

It might even make sense for this support to be library code, as other software needs XMP support.  As an example, the DigiKam project has (or claims to have) support for XMP:

https://userbase.kde.org/Digikam/Metadata

My own interest is in viewing the complete XMP metadata information, rather than a limited subsection of it.  (I think Timo is interested only in the Dublin Core terms.)  Viewing the XMP metadata simply as (pretty-printed) "raw" XML would probably be sufficient for me.
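
For what it is worth, the "raw" view could be as simple as re-indenting the XMP packet. The sketch below uses Python's standard `xml.dom.minidom` only to show the idea; the `pretty_xmp` name and the sample packet are made up for the example, and it says nothing about how Okular would implement this.

```python
from xml.dom import minidom

# Hypothetical single-line XMP body, as it might be stored in a file.
RAW = (
    '<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" '
    'xmlns:dc="http://purl.org/dc/elements/1.1/">'
    '<rdf:Description rdf:about="">'
    "<dc:creator><rdf:Seq><rdf:li>John Doe</rdf:li></rdf:Seq></dc:creator>"
    "</rdf:Description></rdf:RDF>"
)


def pretty_xmp(xmp_xml):
    """Return the XMP XML re-indented, one element per line."""
    pretty = minidom.parseString(xmp_xml).toprettyxml(indent="  ")
    # Drop blank lines that appear when the input already contains whitespace.
    return "\n".join(line for line in pretty.splitlines() if line.strip())


print(pretty_xmp(RAW))
```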