Bug 478729 - Does not show Text of Apple-edited PDFs
Status: RESOLVED UPSTREAM
Alias: None
Product: okular
Classification: Applications
Component: PDF backend
Version: 24.01.80
Platform: Arch Linux Linux
Importance: NOR normal
Target Milestone: ---
Assignee: Okular developers
 
Reported: 2023-12-19 15:51 UTC by dorla.hutch
Modified: 2024-02-10 10:41 UTC

Attachments
one page of pdf without visible text (494.90 KB, application/pdf)
2023-12-19 15:51 UTC, dorla.hutch
the same PDF page when Okular was used, not broken (943.14 KB, application/pdf)
2024-02-05 11:02 UTC, dorla.hutch

Description dorla.hutch 2023-12-19 15:51:17 UTC
Created attachment 164293
one page of pdf without visible text

SUMMARY
When the PDF is opened, the hand-written annotations (apparently from an Apple tablet device) are visible, but the original text is not (it is all white).
The same happens in Firefox, whereas Chrome and the renderer that Dolphin uses behave differently.

STEPS TO REPRODUCE
1. Annotate a PDF with a modern Apple tablet device
2. Open the PDF in Okular or Firefox

OBSERVED RESULT
Hand-written annotations are shown; everything else (the text) is not

EXPECTED RESULT
Hand-written annotations are shown with everything else

SOFTWARE/OS VERSIONS
Linux/KDE Plasma: Garuda Linux
KDE Plasma Version: 5.27.10
KDE Frameworks Version: 5.113.0
Qt Version: 5.15.11

ADDITIONAL INFORMATION
Comment 1 Sune Vuorela 2023-12-20 08:16:29 UTC
Thank you for your report.

It would also be helpful if you could show a screenshot of the file shown on a device where it is shown as expected.

I'm also curious if you can provide the 'original' pdf before annotations so we have that for comparison.
Comment 2 Sune Vuorela 2023-12-20 19:04:45 UTC
I did investigate the file a bit, and it is quite a bad file.

I don't know if it is so bad that we shouldn't be able to recover from it, but it looks like we at least don't fully do it.

This is a bit PDF-technical, but the file basically announces an object of 360 bytes, and then the object is actually 473 bytes long, and similar things; those are the most obvious errors.
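
One possible reading of the comment above is that a stream's declared /Length entry disagrees with the number of bytes actually present before the endstream keyword. That reading is an assumption; the comment does not say which objects are affected. Under that assumption, a minimal Python sketch for spotting such mismatches could look like the following. The names `Mismatch` and `find_length_mismatches` are hypothetical, the scan is a plain byte-level heuristic (it ignores indirect /Length references, object streams, and stream data that happens to contain the keywords), and it is not how Okular or poppler parse files.

```python
import re
from typing import Iterator, NamedTuple


class Mismatch(NamedTuple):
    """Hypothetical record for one suspicious stream."""
    offset: int    # byte offset where the stream data starts
    declared: int  # value of /Length in the stream dictionary
    actual: int    # bytes found between "stream" and "endstream"


# "stream" followed by an end-of-line, but not the tail of "endstream".
_STREAM = re.compile(rb"(?<!end)stream(\r\n|\n)")
# A direct integer /Length; skips indirect references such as "/Length 12 0 R".
_LENGTH = re.compile(rb"/Length\s+(\d+)\b(?!\s+\d+\s+R)")


def find_length_mismatches(data: bytes) -> Iterator[Mismatch]:
    """Yield streams whose declared /Length differs from the bytes present."""
    for match in _STREAM.finditer(data):
        start = match.end()
        end = data.find(b"endstream", start)
        if end == -1:
            continue  # no closing keyword; nothing to measure against

        # Heuristic: the stream dictionary sits between the nearest
        # preceding "obj" keyword and the "stream" keyword.
        obj_pos = data.rfind(b"obj", 0, match.start())
        header = data[max(obj_pos, 0):match.start()]
        length = _LENGTH.search(header)
        if length is None:
            continue  # missing or indirect /Length; not handled here

        body = data[start:end]
        # One end-of-line marker before "endstream" is not stream data.
        if body.endswith(b"\r\n"):
            body = body[:-2]
        elif body.endswith((b"\n", b"\r")):
            body = body[:-1]

        declared = int(length.group(1))
        if declared != len(body):
            yield Mismatch(start, declared, len(body))
```

Reading a file in binary mode and passing its bytes to `find_length_mismatches` would list candidate streams to inspect by hand. A hit only marks a spot worth looking at; it does not prove that the viewer fails on that object.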
Comment 3 dorla.hutch 2024-02-05 11:02:18 UTC
Created attachment 165561
the same PDF page when Okular was used, not broken

Neither of the two annotated PDFs has annotations on this last page.
Comment 4 dorla.hutch 2024-02-05 11:04:47 UTC
Sorry for the delay in response.

I can open the broken PDF in Chrome, and the full PDF document is shown correctly. When I try it with the attached one, it at least shows what it looked like. The page originally contained a list of scientific references at the end of my master's thesis proposal. I was unsure whether I am allowed to share it openly, so I cut out the last page using PDF24. Apparently, Apple fucked it up, because why else would PDF24 cut out a page with such garbled text? If Apple users want to use PDF24 on the web for editing tasks, they would be surprised.

I have added the last page of the same proposal PDF after it was edited with Okular by the other supervisor.
Comment 5 Albert Astals Cid 2024-02-10 10:41:44 UTC
FWIW Adobe Reader on Windows can't render that file either, it's really broken.

Can it be rendered? Possibly; mupdf seems to do a relatively good job of recovering from the brokenness of the file.

Anyhow, I am closing the bug because Okular can't do anything here. If the file were to be rendered better or recovered from its brokenness, that would be the responsibility of poppler (the component we use for rendering PDF files).

Please open a bug at https://gitlab.freedesktop.org/poppler/poppler/-/issues/new?issue and attach both the broken and the original file. If the poppler developers feel this is something they can fix or improve, they will.