Bug 467328 - Okular mismanages fonts embedded in PDF document when printing
Status: ASSIGNED
Alias: None
Product: okular
Classification: Applications
Component: PDF backend
Version: 22.12.2
Platform: Other Linux
Importance: NOR normal
Target Milestone: ---
Assignee: Okular developers
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2023-03-14 13:11 UTC by Sergio
Modified: 2023-03-16 14:11 UTC
CC List: 2 users

See Also:
Latest Commit:
Version Fixed In:


Attachments
Test document (33.42 KB, application/pdf)
2023-03-14 13:12 UTC, Sergio
Screenshot of print preview (95.86 KB, image/png)
2023-03-14 13:14 UTC, Sergio

Description Sergio 2023-03-14 13:11:39 UTC
I am having trouble using Okular: it displays documents correctly, but corrupts them when printing.

This happens only with certain PDF documents that embed fonts. A typical case is a document generated by LibreOffice using the Source Sans 3 OTF font from Adobe. When I open these PDF files in Okular, they display correctly. However, when I try to print them, the font becomes unreadable. If I send the same file directly to the printer using the `lp` command, it prints just fine.

Interestingly, the *print preview* produced by Okular also shows the font breakage.

I am attaching a test document to be used with the following steps to reproduce.

STEPS TO REPRODUCE
1. Open the test document, see it is shown correctly on screen
2. File->Print preview, in the preview the font is unreadable
3. File->Print, in the printout the font is unreadable

OBSERVED RESULT

In the print preview and in the printout the font is unreadable

EXPECTED RESULT

The printout should match what is displayed on screen

SOFTWARE/OS VERSIONS

Operating System: Manjaro Linux
KDE Plasma Version: 5.26.5
KDE Frameworks Version: 5.103.0
Qt Version: 5.15.8
Kernel Version: 6.1.12-1-MANJARO (64-bit)
Graphics Platform: X11
Processors: 8 × Intel® Core™ i7-4750HQ CPU @ 2.00GHz
Memory: 15.5 GiB of RAM
Graphics Processor: Mesa Intel® Iris® Pro Graphics P5200
Comment 1 Sergio 2023-03-14 13:12:43 UTC
Created attachment 157270 [details]
Test document
Comment 2 Sergio 2023-03-14 13:14:03 UTC
Created attachment 157271 [details]
Screenshot of print preview

From the screenshot it is quite evident that the document displays correctly (bottom window), but is not previewed correctly (window on top, where the text is unreadable). Printing the document produces what is shown in the preview window.
Comment 3 Sergio 2023-03-14 13:18:42 UTC
Qpdfview can print the PDF document just fine.
Comment 4 Oliver Sander 2023-03-14 13:24:00 UTC
Printing in Okular is a bit special.

I conjecture that you can print your document if you select the 'force rasterization' option in the print dialog.
Comment 5 Sergio 2023-03-14 17:46:42 UTC
@Oliver Sander

> (In reply to Oliver Sander from comment #4)
> Printing in Okular is a bit special.
> 
> I conjecture that you can print your document if you select the 'force
> rasterization' option in the print dialog.

Your conjecture is correct.

Even if this workaround is effective, I think that this issue deserves attention:
- first, forcing rasterization does not fix the wrong preview;
- secondly, one typically works with force rasterization off (also because it can slow things down in some cases, and I am not sure whether it can affect the printout quality). It is not nice to discover, after printing 200 bad pages, that you should have forced rasterization;
- thirdly, on some occasions the issue is subtle. Here, I have managed to craft a PDF where all the characters get replaced by little squares when printing, which makes the bug easy to detect. However, in some cases, out of a multipage document you end up missing only a word on a single page or a few characters here and there. It is very easy to miss the issue and pass around an unprofessional-looking document, or even sign an incomplete one;
- finally, due to the complexity of the PDF standard, issues like this one tend to make one doubt the correctness of the original PDF. Is it OK, or slightly out of spec, so that some viewers can print it and others cannot? Unfortunately, this issue has already prompted the opening of inquiries on the LibreOffice tracker and the Source Sans 3 font tracker.

Out of curiosity, in what sense is printing in Okular a bit special? In cases where rasterization is not explicitly required, can the PDF not be passed to the printing subsystem more or less as is?
Comment 6 Oliver Sander 2023-03-15 09:34:45 UTC
I fully agree that `force rasterization` is only a workaround.

Okular currently converts pdf files to postscript and sends that to the printer (I forgot why exactly). Presumably it is the conversion step that goes wrong in your case.  If you want to have a look at the code: That's at `generator_pdf.cpp:1366`.
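
To check whether the conversion step is at fault independently of Okular, one could run the same kind of PDF-to-PostScript conversion by hand and print the result. The sketch below only builds and optionally runs the command lines. It assumes that poppler's `pdftops` utility and the CUPS `lp` command are installed, and that `pdftops` exercises the same poppler PostScript output code that Okular reaches (an assumption, not something confirmed in this report). The file names are hypothetical.

```python
import shutil
import subprocess


def conversion_commands(pdf_path, ps_path):
    """Return the two commands of a manual PDF -> PostScript -> printer test."""
    convert = ["pdftops", pdf_path, ps_path]  # poppler command-line converter
    send = ["lp", ps_path]                    # hand the PostScript file to CUPS
    return [convert, send]


def run_all(commands):
    """Run each command in order, stopping at the first failure."""
    for argv in commands:
        if shutil.which(argv[0]) is None:
            raise FileNotFoundError(f"{argv[0]} is not installed")
        subprocess.run(argv, check=True)


if __name__ == "__main__":
    # Hypothetical file names; replace with the attached test document.
    for argv in conversion_commands("test.pdf", "test.ps"):
        print(" ".join(argv))
```

If the intermediate PostScript file already shows broken glyphs in a viewer, that would point at the conversion rather than at the printer or CUPS.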

There used to be a patch that made Okular send the pdf file straight to the printer, but I can't seem to find it right now.

And then there's the official Qt way of printing: Render everything to a `QPrinter` object.  Code for that is at https://invent.kde.org/graphics/okular/-/merge_requests/411 , but that has its own set of problems.
Comment 7 Sergio 2023-03-15 10:54:40 UTC
(In reply to Oliver Sander from comment #6)
> I fully agree that `force rasterization` is only a workaround.
> 
> Okular currently converts pdf files to postscript and sends that to the
> printer (I forgot why exactly). Presumably it is the conversion step that
> goes wrong in your case.

Most likely the issue is indeed in this conversion.

I noticed that if you pre-process the PDF with a PDF-to-PDF conversion via Ghostscript (which most likely results in a simpler PDF), then the PDF prints fine. The conversion certainly changes the way in which the fonts are embedded, because the `pdffonts` utility returns different results.
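
A minimal sketch of that pre-processing and font comparison follows. The exact commands used for this report were not stated, so the Ghostscript invocation (`-sDEVICE=pdfwrite` with `-o`) and the helper and file names are assumptions chosen for illustration.

```python
import subprocess


def ghostscript_rewrite_command(src_pdf, dst_pdf):
    """Build a Ghostscript command that rewrites a PDF as a new PDF."""
    # -o sets the output file and also implies batch, no-pause operation.
    return ["gs", "-sDEVICE=pdfwrite", "-o", dst_pdf, src_pdf]


def pdffonts_command(pdf_path):
    """Build the poppler-utils command that lists the fonts used in a PDF."""
    return ["pdffonts", pdf_path]


def font_listing(pdf_path):
    """Run pdffonts and return its raw text output for manual comparison."""
    result = subprocess.run(pdffonts_command(pdf_path), check=True,
                            capture_output=True, text=True)
    return result.stdout


if __name__ == "__main__":
    # Hypothetical file names; requires gs and pdffonts to be installed.
    subprocess.run(ghostscript_rewrite_command("test.pdf", "test-gs.pdf"),
                   check=True)
    print(font_listing("test.pdf"))
    print(font_listing("test-gs.pdf"))
```

Comparing the two listings by eye (font type, embedded and subset flags) shows how the embedding changed. Object IDs are expected to differ in any case.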

I wonder if the conversion is related to the need to select page ranges or to do page scaling, but both things should be manageable at the PDF level. Or perhaps it is needed on non-CUPS platforms (Windows), where PDF might not be the "standard" way of describing the page for printing.
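
As an illustration of handling page ranges and scaling without an intermediate conversion, CUPS accepts both as job options when a PDF is submitted directly. The sketch below builds such an `lp` command line. `page-ranges` and `fit-to-page` are standard CUPS job options; the helper name and file name are hypothetical, and whether Okular could rely on this approach is not established here.

```python
def lp_command(pdf_path, page_ranges=None, fit_to_page=False):
    """Build an lp command that sends a PDF to CUPS with optional job options."""
    argv = ["lp"]
    if page_ranges:
        # e.g. "1-3,7" prints pages 1 to 3 and page 7
        argv += ["-o", f"page-ranges={page_ranges}"]
    if fit_to_page:
        # ask CUPS to scale each page to the selected media
        argv += ["-o", "fit-to-page"]
    argv.append(pdf_path)
    return argv


if __name__ == "__main__":
    # Hypothetical file name.
    print(" ".join(lp_command("test.pdf", page_ranges="1-3,7", fit_to_page=True)))
```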

> If you want to have a look at the code: That's at
> `generator_pdf.cpp:1366`.

> There used to be a patch that made Okular send the pdf file straight to the
> printer, but I can't seem to find it right now.

Resurrecting this patch might have good potential!

> And then there's the official Qt way of printing: Render everything to a
> `QPrinter` object.  Code for that is at
> https://invent.kde.org/graphics/okular/-/merge_requests/411 , but that has
> its own set of problems.

I had a look at it, but I am not sure I fully understand. If I get it correctly, the QPrinter object is something you paint onto (sort of like QPainter?). So if an application wants to draw something meant to be printed, it draws on the QPrinter object and in this way gets something printable. However, for a PDF document this seems a bit redundant and prone to be lossy: first you paint the PDF onto the QPrinter object, and then the QPrinter object converts its content back to PDF for printing (at least on CUPS platforms). I understand that the set of problems you mention is indeed a consequence of the first conversion. Maybe if Qtpdfium grows a PDF->QPainter rendering path, that might overcome some of the limitations of the poppler rendering path, but the QPrinter route still does not appear to me to be the most efficient one. For complex PDFs it could also slow things down a bit. Is my understanding correct?
Comment 8 Sergio 2023-03-15 11:09:06 UTC
A few more questions:

1) Is the conversion to PS currently done by poppler? I see a Poppler::PSConverter object being used. If so, regardless of whether the intermediate PostScript conversion is a good idea, is this ultimately a poppler bug? Should an issue be opened with them?

2) When you ask for rasterization, does it always happen at 300x300 dpi on Unix regardless of the printer resolution? I see a mention of a discussion at https://git.reviewboard.kde.org/r/130218/, but that site is dead.

Thanks again!
Comment 9 Sergio 2023-03-15 11:22:04 UTC
I have opened a request for feedback or for participation in the discussion on the poppler tracker: https://gitlab.freedesktop.org/poppler/poppler/-/issues/1377
Comment 10 Albert Astals Cid 2023-03-15 23:10:01 UTC
> Okular currently converts pdf files to postscript and sends that to the printer (I forgot why exactly).

Because thousands of years ago printing PDF files directly was not something that worked.
Comment 11 Sergio 2023-03-16 08:09:37 UTC
(In reply to Albert Astals Cid from comment #10)
> > Okular currently converts pdf files to postscript and sends that to the printer (I forgot why exactly).
> 
> Because thousands of years ago printing PDF files directly was not something
> that worked.

So, would resurrecting the patch mentioned by Oliver be the best option as of today?

As long as there is the intermediate PostScript conversion, do you think that the observed behavior is indeed a poppler bug (i.e., that https://gitlab.freedesktop.org/poppler/poppler/-/issues/1377 is a bug)?
Comment 12 Bug Janitor Service 2023-03-16 12:41:34 UTC
A possibly relevant merge request was started @ https://invent.kde.org/graphics/okular/-/merge_requests/704
Comment 13 Oliver Sander 2023-03-16 14:11:06 UTC
1) Yes, poppler is used

2) On Windows, rasterization resolution follows the printer.  Elsewhere, 300dpi is hardcoded.
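
To give a feel for what a fixed 300 dpi means, the following sketch computes the pixel dimensions and uncompressed size of one rasterized page. The A4 page size and the 4 bytes per pixel are assumptions for illustration, not values taken from Okular's code.

```python
MM_PER_INCH = 25.4


def raster_size(width_mm, height_mm, dpi):
    """Pixel dimensions of a page rasterized at the given resolution."""
    return (round(width_mm / MM_PER_INCH * dpi),
            round(height_mm / MM_PER_INCH * dpi))


def raster_bytes(width_mm, height_mm, dpi, bytes_per_pixel=4):
    """Uncompressed size of that raster, assuming 4 bytes per pixel."""
    w, h = raster_size(width_mm, height_mm, dpi)
    return w * h * bytes_per_pixel


if __name__ == "__main__":
    for dpi in (300, 600):
        w, h = raster_size(210, 297, dpi)  # A4 page, an assumed example size
        mib = raster_bytes(210, 297, dpi) / 2**20
        print(f"{dpi} dpi: {w} x {h} px, about {mib:.0f} MiB")
```

Under these assumptions an A4 page at 300 dpi is 2480 x 3508 pixels. Doubling the resolution quadruples the pixel count, which is one reason rasterized printing can be slower.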