Created attachment 141453 [details]
userunit_10.pdf

SUMMARY

[PDF background]
In the PDF format, coordinates are given in PDF points, where by default 1 point equals 1/72 of an inch (1in = 2.54cm). However, a PDF can define a custom unit on a per-page basis using the /UserUnit key. /UserUnit is a number that scales the default unit of 1/72in, so with a /UserUnit of 10, 1pt means 10/72in.

[What Okular does]
It seems that Okular (like much other open-source PDF software) does not take /UserUnit into account for the displayed page size.

The attached test document `userunit_10.pdf` defines a /UserUnit of 10. The document's /MediaBox looks like this:
```python3
[ Decimal('0.0'), Decimal('0.0'), Decimal('1785.6'), Decimal('1785.6') ]
```
The default conversion with 1pt -> 1/72in gives about 630x630mm, which is what Okular displays. However, this is incorrect. In reality, the page is about 6300x6300mm, 10 times larger in each dimension!

(In particular, /UserUnit is used by Adobe Illustrator and possibly other PDF software to circumvent the maximum page dimension of 14400pt imposed by Adobe Reader and some other PDF renderers.)

STEPS TO REPRODUCE
1. Open the attached file in Okular
2. Go to File -> Properties
3. See the displayed page size
4. Inspect the document with the pikepdf Python library, or any other PDF library of your choice
5. Print the /MediaBox and /UserUnit of page 0

OBSERVED RESULT
Displayed page size is too small by a factor of 10.

EXPECTED RESULT
Displayed page size should always reflect the real page size and take /UserUnit into account.

Operating System: KDE neon 5.22
KDE Plasma Version: 5.22.5
KDE Frameworks Version: 5.85.0
Qt Version: 5.15.3
Kernel Version: 5.11.0-34-generic (64-bit)
Graphics Platform: Wayland
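For reference, the conversion described above can be sketched in a few lines of Python. This is only an illustration: `page_size_mm` is a hypothetical helper name, not part of Okular, Poppler, or pikepdf, and it assumes the box is given as `[llx, lly, urx, ury]`.

```python
from decimal import Decimal

POINTS_PER_INCH = 72
MM_PER_INCH = Decimal("25.4")


def page_size_mm(box, user_unit=1):
    """Return (width, height) in millimetres for a [llx, lly, urx, ury] box.

    Hypothetical helper: user_unit is the page's /UserUnit (1 if absent).
    """
    llx, lly, urx, ury = (Decimal(str(v)) for v in box)
    unit = Decimal(str(user_unit))
    # Multiply first and divide last so that exact inputs stay exact.
    width = abs(urx - llx) * unit * MM_PER_INCH / POINTS_PER_INCH
    height = abs(ury - lly) * unit * MM_PER_INCH / POINTS_PER_INCH
    return width, height


media_box = [0, 0, "1785.6", "1785.6"]
print(page_size_mm(media_box))      # /UserUnit ignored: about 630mm per side
print(page_size_mm(media_box, 10))  # /UserUnit applied: about 6300mm per side
```

Decimal arithmetic is used here only because pikepdf reports the box values as `Decimal`; plain floats would work equally well for display purposes.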
Python shell code to reproduce (replace TestFiles.userunit_10 with the path string where you saved the file, and skip the first import, which depends on custom test infrastructure of the lib I am developing):
```python3
Python 3.8.10 (default, Jun 2 2021, 10:49:15)
[GCC 9.4.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> from tests_pdfnodegraph.testfiles import TestFiles
>>> import pikepdf
>>> pdf = pikepdf.Pdf.open(TestFiles.userunit_10)
>>> page = pdf.pages[0]
>>> page.MediaBox
pikepdf.Array([ Decimal('0.0'), Decimal('0.0'), Decimal('1785.6'), Decimal('1785.6') ])
>>> page.UserUnit
Decimal('10.0')
>>> 1785.6 * 1/72 * 25.4
629.9199999999998
>>> 1785.6*10 * 1/72 * 25.4
6299.2
```
Created attachment 141454 [details] userunit_screenshot
To clarify, I think it is not only the displayed size number that is incorrect, but also the space reserved for rendering the actual page. The screenshot I just added illustrates this better: the first page is from the userunit_10 file, and the other 2 pages are ANSI A and A4 size, which are very roughly 200mm wide. Put one of the smaller pages three times next to each other, and it approximately matches the displayed width of the larger page, although in fact the larger page should be a lot wider: roughly thirty times the width of the smaller page!
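The "three times" versus "thirty times" figures can be checked with a little arithmetic. The sketch below uses the exact A4 width of 210mm instead of the rough 200mm; `width_mm` is a hypothetical helper name used only for this illustration.

```python
MM_PER_POINT = 25.4 / 72  # default PDF unit: 1pt = 1/72in
A4_WIDTH_MM = 210.0


def width_mm(width_pt, user_unit=1.0):
    """Convert a page width in PDF units to millimetres (hypothetical helper)."""
    return width_pt * user_unit * MM_PER_POINT


# Ratio as currently displayed (UserUnit ignored) vs. the real ratio.
displayed_ratio = width_mm(1785.6) / A4_WIDTH_MM
real_ratio = width_mm(1785.6, user_unit=10) / A4_WIDTH_MM
print(round(displayed_ratio), round(real_ratio))
```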
> the space reserved for rendering the actual page

or better formulated: the proportions of different pages to each other
Created attachment 141455 [details]
Proportions pdf

So that you can confirm that /UserUnit is set on the first page of the document in the screenshot, but not on the other pages:
```python3
>>> from tests_pdfnodegraph.pathtools import TestOutput
>>> pdf = pikepdf.open(join(TestOutput,'out_14.pdf'))
>>> page = pdf.pages[0]
>>> page.UserUnit
Decimal('10.0')
>>> page_2 = pdf.pages[1]
>>> page_2.UserUnit
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/manuel/.local/lib/python3.8/site-packages/pikepdf/_methods.py", line 1143, in __getattr__
    return getattr(self.obj, name)
AttributeError: /UserUnit
>>>
```
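Since /UserUnit is optional and the PDF specification gives it a default of 1.0, code reading it should fall back to 1 instead of failing as in the traceback above. A minimal sketch follows, with plain dicts standing in for page dictionaries. `effective_user_unit` is a hypothetical name; pikepdf objects are believed to offer a comparable dictionary-style `get`, but that is an assumption not verified here.

```python
from decimal import Decimal


def effective_user_unit(page_dict):
    """Return the page's /UserUnit, or 1 when the key is absent.

    page_dict is any mapping with a dict-like .get(); plain dicts are
    used below as stand-ins for real page dictionaries.
    """
    return Decimal(str(page_dict.get("/UserUnit", 1)))


first_page = {"/UserUnit": Decimal("10.0")}  # like page 0 of the test file
second_page = {}                             # no /UserUnit key at all

print(effective_user_unit(first_page))
print(effective_user_unit(second_page))
```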
List of other affected PDF software:
* Chromium integrated PDF viewer (uses PDFium)
* Firefox integrated PDF viewer (uses pdf.js)
* Inkscape PDF importer (uses Poppler)
* Scribus PDF importer
* PDFStitcher (uses pikepdf)
* PDF Arranger (uses pikepdf)
* even the proprietary Master PDF Editor 4 and 5

Probably more ...
Created attachment 141456 [details] adobe_reader ... only Adobe Reader gets the proportions right
Do *not* add me to bugs. I don't understand what makes you think that is normal behaviour, but it's not, you're only making me ignore you.
Sorry. I just thought you'd be the maintainer of Okular, and wondered why you are not in the CC list, but apparently this has its reason. Sorry, really.
Can you reproduce the problem using one of the poppler command line tools like pdfinfo or pdftocairo? It may be a poppler bug.
Pdfinfo from poppler-utils does not show regular units like centimetres or inches, but it keeps the PDF points. Pdfinfo is a low-level tool that does not perform unit conversion on its own. However, it does not display the UserUnit value, so you could say it's somewhat wrong in the sense that it withholds information. So to judge who is at fault, it would be relevant to know how Okular obtains the displayed page size. Does it inspect CropBox/MediaBox and convert to units itself, or does it retrieve finished unit values from Poppler? In the first case, the source of the bug would be in Okular, in the second case it would be in Poppler.
I've searched a bit in the code, and at least the rendering proportions issue is Okular's fault I think: https://github.com/KDE/okular/blob/3a513f34b8bbba87bd96718dc96089e079578d55/generators/poppler/generator_pdf.cpp#L721
Another possibly relevant code passage: https://github.com/KDE/okular/blob/3a513f34b8bbba87bd96718dc96089e079578d55/generators/poppler/generator_pdf.cpp#L1303
(In reply to Oliver Sander from comment #10)
> Can you reproduce the problem using one of the poppler command line tools
> like pdfinfo or pdftocairo? It may be a poppler bug.

`pdfinfo userunit_10.pdf` reports `Page size: 1785.6 x 1785.6 pts`

(In reply to Manuel Geißer from comment #6)
> List of other affected PDF software:
> * Chromium integrated PDF viewer (uses PDFium)
> * Firefox integrated PDF viewer (uses pdf.js)
> * Inkscape PDF importer (uses Poppler)
> * Scribus PDF importer
> * PDFStitcher (uses pikepdf)
> * PDF Arranger (uses pikepdf)
> * even the proprietary Master PDF Editor 4 and 5
>
> Probably more ...

I think you should report at PDFium, pdf.js, Poppler, and pikepdf.

Poppler is here: https://gitlab.freedesktop.org/poppler/poppler/issues
It is the library used by Okular.

There is also a muPDF backend for Okular. Did you try that?
`mutool info userunit_10.pdf` reports `[ 0 0 17856 17856 ]`.
From the referenced code we can see that Okular uses the Poppler::Page::pageSizeF() function to obtain the page size:
https://poppler.freedesktop.org/api/qt5/classPoppler_1_1Page.html#a598c287971839a113552176fc387ab30
This function is based on CropBox and returns points.

What about the following solution:
- The pageSize() and pageSizeF() functions should be changed to take /UserUnit into account, as the docs suggest the returned value is always given in 1/72in units.
- Additionally, there should be some way to obtain the /UserUnit value with Poppler. I couldn't find any such option in the documentation, though I only skimmed it.
> I think you should report at PDFium, pdf.js, Poppler, and pikepdf.

Be careful: there are considerable differences between these libraries. I don't really know about pdf.js and PDFium, but pikepdf is quite low-level and does not provide a function to obtain the page size on its own. This needs to be done downstream using CropBox/MediaBox, UserUnit, and Rotate.

> There is also a muPDF backend for Okular. Did you try that?
> `mutool info userunit_10.pdf` reports `[ 0 0 17856 17856 ]`.

Yes, I am aware that MuPDF directly takes /UserUnit into account. I noticed this during the tests for my lib (which also has a (Py)MuPDF rendering backend).
How do I obtain the MuPDF backend for Okular, though? Is it possible that KDE Neon does not provide it? (I already have the okular-extra-backends package installed...)
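To illustrate what "done downstream" could look like, here is a rough sketch that combines CropBox/MediaBox, UserUnit, and Rotate. Plain dicts stand in for page dictionaries, and `displayed_size_pt` is a hypothetical name. The sketch deliberately ignores attributes inherited from parent page tree nodes and does not clip the CropBox to the MediaBox, so it is a simplification and not a complete implementation.

```python
from decimal import Decimal


def displayed_size_pt(page_dict):
    """Return (width, height) in 1/72in units, as a viewer should show the page.

    Simplified sketch: no inherited attributes, no CropBox/MediaBox clipping.
    """
    # /CropBox defaults to /MediaBox when it is absent.
    box = page_dict.get("/CropBox") or page_dict["/MediaBox"]
    llx, lly, urx, ury = (Decimal(str(v)) for v in box)

    # /UserUnit defaults to 1 and scales both dimensions.
    unit = Decimal(str(page_dict.get("/UserUnit", 1)))
    width, height = abs(urx - llx) * unit, abs(ury - lly) * unit

    # /Rotate is a multiple of 90; a quarter turn swaps width and height.
    rotate = int(page_dict.get("/Rotate", 0)) % 360
    if rotate in (90, 270):
        width, height = height, width
    return width, height


page = {"/MediaBox": [0, 0, "1785.6", "1000"], "/UserUnit": 10, "/Rotate": 90}
print(displayed_size_pt(page))
```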
> I think you should report at PDFium, pdf.js, Poppler, and pikepdf.

I think it might be better if the Okular developers would report to Poppler, since I never used the Poppler library interface myself and thus don't have the required background.
> I think it might be better if the Okular developers would report to Poppler,
> since I never used the Poppler library interface myself and thus don't have the
> required background.

I have now filed an issue at Poppler nevertheless, as nobody else seems to have felt any responsibility to do so. The report essentially just references this thread, as it should contain all relevant information.
https://gitlab.freedesktop.org/poppler/poppler/-/issues/1139

@OkularDevelopers: Please verify/confirm whether changing pageSize() and pageSizeF() would really be sufficient to fix the UserUnit issue.
> There is also a muPDF backend for Okular. Did you try that?

The Ubuntu Focal mupdf package currently fails to open the file (https://bugs.launchpad.net/ubuntu/+source/mupdf/+bug/1943366). This is likely fixed in newer versions of mupdf or affects the MuPDF GUI only, though.
> There is also a muPDF backend for Okular. Did you try that?

Is this still current at all? I checked out okular from https://invent.kde.org/graphics/okular.git and built it with CMake, but couldn't find any hints of a MuPDF backend.
`ls generators/` only shows
```
chm CMakeLists.txt comicbook djvu dvi epub fax fictionbook kimgio markdown mobipocket plucker poppler spectre tiff txt xps
```
`grep -r mupdf` on the Okular source tree doesn't find anything, either
No, the muPDF backend is not part of Okular; it is an independent project. Just search the internet for okular-backend-mupdf or okular-mupdf-backend.
I guess you are referring to https://invent.kde.org/sandsmark/okular-mupdf-backend ? The thing is, there are multiple unofficial Okular MuPDF generators around... Moreover, why is this not officially part of Okular and not packaged in Debian, Ubuntu, and KDE Neon?
So I installed the dependencies and tried to build okular-mupdf-backend (from git master), but it fails with some "Variable not declared in this scope" error. Also, there have been no commits to the repository for a year. Is this backend still functional?
Created attachment 142205 [details] (unrelated) okular-mupdf-backend build error
Can you guys please move the mupdf discussion elsewhere? While it is certainly interesting, it is only tangentially related to this bug.
Sure.