271755 – page labels are broken

Summary: page labels are broken

Status:	RESOLVED DUPLICATE of bug 187237

Alias:	None

Product:	okular
Classification:	Applications
Component:	general
Version:	0.12.1
Platform:	Fedora RPMs Linux

Importance:	NOR normal
Target Milestone:	---
Assignee:	Okular developers

Reported:	2011-04-26 11:12 UTC by Evert Mouw
Modified:	2011-05-09 07:42 UTC
CC List:	1 user

Attachments
example: pdf containing different page labels (dutch) (625.91 KB, application/x-pdf) 2011-04-27 20:48 UTC, Evert Mouw

Description Evert Mouw 2011-04-26 11:12:03 UTC

Version: 0.12.1 (using KDE 4.6.0)
OS: Linux

Full article with screenshots:
http://techmonks.net/pdf-support-for-multiple-page-numbering-styles

You want your book to use Roman numerals for the first few pages, and Indian (often called “Arabic”, although the Indians invented them) numerals for the rest of your book? PDF supports multiple page numbering styles, even in the same document. This has been supported since PDF version 1.3 and is described in the 1.7 specification under “12.4.2 Page Labels”.

Each page in a PDF document shall be identified by an integer page index that expresses the page’s relative position within the document. In addition, a document may optionally define page labels (PDF 1.3) to identify each page visually on the screen or in print. Page labels and page indices need not coincide: the indices shall be fixed, running consecutively through the document starting from 0 for the first page, but the labels may be specified in any way that is appropriate for the particular document.
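To make the index/label distinction concrete, here is a minimal Python sketch of how a viewer might derive a label from a page index. It is an illustration only, not Okular's or any other reader's actual code. The tuple layout `(start_index, style, first_number)` and the names `page_label`, `to_roman` and `RANGES` are assumptions made for this example; only the style codes `r` (lowercase Roman) and `D` (decimal) are taken from the PDF specification, and label prefixes and the other styles are left out.

```python
from bisect import bisect_right

# Value/symbol pairs for lowercase Roman numerals, largest first.
ROMAN = [(1000, "m"), (900, "cm"), (500, "d"), (400, "cd"),
         (100, "c"), (90, "xc"), (50, "l"), (40, "xl"),
         (10, "x"), (9, "ix"), (5, "v"), (4, "iv"), (1, "i")]


def to_roman(n):
    """Convert a positive integer to lowercase Roman numerals."""
    parts = []
    for value, symbol in ROMAN:
        count, n = divmod(n, value)
        parts.append(symbol * count)
    return "".join(parts)


def page_label(index, ranges):
    """Return the label for a zero-based page index.

    `ranges` is a hypothetical simplified model of the labelling ranges:
    a list of (start_index, style, first_number) tuples sorted by
    start_index, where the first range starts at index 0.
    """
    starts = [start for start, _, _ in ranges]
    pos = bisect_right(starts, index) - 1  # last range starting at or before index
    if pos < 0:
        raise ValueError("the first labelling range must start at index 0")
    start, style, first_number = ranges[pos]
    number = first_number + (index - start)
    return to_roman(number) if style == "r" else str(number)


# Example document: three front-matter pages, then the main text.
RANGES = [(0, "r", 1), (3, "D", 1)]
labels = [page_label(i, RANGES) for i in range(6)]
print(labels)
```

With these ranges, the page at index 3 (the fourth page) gets the label "1", which is what a label-aware viewer would show instead of "4".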

Unfortunately, not all PDF readers follow the specs. I have tried a few common readers under Windows, all of them recently updated. Acrobat Reader X and Nitro PDF Reader 1.4 (beta) do great. Sumatra 1.5 and PDF-XChange 2.5 fail miserably.

Linux applications show the same variety. I used ISO images from the latest Fedora 15 distribution. Evince 2.91, the default PDF viewer when using Gnome, did great. But for KDE, Okular 0.12.1 failed badly.

Reproducible: Always

Steps to Reproduce:
Open a PDF that contains different page numbering symbols (page label ranges).

Comment 1 Pino Toscano 2011-04-27 11:07:47 UTC

Part of this is already covered by bug #187237. What else is left out of it?

Would be nice if you could actually list the issues, not generic "page labels are broken" and "I tried okular and it failed".

Comment 2 Evert Mouw 2011-04-27 19:57:41 UTC

I mentioned the specific part of the PDF spec that Okular is violating. I guess that is issue enough. If it isn't, that's fine with me; I will just continue using a PDF viewer that follows the specs.

Comment 3 Albert Astals Cid 2011-04-27 20:13:55 UTC

Please attach a pdf file that shows the problem.

Comment 4 Evert Mouw 2011-04-27 20:48:41 UTC

Created attachment 59375
example: pdf containing different page labels (dutch)

A Dutch PDF document showing different page labels. Note that the first three pages use Roman numerals (i, ii, iii), followed by Indian (Arabic) numerals 1, 2, 3, ...

Comment 5 Davor Cubranic 2011-04-28 22:53:48 UTC

Evert, it would be helpful if in your bug summary and description you said what it is you think should be changed. It is not strictly true that Okular or other PDF readers mentioned in your blog post "don't follow the specs" -- they simply display the page *index* in the UI. If you propose that the UI used to navigate to pages instead use the page *label*, then please say so. (And if that's indeed the case, then see bug 231000, which has already asked the same thing.)

Comment 6 Evert Mouw 2011-04-30 02:13:20 UTC

@Davor: the page index starts with zero (0), by definition, so Okular and others do not display the page index (but might display the page index plus one). Either way, they do not follow the specs. Your interpretation to use page labels for page navigation would be identical to Adobe's implementation, and would IMHO be correct behaviour for any PDF viewer.

Comment 7 Davor Cubranic 2011-05-08 03:02:35 UTC

Thanks for the clarification, Evert. This can be marked as a duplicate of bug 187237.

Comment 8 Evert Mouw 2011-05-08 10:54:10 UTC

Agreed. At first I thought 187237 was about OCR (PageLabels were not mentioned), but it does indeed seem to be a duplicate. I will mention the discussion here for readers of bug 187237.

*** This bug has been marked as a duplicate of bug 187237 ***

Comment 9 Dotan Cohen 2011-05-08 11:33:34 UTC

Bug 187237 asks for user-selectable page numbering, not automatic as this bug requests. I argue that this bug is therefore not a dupe of bug 187237.

Comment 10 Dotan Cohen 2011-05-08 11:34:39 UTC

Additionally, Okular is not violating the PDF spec. Okular is not implementing the full spec. There is a difference between violating the spec and not implementing part of it.

Comment 11 Davor Cubranic 2011-05-09 07:42:18 UTC

Dotan, please read comment #6 about the adherence to the spec.

As for the duplication issue, it's not my call, but here is how I see it: displaying page labels and navigating ("go-to") by page label are not the same, but they are related issues. (And Evert brings both up in this bug.) One could track them separately, but until someone actually expresses an inclination to work on them, we may as well keep them together for simplicity's sake. But in that case the summary (at least) should reflect that there are multiple aspects of the bug.
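The two aspects can be told apart with a small Python sketch. It is purely illustrative and not taken from Okular's code: the function names `displayed_number` and `index_for_label` and the `labels` list are made up for the example, and the labels mirror the attached document's first pages (three Roman-numbered pages followed by decimal ones).

```python
def displayed_number(index):
    """What a viewer shows when it ignores labels: the zero-based index plus one."""
    return index + 1


def index_for_label(labels, wanted):
    """Go-to by label: return the index of the first page carrying `wanted`.

    Labels are not guaranteed to be unique, so this hypothetical helper
    returns the first match, or None if no page has that label.
    """
    for index, label in enumerate(labels):
        if label == wanted:
            return index
    return None


# Hypothetical per-page labels for a document with three front-matter pages.
labels = ["i", "ii", "iii", "1", "2", "3"]

target = index_for_label(labels, "2")
print(target)                    # index of the page labelled "2"
print(displayed_number(target))  # number an index-based viewer would show for it
```

Here a go-to for label "2" lands on index 4, which an index-based viewer would present as page 5. Displaying labels only needs the `labels` list; navigating by label also needs the reverse lookup.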