425791 – Box characters in contents panel

Status:	RESOLVED UPSTREAM

Alias:	None

Product:	okular
Classification:	Applications
Component:	PDF backend
Version:	1.7.3
Platform:	Ubuntu Linux

Importance:	NOR normal
Target Milestone:	---
Assignee:	Okular developers

Reported:	2020-08-25 18:48 UTC by itisme1997
Modified:	2020-08-25 21:08 UTC
CC List:	2 users

Attachments
Rendering with Okular 1.11 (poppler 0.89) (227.05 KB, image/png) 2020-08-25 19:04 UTC, Yuri Chornoivan

Description itisme1997 2020-08-25 18:48:56 UTC

SUMMARY

Box characters appear in the contents panel of a rendered PDF. pdftk does not extract any characters at the positions where the boxes are displayed.

STEPS TO REPRODUCE
1. Open FoundationsOfMachineLearning_Mohri_Rostamizadeh_Talwalkar.pdf
2. Examine the contents panel.

OBSERVED RESULT
Some chapter headings with suffixed boxes.

EXPECTED RESULT
Chapter headings without suffixed boxes.

SOFTWARE/OS VERSIONS
Linux/KDE Plasma: Ubuntu 19.10 5.3.0-64-generic
KDE Frameworks Version: 5.62.0
Qt Version: 5.12.4

ADDITIONAL INFORMATION
`pdftk FoundationsOfMachineLearning_Mohri_Rostamizadeh_Talwalkar.pdf dump_data | grep BookmarkTitle | cat -vTE` does not show any characters after the section headers.
The file renders correctly in the Firefox PDF renderer and in Evince.

Comment 1 Yuri Chornoivan 2020-08-25 19:04:40 UTC

Created attachment 131179
Rendering with Okular 1.11 (poppler 0.89)

No problems here. Can you give us an address for your file?

Thanks in advance for your answer.

Comment 2 Yuri Chornoivan 2020-08-25 19:05:23 UTC

Change status.

Comment 3 itisme1997 2020-08-25 19:11:54 UTC

I can't upload the file, as it's too big. In attempting to create a smaller file with pdftk, I discovered that extracting and updating the info fixes the issue.

pdftk $DOCUMENT.pdf dump_data > in.info
pdftk $DOCUMENT.pdf update_info in.info output out.pdf

There is a diff between these documents, but I don't know why. The diff is also too large to upload.

Comment 4 itisme1997 2020-08-25 19:12:41 UTC

Here is a link to the pdf.

https://www.dropbox.com/s/7voitv0vt24c88s/10290.pdf?dl=1

Comment 5 Yuri Chornoivan 2020-08-25 19:22:40 UTC

The boxes are visible in Okular but not in Evince.

Comment 6 Albert Astals Cid 2020-08-25 21:08:02 UTC

I would argue the PDF is actually broken: it has NULL characters in the text. But since poppler was already clearing one NULL character from the end of a string if it was there, I've changed it to clear all trailing NULL characters from the end of strings.

https://gitlab.freedesktop.org/poppler/poppler/-/merge_requests/619
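
For readers who want to see the difference between the two behaviours, here is a minimal Python sketch. It is not poppler's code (poppler is written in C++, and the merge request above is the authoritative change). The function names and the sample title are hypothetical, and the sketch assumes the title has already been decoded to a Unicode string.

```python
def strip_one_trailing_nul(title: str) -> str:
    """Hypothetical model of the old behaviour: drop at most one trailing NUL."""
    return title[:-1] if title.endswith("\x00") else title


def strip_all_trailing_nuls(title: str) -> str:
    """Hypothetical model of the new behaviour: drop every trailing NUL."""
    return title.rstrip("\x00")


# Made-up bookmark title with two trailing NUL characters.
sample = "1 Introduction\x00\x00"

old_result = strip_one_trailing_nul(sample)
new_result = strip_all_trailing_nuls(sample)

# repr() makes any leftover NUL visible as \x00.
print(repr(old_result))
print(repr(new_result))
```

With only one NUL removed, the remaining one is still passed on to the viewer, which would explain the box after the heading. NUL characters in the middle of a string are left untouched in both variants.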