408222 – Okular is bad in displaying of scanned PDF ! Please fix this !

Summary: Okular is bad in displaying of scanned PDF ! Please fix this !

Status:	REPORTED

Alias:	None

Product:	okular
Classification:	Applications
Component:	PDF backend (other bugs)
Version First Reported In:	unspecified
Platform:	Other Linux

Importance:	NOR normal
Target Milestone:	---
Assignee:	Okular developers

URL:
Keywords:

Depends on:
Blocks:

Reported:	2019-06-02 20:07 UTC by yousifjkadom
Modified:	2020-07-30 11:12 UTC (History)
CC List:	2 users (show)

See Also:	424817
Latest Commit:
Version Fixed/Implemented In:
Sentry Crash Report:

Attachments
scanned-pdf (3.29 MB, application/pdf) 2019-06-02 20:08 UTC, yousifjkadom	Details
Page 188 in Okular (62.77 KB, image/png) 2019-06-02 20:37 UTC, Laura David Hurka	Details
test-1 (289.30 KB, application/pdf) 2019-06-03 12:37 UTC, yousifjkadom	Details
test-2 (38.54 KB, application/pdf) 2019-06-03 12:38 UTC, yousifjkadom	Details
side by side comparison of test-1 rendered with Evince (l) and Okular (r) (541.17 KB, image/png) 2019-06-03 14:40 UTC, Tobias Deiminger	Details
Pixels (10.44 KB, image/png) 2019-06-04 20:20 UTC, Laura David Hurka	Details
test-1a-okular (280.32 KB, image/png) 2019-06-04 21:00 UTC, yousifjkadom	Details
test-1a-xreader (251.83 KB, image/png) 2019-06-04 21:01 UTC, yousifjkadom	Details
test-1b-okular (285.41 KB, image/png) 2019-06-04 21:01 UTC, yousifjkadom	Details
test-1b-xreader (300.20 KB, image/png) 2019-06-04 21:02 UTC, yousifjkadom	Details
test-2-okular (276.25 KB, image/png) 2019-06-04 21:03 UTC, yousifjkadom	Details
test-2-xreader (337.22 KB, image/png) 2019-06-04 21:05 UTC, yousifjkadom	Details

Description yousifjkadom 2019-06-02 20:07:25 UTC

SUMMARY
Currently, Okular is bad at displaying scanned PDFs. Pages of a PDF appear as if they had a lower DPI than their real value!

STEPS TO REPRODUCE
1. Open any scanned PDF in Okular and look at a given page (or pages) within it.
2. Compare the same page(s) in another PDF viewer, for example xviewer or Evince.

OBSERVED RESULT
Scanned page(s) appear in Okular as if they had a lower DPI than they really do, while in Evince the scanned page(s) appear clearer and more enhanced!

EXPECTED RESULT
Okular should show the same detail as Evince and other PDF viewers.

SOFTWARE/OS VERSIONS
Fedora Linux 30, tested with both the .rpm and Flatpak packages of the latest version of Okular (1.6)

ADDITIONAL INFORMATION
I tried all available enhancement options in the PDF backend, but there was no improvement!

I attached a badly scanned PDF to make the issue clearer. Please open it in Okular and then in another viewer and compare.

Comment 1 yousifjkadom 2019-06-02 20:08:50 UTC

Created attachment 120506 [details]
scanned-pdf

Comment 2 yousifjkadom 2019-06-02 20:10:18 UTC

Please take care of this issue, because it is very important for users who live in third-world countries, where most PDF books are scanned rather than formatted ...

Comment 3 Laura David Hurka 2019-06-02 20:37:39 UTC

Created attachment 120507 [details]
Page 188 in Okular

A friend just told me that this is readable.

However, how does it look different in Evince? It uses the same rendering backend, so theoretically it should look the same.

Comment 4 Albert Astals Cid 2019-06-02 21:42:35 UTC

(In reply to David Hurka from comment #3)
> However, how does it look different in Evince? It uses the same rendering
> backend

It does not use the same rendering backend.

Comment 5 Albert Astals Cid 2019-06-02 22:29:40 UTC

I don't know how to read Arabic, so it's kind of hard for me to say, but looking at
https://i.imgur.com/QvLAH99.png
Evince and Okular look equally terrible to me.

Please attach a screenshot showing the problem.

Comment 6 yousifjkadom 2019-06-03 12:37:25 UTC

Hi. Sorry for the delay in responding; it was night in my country and I went to sleep just after posting this bug.

I attach 2 PDF files, each consisting of a single page. Each page (each PDF file) contains:

- pictures
- Arabic characters
- English characters

If you cannot evaluate the Arabic, then please look carefully at both the pictures and the English paragraphs. Please concentrate on the BLACK TONE.

If you concentrate well, you will notice that the black tone (DENSITY OF BLACK) is lower in Okular than in Evince. So the picture in Okular is less sharp than in Evince.

Comment 7 yousifjkadom 2019-06-03 12:37:59 UTC

Created attachment 120518 [details]
test-1

Comment 8 yousifjkadom 2019-06-03 12:38:23 UTC

Created attachment 120519 [details]
test-2

Comment 9 Tobias Deiminger 2019-06-03 14:40:54 UTC

Created attachment 120522 [details]
side by side comparison of test-1 rendered with Evince (l) and Okular (r)

(In reply to yousifjkadom from comment #6)
> If you concentrate well, you will notice that the black tone (DENSITY OF BLACK)
> is lower in Okular than in Evince. So the picture in Okular is less sharp than
> in Evince.

I'm with Albert, the scans look equally bad in both Evince and Okular (see attachment). Regarding black tone, I measured RGB values with KColorChooser and they are just the same.

Can you show how it looks different at your side? I mean don't just attach the PDF, but a screenshot showing it in Evince next to Okular.

Comment 10 Laura David Hurka 2019-06-04 20:20:31 UTC

Created attachment 120580 [details]
Pixels

Using Albert’s screenshot, one could say that Evince and Okular draw slightly differently. Indeed, where the arrow points, Evince is sharper than Okular, resulting in a higher contrast and a differently perceived black tone.

This could have these reasons:
a) Both have a slightly different alignment of the scanned image, resulting in different sampling of rendered pixels which are not exactly on image pixels.
b) Both have the same alignment, but use different filtering. (Bilinear, Bicubic, Nearest Neighbour, ... In fact, Okular does not use nearest neighbour, causing smeared edges on scans. Maybe Evince does?)
c) This happened just by chance at exactly these zoom factors.
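
Reason (b) can be illustrated with a minimal sketch. It downscales one row of a synthetic black-on-white scan by a factor of 3, once by nearest-neighbour sampling and once by box averaging. This is only an assumed stand-in for the filters under discussion, not the code either viewer runs; the function names and the sample row are made up for illustration.

```python
# Hypothetical illustration: one scan line, 0 = black ink, 255 = white paper.
# A 3-pixel-wide black stroke sits at source pixels 4..6.
ROW = [255] * 4 + [0] * 3 + [255] * 5   # 12 source pixels


def downscale_nearest(row, factor):
    """Keep the centre source pixel of each block (nearest neighbour)."""
    return [row[i * factor + factor // 2] for i in range(len(row) // factor)]


def downscale_box(row, factor):
    """Average each block of `factor` source pixels (box filter)."""
    return [sum(row[i * factor:(i + 1) * factor]) // factor
            for i in range(len(row) // factor)]


nearest = downscale_nearest(ROW, 3)
box = downscale_box(ROW, 3)
print("nearest:", nearest)
print("box:    ", box)
```

In this toy case nearest neighbour keeps one fully black pixel, while averaging spreads the stroke over two grey pixels. Shifting the stroke by a pixel changes both results, which is the alignment effect from reason (a).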

Comment 11 yousifjkadom 2019-06-04 20:59:48 UTC

I took the screenshots that you asked me for.

I have to add that NOT ONLY SCANNED PDFs are affected by this! Even formatted (text-based) PDFs are affected: the black tone is lower in Okular and the characters are less sharp ...

Comment 12 yousifjkadom 2019-06-04 21:00:48 UTC

Created attachment 120581 [details]
test-1a-okular

Comment 13 yousifjkadom 2019-06-04 21:01:21 UTC

Created attachment 120582 [details]
test-1a-xreader

Comment 14 yousifjkadom 2019-06-04 21:01:57 UTC

Created attachment 120583 [details]
test-1b-okular

Comment 15 yousifjkadom 2019-06-04 21:02:50 UTC

Created attachment 120584 [details]
test-1b-xreader

Comment 16 yousifjkadom 2019-06-04 21:03:49 UTC

Created attachment 120585 [details]
test-2-okular

Comment 17 yousifjkadom 2019-06-04 21:05:32 UTC

Created attachment 120586 [details]
test-2-xreader

Comment 18 Albert Astals Cid 2019-06-08 09:43:14 UTC

user attached some screenshots

Comment 19 Albert Astals Cid 2019-06-08 09:44:04 UTC

By the way, I guess this is due to cultural barriers/translation issues, but your tone is way out of line.

Comment 20 Tobias Deiminger 2019-06-08 11:07:36 UTC

(In reply to yousifjkadom from comment #14)
> Created attachment 120583 [details]
> test-1b-okular

Can you tell us the resolution of your display, and the Okular zoom level at which you observe bad behavior? From screenshots I'd say it's 1366x768 display resolution, correct? But can't tell zoom level because of "fit width" setting.

I'm asking because poppler switches between two image scaling algorithms, depending on whether upscaling or downscaling is needed. The embedded image in test-1.pdf has 300dpi. If you view it on a 100dpi display at 100% zoom, it's actually scaled down (with Bresenham in Okular/Splash, nearest neighbor in Evince/Cairo). If you do the same on a HiDPI display, the image is probably scaled up (bilinear interpolation in both Okular and Evince).
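
The arithmetic behind this can be sketched as follows. The function names and the 400% zoom example are assumptions for illustration; nothing here is read from poppler itself.

```python
def image_scale(image_dpi, display_dpi, zoom=1.0):
    """Return display pixels per embedded image pixel."""
    return display_dpi * zoom / image_dpi


def scaling_direction(image_dpi, display_dpi, zoom=1.0):
    """Classify the resampling direction for an embedded image."""
    scale = image_scale(image_dpi, display_dpi, zoom)
    if scale < 1.0:
        return "downscale"
    if scale > 1.0:
        return "upscale"
    return "none"


# 300 dpi scan on a 100 dpi display at 100% zoom: scaled down to one third.
print(image_scale(300, 100), scaling_direction(300, 100))
# Same scan and display at an assumed 400% zoom: scaled up.
print(scaling_direction(300, 100, zoom=4.0))
```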

Scaling algorithm is the most obvious difference between Evince/xreader and Okular. But you say you see same bad rendering with text documents. That puzzles me, glyph rendering takes quite different paths.

I'd agree that screenshot test-1b-xreader.png looks slightly better than test-1b-okular.png. But that impression is subjective, can we try to quantify a bit what "better" means? E.g., we could measure "density of black" in a histogram, right? Any other hints what "better" could mean here?
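
One possible way to put a number on "density of black" is sketched below. It assumes the screenshots have already been converted to lists of 8-bit grayscale values; the threshold of 64 and the sample values are arbitrary assumptions, not measurements from the attachments.

```python
from collections import Counter


def gray_histogram(pixels):
    """Count how often each 8-bit gray level (0..255) occurs."""
    counts = Counter(pixels)
    return [counts.get(level, 0) for level in range(256)]


def black_density(pixels, threshold=64):
    """Fraction of pixels darker than `threshold` (assumed cut-off)."""
    hist = gray_histogram(pixels)
    return sum(hist[:threshold]) / len(pixels)


# Made-up samples: the same stroke rendered crisply and smeared.
crisp = [255, 255, 0, 0, 255, 255]
smeared = [255, 170, 85, 85, 170, 255]
print(black_density(crisp), black_density(smeared))
```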

Comment 21 Laura David Hurka 2019-06-08 22:46:45 UTC

(In reply to Tobias Deiminger from comment #20)
> But that impression is subjective, can we try to
> quantify a bit what "better" means? E.g., we could measure "density of
> black" in a histogram, right? Any other hints what "better" could mean here?

My first idea is to agree on using a histogram, and to make an artificial reference “scan” containing various edges at various sizes, angles, sharpnesses, resolutions, ..., and then make a histogram to count how many pixels in the rendering output are used for edges.
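
A minimal sketch of that idea follows, assuming a one-dimensional synthetic "scan" with a single hard edge. Pixels that are neither paper white nor ink black are counted as edge pixels. The tolerance and both sample renderings are invented for illustration.

```python
def edge_pixel_count(pixels, black=0, white=255, tolerance=16):
    """Count pixels that are neither near-black nor near-white.

    In a rendering of a hard black/white edge, such in-between pixels
    are the ones spent on the transition, so fewer means a sharper edge.
    """
    return sum(1 for p in pixels
               if black + tolerance < p < white - tolerance)


def make_reference_edge(width, edge_at):
    """Synthetic reference: white up to `edge_at`, black afterwards."""
    return [255 if x < edge_at else 0 for x in range(width)]


reference = make_reference_edge(8, 4)
# Hypothetical renderer outputs for the same edge.
sharp_render = [255, 255, 255, 255, 0, 0, 0, 0]
soft_render = [255, 255, 230, 160, 90, 20, 0, 0]
for name, img in [("reference", reference),
                  ("sharp", sharp_render),
                  ("soft", soft_render)]:
    print(name, edge_pixel_count(img))
```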

Comment 22 Tobias Deiminger 2019-06-14 07:18:39 UTC

Found a quite on-topic paper [0], which suggests computing local indexes for luminance comparison, contrast comparison and structural comparison to assess the quality of downsampling algorithms.

If I understand correctly, structural comparison refers to SSIM [1], which can be calculated using the latest ImageMagick 7. For example:

$ compare -metric SSIM test-1-original.jpg test-1-cairo-200dpi.png null: 2>&1
0.826331

$ compare -metric SSIM test-1-original.jpg test-1-splash-200dpi.png null: 2>&1
0.824194

Higher is better, so cairo slightly wins in that structural comparison example.
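
For readers unfamiliar with the metric, here is a simplified sketch of the SSIM formula from [1], computed over the whole image as a single window. SSIM is normally evaluated over local windows and averaged, so this sketch is not expected to reproduce the numbers above; the constants k1 = 0.01 and k2 = 0.03 are the commonly used defaults, and the sample images are made up.

```python
def ssim_global(x, y, dynamic_range=255, k1=0.01, k2=0.03):
    """Single-window SSIM of two equally sized grayscale pixel lists."""
    if len(x) != len(y):
        raise ValueError("images must have the same number of pixels")
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    var_x = sum((a - mean_x) ** 2 for a in x) / n
    var_y = sum((b - mean_y) ** 2 for b in y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / n
    c1 = (k1 * dynamic_range) ** 2   # stabilises the luminance term
    c2 = (k2 * dynamic_range) ** 2   # stabilises the contrast term
    return ((2 * mean_x * mean_y + c1) * (2 * cov + c2)) / (
        (mean_x ** 2 + mean_y ** 2 + c1) * (var_x + var_y + c2))


original = [255, 255, 0, 0, 255, 255]
smeared = [255, 170, 85, 85, 170, 255]
print(ssim_global(original, original))  # identical images
print(ssim_global(original, smeared))   # same mean, reduced contrast
```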

For the test, test-1-cairo-200dpi.png and test-1-splash-200dpi.png have been generated locally from test-1.pdf using poppler_page_render and renderToImage respectively in a standalone application.

If no one disagrees (in the sense that observations wouldn't be meaningful enough or lack too much context), I'm going to follow this track, do broader tests with more metrics and try to gain a better understanding of the results.

[0] Korneta, 2017
[1] https://en.wikipedia.org/wiki/Structural_similarity

Comment 23 Tobias Deiminger 2019-06-14 19:20:13 UTC

(In reply to Tobias Deiminger from comment #22)
> $ compare -metric SSIM test-1-original.jpg test-1-cairo-200dpi.png null: 2>&1
> 0.826331
There was a mistake in the example. The compare tool needs two identically sized images (it works otherwise, but then does a completely different thing). So now I scale the poppler output back up to the original size with a high-quality filter before comparing images.

Here are some first test results:

|       Test conditions    | avg. SSIM (higher is better) |
orig_size  new_size   scale  Splash    Cairo     Lanczos(*)
1894x2798  1799x2658  0.95   0.894696  0.97901   0.986598
1894x2798  1704x2518  0.90   0.940628  0.975257  0.961978
1894x2798  1609x2378  0.85   0.939805  0.972347  0.961621
1894x2798  1515x2238  0.80   0.896754  0.965689  0.972571
1894x2798  1420x2098  0.75   0.944005  0.955174  0.969092
1894x2798  1325x1958  0.70   0.944173  0.954734  0.966077
1894x2798  1231x1818  0.65   0.913747  0.959678  0.963164
1894x2798  1136x1678  0.60   0.932564  0.939862  0.959964
1894x2798  1041x1538  0.55   0.945498  0.926968  0.948855
1894x2798   947x1399  0.50   0.888512  0.951904  0.952485
1894x2798   852x1259  0.45   0.893747  0.944982  0.942065
1894x2798   757x1119  0.40   0.926194  0.938078  0.936967
1894x2798   662x 979  0.35   0.908621  0.928839  0.929847
1894x2798   568x 839  0.30   0.871821  0.9177    0.9235
1894x2798   473x 699  0.25   0.909859  0.903451  0.909088
1894x2798   378x 559  0.20   0.894681  0.88581   0.894793
1894x2798   284x 419  0.15   0.869878  0.879541  0.876719
1894x2798   189x 279  0.10   0.861054  0.858674  0.863767

If there were no further mistakes, that is quite a significant win for Cairo over Splash. I'm planning to publish my test methods and open an issue at poppler. There we can discuss whether to take action.

(*) Lanczos downsampling (with IM7) is included for reference and plausibility.
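
As a quick cross-check of the table above, the following sketch tallies in how many rows each backend scores higher and the mean SSIM difference. The values are copied from the Splash and Cairo columns; the variable names are only illustrative, and a plain win count is an assumed, very coarse summary.

```python
# (scale, Splash SSIM, Cairo SSIM) copied from the table above.
ROWS = [
    (0.95, 0.894696, 0.97901),
    (0.90, 0.940628, 0.975257),
    (0.85, 0.939805, 0.972347),
    (0.80, 0.896754, 0.965689),
    (0.75, 0.944005, 0.955174),
    (0.70, 0.944173, 0.954734),
    (0.65, 0.913747, 0.959678),
    (0.60, 0.932564, 0.939862),
    (0.55, 0.945498, 0.926968),
    (0.50, 0.888512, 0.951904),
    (0.45, 0.893747, 0.944982),
    (0.40, 0.926194, 0.938078),
    (0.35, 0.908621, 0.928839),
    (0.30, 0.871821, 0.9177),
    (0.25, 0.909859, 0.903451),
    (0.20, 0.894681, 0.88581),
    (0.15, 0.869878, 0.879541),
    (0.10, 0.861054, 0.858674),
]

cairo_wins = sum(1 for _, splash, cairo in ROWS if cairo > splash)
splash_wins = sum(1 for _, splash, cairo in ROWS if splash > cairo)
mean_gain = sum(cairo - splash for _, splash, cairo in ROWS) / len(ROWS)

print(f"Cairo higher in {cairo_wins} of {len(ROWS)} rows, "
      f"Splash higher in {splash_wins}")
print(f"mean SSIM difference (Cairo - Splash): {mean_gain:.4f}")
```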