Bug 359059 - FictionBook: Incorrect whitespace treatment
Alias: None
Product: okular
Classification: Unclassified
Component: fictionbook backend
Version: 21.12.0
Platform: Ubuntu Packages Linux
Importance: NOR normal with 10 votes
Target Milestone: ---
Assignee: Okular developers
URL: http://pok.heliohost.org/unix/t.fb2
Depends on:
Reported: 2016-02-06 10:41 UTC by Sergio
Modified: 2022-01-04 23:10 UTC
CC List: 4 users

See Also:
Latest Commit:
Version Fixed In: 22.04

test file t.fb2 (1.19 KB, application/x-fictionbook+xml)
2016-02-06 10:44 UTC, Sergio
A correct rendition by CoolReader (169.70 KB, image/png)
2016-02-06 10:53 UTC, Sergio
Screenshot (38.76 KB, image/png)
2020-11-08 10:55 UTC, Justin Zobel

Description Sergio 2016-02-06 10:41:01 UTC
1. I could not find "FictionBook Backend" in the menu, so I am filing this under "general".
2. My Synaptic does not show any update for my "Version 0.19.3; Using KDE Development Platform 4.13.3" or for "Fiction Book Backend Version 0.1.5"; these versions are also not available in the bug tracker menu.

3. THE PROBLEM. When rendering a FictionBook file, Okular does not collapse extra spaces between <p> and </p>, and it does not treat line breaks in the source text as spaces. As a result, a structured (indented) source text is rendered with unwanted spaces and line breaks. This deviates from the XML spec (Section 2.10: White Space Handling).

4. Another problem is that the <text-author> element is silently ignored.
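
To make item 3 concrete, here is a minimal Python sketch of the whitespace handling being requested. It is not Okular's code; the function name collapse_whitespace is hypothetical, and it assumes that every run of spaces, tabs, and line breaks inside a paragraph should render as a single space, with leading and trailing whitespace dropped.

```python
import re

# The four XML 1.0 white space characters: space, tab, carriage return, line feed.
# Non-breaking spaces (U+00A0) are deliberately left alone.
_XML_WS_RUN = re.compile(r"[ \t\r\n]+")


def collapse_whitespace(text):
    """Collapse each run of XML white space to one space and trim both ends.

    Hypothetical helper for illustration only.
    """
    return _XML_WS_RUN.sub(" ", text).strip(" ")
```

Applied to the text of a <p> element whose source is wrapped and indented, this would yield a single line with single spaces between words.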

Reproducible: Always

Steps to Reproduce:
1. Download the test book (1.2 KB): http://pok.heliohost.org/unix/t.fb2
2. Open t.fb2 in Okular.

Actual Results:  
What I see with Okular is shown at http://pok.heliohost.org/unix/okular.png
Please note the unwanted line break after "One" and the unwanted spaces before "two" and "six".
Please also note that the attribution of the quotation is missing.

Expected Results:  
The expected (correct) rendition (by CoolReader) is shown in attachment 97043 (see comment 2).

Basically, something like this:

| One two three.
| Four five six.
| Nine.
| Ten.
| ,----
| | In editing XML documents, it is often convenient to use “white
| | space” (spaces, tabs, and blank lines) to set apart the markup for
| | greater readability. Such white space is typically not intended
| | for inclusion in the delivered version of the document.
| `---- XML 1.0 Spec, 2.10 (White Space Handling)
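
As an illustration of the expected rendition, the following Python sketch parses a small FictionBook fragment and emits one line per <p> and <text-author> element. The SAMPLE document is a hypothetical stand-in that reproduces the symptoms described; it is NOT the attached t.fb2. The function name render_lines is also an assumption, and the sketch says nothing about how Okular's converter is structured.

```python
import re
import xml.etree.ElementTree as ET

# FictionBook 2.0 namespace in ElementTree's "{uri}tag" notation.
FB2_NS = "{http://www.gribuser.ru/xml/fictionbook/2.0}"

# Hypothetical sample (not the attached t.fb2): wrapped, indented paragraphs
# and a quotation with an attribution.
SAMPLE = """<FictionBook xmlns="http://www.gribuser.ru/xml/fictionbook/2.0">
  <body>
    <section>
      <p>One
         two three.</p>
      <p>Four five
             six.</p>
      <cite>
        <p>Such white space is typically not intended
           for inclusion in the delivered version of the document.</p>
        <text-author>XML 1.0 Spec, 2.10</text-author>
      </cite>
    </section>
  </body>
</FictionBook>"""


def render_lines(fb2_source):
    """Return one whitespace-normalized line per <p> or <text-author>."""
    root = ET.fromstring(fb2_source)
    wanted = (FB2_NS + "p", FB2_NS + "text-author")
    lines = []
    for elem in root.iter():  # document order
        if elem.tag in wanted:
            # itertext() also includes text inside inline children.
            text = "".join(elem.itertext())
            # Collapse runs of XML white space and trim the ends.
            lines.append(re.sub(r"[ \t\r\n]+", " ", text).strip(" "))
    return lines


if __name__ == "__main__":
    for line in render_lines(SAMPLE):
        print(line)
```

With this sample, the first two lines come out as "One two three." and "Four five six.", and the attribution appears as its own line instead of being dropped.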
Comment 1 Sergio 2016-02-06 10:44:02 UTC
Created attachment 97042
test file t.fb2
Comment 2 Sergio 2016-02-06 10:53:24 UTC
Created attachment 97043
A correct rendition by CoolReader
Comment 3 Justin Zobel 2020-11-08 10:55:52 UTC
Created attachment 133141
Screenshot

Confirmed on okular from git master. Layout is all wrong.

I believe the author details are now showing correctly though.
Comment 4 Justin Zobel 2020-11-08 10:56:48 UTC
Comment 5 soshial 2022-01-01 17:33:57 UTC
The bug still persists on v21.12.0.
Comment 6 Bug Janitor Service 2022-01-01 19:28:39 UTC
A possibly relevant merge request was started @ https://invent.kde.org/graphics/okular/-/merge_requests/529
Comment 7 Albert Astals Cid 2022-01-04 12:08:47 UTC
Git commit 3400e8aee65ef77eccebd2b7a8d4a65428b55010 by Albert Astals Cid, on behalf of Yuri Chornoivan.
Committed on 04/01/2022 at 12:08.
Pushed by aacid into branch 'master'.

Remove extra spaces in FB2 paragraphs

M  +1    -1    generators/fictionbook/converter.cpp