SUMMARY
Okular fails to render strikethrough text (written as ~~ ... ~~) in Markdown documents when the document also contains characters that Discount (https://www.pell.portland.or.us/~orc/Code/discount/) converts into HTML entities, such as &hellip; (…) or &copy; (©). This might affect all of the SmartyPants substitutions: https://www.pell.portland.or.us/~orc/Code/discount/#smartypants

We have already discussed this here: https://www.reddit.com/r/kde/comments/1o86fz9/kate_and_kwrite_know_and_can_display_markdown/

OBSERVED RESULT
Strikethrough text is not rendered as such.

EXPECTED RESULT
Strikethrough text is rendered as such.

SOFTWARE/OS VERSIONS
Operating System: Fedora Linux 42
KDE Plasma Version: 6.4.5
KDE Frameworks Version: 6.19.0
Qt Version: 6.9.2
Kernel Version: 6.16.12-200.fc42.x86_64 (64-bit)
Graphics Platform: Wayland

ADDITIONAL INFORMATION
https://imgbox.com/g/63LQenR1jY
https://imgbox.com/51ZHNRej (the better one)
*** Bug 510944 has been marked as a duplicate of this bug. ***
I think I see where this is going wrong. It's this line:
https://invent.kde.org/graphics/okular/-/blob/420d551a9fe9c2904f67c6b47efdde7b9e4faa98/generators/markdown/converter.cpp#L54

QDomDocument::setContent() parses XML. Discount emits named character references which are valid HTML but *not* valid XML. This can be demonstrated by inserting the line

    qDebug() << dom.setContent(html).errorMessage;

and then loading a file containing an ellipsis. Okular prints "Entity 'hellip' not declared."
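To make the failure mode above a bit more concrete, here is a minimal standard-library sketch (no Qt) that lists the named character references in a piece of HTML which an XML parser would not know about. It assumes only that XML predefines exactly five entities (amp, lt, gt, quot, apos). The function name `undeclaredXmlEntities` is made up for illustration, and this is not how QDomDocument works internally, just a rough way to spot the inputs that would trip it.

```cpp
#include <cctype>
#include <set>
#include <string>
#include <vector>

// Hypothetical helper (not part of Okular or Qt): returns the names of all
// named character references in `html` that are NOT among the five entities
// predefined by XML. Numeric references such as &#8230; are ignored.
std::vector<std::string> undeclaredXmlEntities(const std::string &html)
{
    static const std::set<std::string> predefined = {"amp", "lt", "gt", "quot", "apos"};
    std::vector<std::string> result;
    std::size_t pos = 0;
    while ((pos = html.find('&', pos)) != std::string::npos) {
        std::size_t end = pos + 1;
        // Treat a run of ASCII letters/digits after '&' as the candidate name.
        while (end < html.size() && std::isalnum(static_cast<unsigned char>(html[end]))) {
            ++end;
        }
        const bool terminated = end < html.size() && html[end] == ';';
        if (terminated && end > pos + 1) {
            const std::string name = html.substr(pos + 1, end - pos - 1);
            if (predefined.count(name) == 0) {
                result.push_back(name);
            }
        }
        pos = end; // always advances by at least one character
    }
    return result;
}
```

Called on a string containing `&hellip;`, `&amp;` and `&copy;`, this reports `hellip` and `copy` but not `amp`, which matches the "Entity 'hellip' not declared." message above.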
> and then loading a file containing an ellipsis. Okular prints "Entity 'hellip' not declared."

nice finding! thx for digging into it
A possibly relevant merge request was started @ https://invent.kde.org/graphics/okular/-/merge_requests/1270
I've put in an MR for, as it were, the stupid solution: just fix the tags with a plain-text find and replace. I came across a couple of alternative solutions which seemed complex beyond my comfort level, and also possibly bad ideas anyway: 1. Something a bit like this: https://invent.kde.org/education/rkward/-/commit/4ea710a77a90f1329ab57661283495bffdffe42c#6f1ea88f82d81036331f50a8ce6c32e36556e4e0 2. I *thought* QDomDocument offered a way of declaring additional entity references, or of handling undeclared ones, but a) I can't actually find it any more and b) that feels a bit too much like trying to build an HTML parser on top of the XML parser.
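For readers who want to see the shape of the "stupid solution", a plain-text tag swap can be sketched roughly as below. This is a standard-library stand-in, not the code from the MR: the commit message says the real change uses `QString::replace()`, and the helper names `replaceAll` and `delToS` here are hypothetical. It also assumes Discount emits bare `<del>` and `</del>` tags with no attributes.

```cpp
#include <string>

// Replace every occurrence of `from` in `text` with `to` using plain
// substring matching (no HTML or XML parsing involved).
std::string replaceAll(std::string text, const std::string &from, const std::string &to)
{
    if (from.empty()) {
        return text; // nothing sensible to search for
    }
    std::size_t pos = 0;
    while ((pos = text.find(from, pos)) != std::string::npos) {
        text.replace(pos, from.size(), to);
        pos += to.size(); // continue after the inserted text
    }
    return text;
}

// Hypothetical helper: swap <del>...</del> for <s>...</s>.
// Assumption: the tags carry no attributes, so exact string matches suffice.
std::string delToS(const std::string &html)
{
    return replaceAll(replaceAll(html, "<del>", "<s>"), "</del>", "</s>");
}
```

Because nothing is parsed, named references such as `&hellip;` simply pass through untouched, which is the point of the approach. The trade-off is that any `<del>` written with attributes would be missed.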
A possibly relevant merge request was started @ https://invent.kde.org/graphics/okular/-/merge_requests/1275
Git commit 6566838bb259622f023476af17f753ae4a9b3530 by Sune Vuorela, on behalf of Ben Morris.
Committed on 14/11/2025 at 09:36.
Pushed by sune into branch 'master'.

Do not process HTML with QDomDocument

`QDomDocument` was used to replace `<del>` tags in Discount's output with `<s>` tags. `QDomDocument::setContent()` parses XML only. This usually works, because Discount's HTML is usually valid XML. However, since it is in "Smartypants" mode, Discount generates named character references in response to certain inputs, e.g. `(c)` -> `&copy;` and `...` -> `&hellip;`. These are valid HTML, but most are not predefined in standard XML, and so QDomDocument refuses to parse them.

This MR uses `QString::replace()` in place of `QDomDocument`. I know it's generally frowned upon to process HTML by such simple approaches, but within the constraints of HTML which Discount generates, I can't see a way that this could go wrong.

M +6 -18 autotests/markdowntest.cpp
M +4 -28 generators/markdown/converter.cpp

https://invent.kde.org/graphics/okular/-/commit/6566838bb259622f023476af17f753ae4a9b3530
Git commit 2080ad79ab08d17c5f7f244bad36c108a69bd7f1 by Sune Vuorela.
Committed on 14/11/2025 at 09:38.
Pushed by sune into branch 'release/25.12'.

Do not process HTML with QDomDocument

`QDomDocument` was used to replace `<del>` tags in Discount's output with `<s>` tags. `QDomDocument::setContent()` parses XML only. This usually works, because Discount's HTML is usually valid XML. However, since it is in "Smartypants" mode, Discount generates named character references in response to certain inputs, e.g. `(c)` -> `&copy;` and `...` -> `&hellip;`. These are valid HTML, but most are not predefined in standard XML, and so QDomDocument refuses to parse them.

This MR uses `QString::replace()` in place of `QDomDocument`. I know it's generally frowned upon to process HTML by such simple approaches, but within the constraints of HTML which Discount generates, I can't see a way that this could go wrong.

(cherry picked from commit 6566838bb259622f023476af17f753ae4a9b3530)

9acf9eeb Do not process HTML with QDomDocument
1337effa Removed wrapper tags from Markdown tests
8238e4e7 Whitespace-only changes to some Markdown tests

Co-authored-by: Ben Morris <bugs@benmorris.org.uk>

M +6 -18 autotests/markdowntest.cpp
M +4 -28 generators/markdown/converter.cpp

https://invent.kde.org/graphics/okular/-/commit/2080ad79ab08d17c5f7f244bad36c108a69bd7f1
(In reply to Bug Janitor Service from comment #4)
> A possibly relevant merge request was started @
> https://invent.kde.org/graphics/okular/-/merge_requests/1270

I ended up being convinced it is an improvement so I merged it. Note that it just missed the 25.12 beta cutoff, but will be in the RC.