Bug 449942

Summary: Wrong ÄÖÜ umlauts displayed as Korean (UTF-problems?)
Product: [Applications] kate
Reporter: Henning <boredsquirrel>
Component: plugin-preview
Assignee: KWrite Developers <kwrite-bugs-null>
Status: RESOLVED FIXED
Severity: normal
CC: hankangshuai
Priority: NOR
Version First Reported In: 21.12.2
Target Milestone: ---
Platform: Kubuntu
OS: Linux
Latest Commit:
Version Fixed/Implemented In:
Sentry Crash Report:
Attachments: Screenshot of a test.md, couldnt reproduce the Korean letters.

Description Henning 2022-02-10 15:48:19 UTC
Created attachment 146538 [details]
Screenshot of a test.md, couldnt reproduce the Korean letters.

SUMMARY


------- STEPS TO REPRODUCE
1. Open a markdown file in Kate
2. Add the preview plugin and under "View -> Tool-View(?)" activate Preview.
3. Start the preview

-------- OBSERVED RESULT
As in bug report 384972, umlauts and the sharp "ß" are displayed in an extremely weird way:

ä - "채"
ö - "철"
ü - "체"
ß - "ß"

When adding another {ä,ö,ü}, with or without a space in between:
ää - "ä ä "
öö - "ö ö "
ü   - "Ã¼"

Now, when adding the second "ü", **ALL** of them are converted back to the Korean characters from the beginning, but doubled.
When writing a {ß,ö,ü,ä} in the row below (row 4), all the characters are converted to the second form (not Korean).
When writing a regular letter, nothing happens; they stay Korean.
When removing one of the second letters (ä or ö in this case) from the first two lines, they stay Korean too, so it only depends on row 3.
When writing another {ß,ö,ä,ü} next to one of the umlauts, they change to the second form too.
Adding line breaks below doesn't change it.

I found this German site about the problem:
https://php-de.github.io/jumpto/utf-8/
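
The two forms listed above look like what happens when the UTF-8 bytes of an umlaut are decoded with a different encoding. The following Python sketch illustrates that assumption only; it is not taken from Kate's code. The helper `misdecode` and the choice of EUC-KR and Latin-1 as the "wrong" encodings are hypothetical.

```python
# Hypothetical helper: encode text as UTF-8, then decode the bytes with the wrong codec.
def misdecode(text: str, wrong_encoding: str) -> str:
    return text.encode("utf-8").decode(wrong_encoding, errors="replace")


def mojibake_table(samples=("ä", "ö", "ü")):
    """Map each sample to its rendering under two assumed wrong encodings."""
    return {
        ch: {
            "euc_kr": misdecode(ch, "euc_kr"),    # assumed source of the Korean form
            "latin-1": misdecode(ch, "latin-1"),  # assumed source of the second form
        }
        for ch in samples
    }


if __name__ == "__main__":
    for ch, forms in mojibake_table().items():
        print(ch, forms["euc_kr"], forms["latin-1"])
```

If the assumption holds, the EUC-KR column corresponds to the Korean form and the Latin-1 column to the second form reported above.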


-------- EXPECTED RESULT
Markdown preview with normal UTF-8 symbols and consistent behavior

Operating System: Kubuntu 21.10
KDE Plasma Version: 5.24.0
KDE Frameworks Version: 5.90.0
Qt Version: 5.15.2
Kernel Version: 5.13.0-28-generic (64-bit)
Graphics Platform: X11

ADDITIONAL INFORMATION
Probably a bug in Okular.
I couldn't reproduce the Korean symbols in a test.md created through Dolphin.
Comment 1 Christoph Cullmann 2022-05-26 19:59:49 UTC
Git commit 5f423a11b4e45978daa68d019742561b03d83f20 by Christoph Cullmann.
Committed on 26/05/2022 at 19:58.
Pushed by cullmann into branch 'master'.

use UTF-8 encoding

Markdown is a relatively new format

I would assume the majority of Markdown files are
either plain ANSI or UTF-8

the old code used the local encoding,
this leads to issues with Unicode files
Related: bug 452659

M  +2    -0    src/markdownpart.cpp

https://invent.kde.org/utilities/markdownpart/commit/5f423a11b4e45978daa68d019742561b03d83f20
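
The commit itself is a two-line change in src/markdownpart.cpp and is not reproduced here. As a rough illustration of the idea in the commit message (decode Markdown as UTF-8 instead of relying on the local encoding), a minimal Python sketch might look as follows. `read_markdown`, `demo`, and the use of Latin-1 as a stand-in for a non-UTF-8 local encoding are assumptions made for the example.

```python
import os
import tempfile


def read_markdown(path: str, encoding: str = "utf-8") -> str:
    """Read a Markdown file with an explicit encoding (UTF-8 unless told otherwise)."""
    with open(path, "r", encoding=encoding, newline="") as handle:
        return handle.read()


def demo():
    """Write a UTF-8 file, then read it back once as UTF-8 and once as Latin-1."""
    text = "ä ö ü ß\n"
    fd, path = tempfile.mkstemp(suffix=".md")
    try:
        with os.fdopen(fd, "wb") as handle:
            handle.write(text.encode("utf-8"))
        utf8_result = read_markdown(path)
        # Latin-1 stands in for a legacy local encoding (assumption).
        legacy_result = read_markdown(path, encoding="latin-1")
    finally:
        os.remove(path)
    return utf8_result, legacy_result
```

Reading with the explicit UTF-8 default returns the text unchanged, while the legacy read yields the garbled form.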
Comment 2 Christoph Cullmann 2022-06-17 18:19:40 UTC
*** Bug 453731 has been marked as a duplicate of this bug. ***