Bug 449942 - Wrong ÄÖÜ umlauts displayed as Korean (UTF-problems?)
Summary: Wrong ÄÖÜ umlauts displayed as Korean (UTF-problems?)
Status: RESOLVED FIXED
Alias: None
Product: kate
Classification: Applications
Component: plugin-preview
Version: 21.12.2
Platform: Kubuntu Linux
Importance: NOR normal
Target Milestone: ---
Assignee: KWrite Developers
URL:
Keywords:
Duplicates: 453731
Depends on:
Blocks:
 
Reported: 2022-02-10 15:48 UTC by Henning
Modified: 2022-06-17 18:19 UTC

See Also:
Latest Commit:
Version Fixed In:
Sentry Crash Report:


Attachments
Screenshot of a test.md; couldn't reproduce the Korean letters. (47.96 KB, image/png)
2022-02-10 15:48 UTC, Henning
Description Henning 2022-02-10 15:48:19 UTC
Created attachment 146538
Screenshot of a test.md; couldn't reproduce the Korean letters.


------- STEPS TO REPRODUCE
1. Open a markdown file in Kate
2. Enable the preview plugin, then activate Preview under "View -> Tool-View(?)".
3. Start the preview

-------- OBSERVED RESULT
The preview shows umlauts and the sharp "ß" in an extremely weird way (see also bug report 384972).

ä - "채"
ö - "철"
ü - "체"
ß - "ß"

When adding another {ä,ö,ü}, with or without a space in between:
ää - "ä ä "
öö - "ö ö "
ü   - "Ã¼"

Now, when adding the second "ü", **ALL** characters are converted to the Korean signs shown at the beginning, but doubled.
When writing a {ß,ö,ü,ä} in the row below (row 4), all the signs are converted to the second form (not Korean).
When writing a regular letter, nothing happens; they stay Korean.
When removing one of the second letters (ä or ö in this case) from the first two lines, they stay Korean too, so it only depends on row 3.
When writing another {ß,ö,ä,ü} next to one of the umlauts, they change to the second form too.
Adding line breaks below doesn't change it.
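
The two forms described above look like the same UTF-8 bytes being read with two different wrong encodings. The following Python sketch is only an illustration of that idea, not code from Kate or the preview plugin. The helper name `misdecode` is hypothetical, and the choice of Latin-1 and EUC-KR as the wrong encodings is an assumption based on the symptoms in this report.

```python
# Illustration only: take the UTF-8 bytes of a character and decode them
# with a different codec, the way a viewer with a wrong encoding guess would.
# Latin-1 and EUC-KR are assumed candidates, not confirmed by the report.

def misdecode(text: str, wrong_encoding: str) -> str:
    """Encode text as UTF-8, then decode those bytes with another codec."""
    return text.encode("utf-8").decode(wrong_encoding, errors="replace")

for ch in "äöü":
    raw = ch.encode("utf-8")
    # Each umlaut is two bytes in UTF-8. Latin-1 shows them as two
    # characters; EUC-KR treats the pair as a single Hangul syllable.
    print(ch, raw.hex(), repr(misdecode(ch, "latin-1")), repr(misdecode(ch, "euc_kr")))
```

If an encoding is guessed from the file content, adding or removing a single non-ASCII character could change the guess for the whole file, which would be consistent with all lines switching form at once.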

I found this German site about the problem:
https://php-de.github.io/jumpto/utf-8/


-------- EXPECTED RESULT
The Markdown preview shows normal UTF-8 characters and behaves consistently.

Operating System: Kubuntu 21.10
KDE Plasma Version: 5.24.0
KDE Frameworks Version: 5.90.0
Qt Version: 5.15.2
Kernel Version: 5.13.0-28-generic (64-bit)
Graphics Platform: X11

ADDITIONAL INFORMATION
Probably a bug in Okular.
I couldn't reproduce the Korean symbols in a test.md created through Dolphin.
Comment 1 Christoph Cullmann 2022-05-26 19:59:49 UTC
Git commit 5f423a11b4e45978daa68d019742561b03d83f20 by Christoph Cullmann.
Committed on 26/05/2022 at 19:58.
Pushed by cullmann into branch 'master'.

use UTF-8 encoding

markdown is a relatively new format

I would assume the majority of markdown files are
either plain ANSI or UTF-8

the old code used the local encoding,
which leads to issues with Unicode files
Related: bug 452659

M  +2    -0    src/markdownpart.cpp

https://invent.kde.org/utilities/markdownpart/commit/5f423a11b4e45978daa68d019742561b03d83f20
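
The idea in the commit message, reading Markdown as UTF-8 instead of the local encoding, can be sketched in Python as follows. This is not the code from the commit (which is C++ in src/markdownpart.cpp). The function name `read_markdown` is hypothetical, and Latin-1 stands in for a non-UTF-8 local encoding as an assumption.

```python
import os
import tempfile

def read_markdown(path: str, encoding: str = "utf-8") -> str:
    """Read a Markdown file with an explicit encoding (UTF-8 by default)."""
    with open(path, "r", encoding=encoding) as f:
        return f.read()

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "test.md")
    # Write a small UTF-8 encoded Markdown file containing umlauts.
    with open(path, "wb") as f:
        f.write("# Umlaute: ä ö ü ß\n".encode("utf-8"))

    fixed = read_markdown(path)                      # explicit UTF-8
    legacy = read_markdown(path, encoding="latin-1")  # assumed local encoding

print(repr(fixed))
print(repr(legacy))
```

Plain ASCII files read the same either way, so defaulting to UTF-8 only changes the result for files that contain non-ASCII bytes.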
Comment 2 Christoph Cullmann 2022-06-17 18:19:40 UTC
*** Bug 453731 has been marked as a duplicate of this bug. ***