Bug 511323

Summary: HTML export adds extra character and nulls for lines with CJK characters
Product: [Applications] konsole
Reporter: gwdx
Component: general
Assignee: Konsole Bugs <konsole-bugs-null>
Status: RESOLVED FIXED    
Severity: minor    
Priority: NOR    
Version First Reported In: 25.04.2   
Target Milestone: ---   
Platform: Other   
OS: Linux   
Latest Commit:
Version Fixed/Implemented In:
Sentry Crash Report:
Attachments:
  In the HTML file, the line ends with an extra character.
  Open the HTML file in an editor. An extra character and null characters are present in the file.

Description gwdx 2025-10-29 14:21:59 UTC
Created attachment 186294
In the HTML file, the line ends with an extra character.

SUMMARY
HTML export inserts an extra character at the end of the line and null characters between CJK characters.

STEPS TO REPRODUCE
1. Use the following command to print CJK characters:
  ```
  echo "中文测试行"
  ```
2. Use "Save Output As" to export the output as an HTML file.
3. Open the exported HTML file.

OBSERVED RESULT
The exported line ends with an extra character, and additional null characters are present between CJK characters in the HTML file.

EXPECTED RESULT
The line should end with no extra characters, and the file should not contain null characters.
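
One way to check an exported file against this expectation is to count the NUL bytes in it. The following is a minimal sketch; `readFile` and `countNulBytes` are hypothetical helpers written for this report and are not part of Konsole.

```cpp
#include <algorithm>
#include <cstddef>
#include <fstream>
#include <iterator>
#include <string>

// Hypothetical helper: read a whole file as raw bytes.
// Returns an empty string if the file cannot be opened.
std::string readFile(const std::string &path) {
    std::ifstream in(path, std::ios::binary);
    return std::string(std::istreambuf_iterator<char>(in),
                       std::istreambuf_iterator<char>());
}

// Hypothetical helper: count NUL bytes in the data.
// For a correct export the result should be 0.
std::size_t countNulBytes(const std::string &data) {
    return static_cast<std::size_t>(
        std::count(data.begin(), data.end(), '\0'));
}
```

For example, `countNulBytes(readFile("output.html"))` should return 0 for a correct export, where `output.html` is a placeholder for the file saved in step 2.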

SOFTWARE/OS VERSIONS
Linux: Debian GNU/Linux 13
KDE Plasma Version: 6.3.6
KDE Frameworks Version: 6.13.0
Qt Version: 6.8.2

ADDITIONAL INFORMATION

Comment 1 gwdx 2025-10-29 14:28:20 UTC
Created attachment 186295
Open the HTML file in an editor. An extra character and null characters are present in the file.

Comment 2 Bug Janitor Service 2025-10-29 14:49:35 UTC
A possibly relevant merge request was started @ https://invent.kde.org/utilities/konsole/-/merge_requests/1139

Comment 3 Kurt Hindenburg 2025-11-24 23:18:22 UTC
Git commit 33d0fa3b324fd6e244657bf8b2e01efe84fec343 by Kurt Hindenburg, on behalf of Wendi Gan.
Committed on 24/11/2025 at 23:18.
Pushed by hindenburg into branch 'master'.

Remove extra character and nulls in HTML export for lines with CJK

Issue:  
When exporting lines containing CJK characters, the HTML output may
contain an extra character at the end of the line and null
characters between CJK characters.

Change:
- Mark the position after a CJK character as the right half of a
  double-width character (null character).
- Ignore null characters when exporting to HTML.

M  +3    -1    src/Screen.cpp
M  +2    -0    src/decoders/HTMLDecoder.cpp

https://invent.kde.org/utilities/konsole/-/commit/33d0fa3b324fd6e244657bf8b2e01efe84fec343
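
The two changes listed in the commit message can be illustrated with a small standalone model. This is only a sketch under stated assumptions, not the code from src/Screen.cpp or src/decoders/HTMLDecoder.cpp: `Cell`, `isWide`, `writeLine`, `appendUtf8`, and `exportLineAsHtml` are hypothetical names, and the width check covers only the CJK Unified Ideographs block (U+4E00 to U+9FFF) rather than a full width table.

```cpp
#include <string>
#include <vector>

// Hypothetical stand-in for one terminal cell (not Konsole's own type).
struct Cell {
    char32_t code;
};

// Simplifying assumption: only CJK Unified Ideographs are double-width.
bool isWide(char32_t c) {
    return c >= 0x4E00 && c <= 0x9FFF;
}

// Store text in a row of cells. A wide character occupies two cells;
// the second cell is marked as its right half with the null character.
std::vector<Cell> writeLine(const std::u32string &text) {
    std::vector<Cell> line;
    for (char32_t c : text) {
        line.push_back(Cell{c});
        if (isWide(c)) {
            line.push_back(Cell{U'\0'});
        }
    }
    return line;
}

// Append one code point to a UTF-8 encoded string.
void appendUtf8(std::string &out, char32_t c) {
    if (c < 0x80) {
        out += static_cast<char>(c);
    } else if (c < 0x800) {
        out += static_cast<char>(0xC0 | (c >> 6));
        out += static_cast<char>(0x80 | (c & 0x3F));
    } else if (c < 0x10000) {
        out += static_cast<char>(0xE0 | (c >> 12));
        out += static_cast<char>(0x80 | ((c >> 6) & 0x3F));
        out += static_cast<char>(0x80 | (c & 0x3F));
    } else {
        out += static_cast<char>(0xF0 | (c >> 18));
        out += static_cast<char>(0x80 | ((c >> 12) & 0x3F));
        out += static_cast<char>(0x80 | ((c >> 6) & 0x3F));
        out += static_cast<char>(0x80 | (c & 0x3F));
    }
}

// Export one line as HTML text, ignoring right-half (null) cells so
// that no NUL bytes reach the output.
std::string exportLineAsHtml(const std::vector<Cell> &line) {
    std::string out;
    for (const Cell &cell : line) {
        if (cell.code == U'\0') {
            continue; // right half of a double-width character
        }
        switch (cell.code) {
        case U'<': out += "&lt;"; break;
        case U'>': out += "&gt;"; break;
        case U'&': out += "&amp;"; break;
        default: appendUtf8(out, cell.code); break;
        }
    }
    return out;
}
```

In this model, `writeLine(U"中文")` yields four cells (two characters, each followed by a null right half), and `exportLineAsHtml` emits only the two characters as UTF-8.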