Bug 498808

Summary: Special character in output cuts rest of output
Product: [Applications] konsole Reporter: g111
Component: general Assignee: Konsole Developer <konsole-devel>
Status: REPORTED ---    
Severity: normal CC: matan
Priority: NOR    
Version First Reported In: 23.08.5   
Target Milestone: ---   
Platform: Ubuntu   
OS: Linux   
Latest Commit: Version Fixed In:
Sentry Crash Report:
Attachments: Example text with a character that breaks the output

Description g111 2025-01-17 16:07:09 UTC
Created attachment 177465 [details]
Example text with a character that breaks the output

SUMMARY
I am using e.g. "tail -f" to watch a logfile. If the file contains a special character (see the attached example file, which has the problematic character in the street field), the output stops at this character and nothing more is printed. The same happens when printing the file with "cat".

The konsole profile encoding is set to utf-8.

The special character was originally valid UTF-8, but somehow, maybe when it was written to the log file, it was encoded incorrectly and the string got garbled. Nevertheless, the output should be printed in konsole.

As a workaround you can pipe the output to "cat -v". Or you can use xterm instead of konsole. Here the whole file is printed as expected.
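
A filter in the spirit of the "cat -v" workaround could look roughly like the sketch below. It is only an illustration: the names make_printable and filter_stream are made up for this example, the sample line is a hypothetical stand-in for the attached file, and the choice to escape only the C1 range U+0080..U+009F is an assumption about which characters cause trouble.

```python
import io


def make_printable(text: str) -> str:
    """Replace C1 control characters (U+0080..U+009F) with a visible <U+XXXX> tag."""
    return "".join(
        f"<U+{ord(ch):04X}>" if 0x80 <= ord(ch) <= 0x9F else ch
        for ch in text
    )


def filter_stream(src, dst) -> None:
    """Copy UTF-8 lines from binary stream src to text stream dst, escaping C1 controls."""
    for raw in src:
        # Undecodable bytes are replaced instead of raising an error.
        dst.write(make_printable(raw.decode("utf-8", errors="replace")))
        dst.flush()


# Demo on an in-memory sample (hypothetical content, not the attached file).
sample = io.BytesIO("street: Meine Stra\u00c3\u009fe\nnext line\n".encode("utf-8"))
out = io.StringIO()
filter_stream(sample, out)
print(out.getvalue(), end="")
```

To use it on a live log, one could call filter_stream(sys.stdin.buffer, sys.stdout) in a small script and pipe "tail -f" into that script; the line-by-line loop is meant to keep the output streaming.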

STEPS TO REPRODUCE
1. save the attached file test-output.txt
2. open konsole and do "cat test-output.txt" or "tail -100 test-output.txt"

OBSERVED RESULT
The output stops after "Meine StraÃ"

EXPECTED RESULT
The whole file should be printed

SOFTWARE/OS VERSIONS
Windows: 
macOS: 
(available in the Info Center app, or by running `kinfo` in a terminal window)
Linux/KDE Plasma: 
KDE Plasma Version: 
KDE Frameworks Version: 
Qt Version: 

ADDITIONAL INFORMATION
Comment 1 g111 2025-01-17 16:10:14 UTC
I missed the SOFTWARE/OS VERSIONS section. Here is the information:

Operating System: Kubuntu 24.04
KDE Plasma Version: 5.27.11
KDE Frameworks Version: 5.115.0
Qt Version: 5.15.13
Kernel Version: 6.8.0-51-generic (64-bit)
Graphics Platform: X11
Graphics Processor: Mesa Intel® Xe Graphics
Comment 2 Matan Ziv-Av 2025-01-17 22:29:10 UTC
(In reply to g111 from comment #0)

I am not sure this is a bug. This seems to be doubly encoded data: you want U+00DF (ß), which in UTF-8 is the byte sequence 0xc3 0x9f.

Instead you have the byte sequence 0xc3 0x83 0xc2 0x9f, which decodes to U+00C3 U+009F.

U+00C3 is Ã (A with tilde),

U+009F is a C1 control character (APC, Application Program Command), so everything after it is treated as part of the control string rather than as characters to be printed.

Please note that xterm and xfce4-terminal behave like konsole.
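
The double encoding can be reproduced in a few lines of Python. This is a sketch under one assumption: that the first UTF-8 encoding was misread as Latin-1 before being encoded again (Latin-1 maps byte 0x9f directly to U+009F). The bytes are built from U+00DF here, not read from the attachment.

```python
intended = "\u00df"                      # U+00DF, the character that was meant
utf8_once = intended.encode("utf-8")     # correct UTF-8 encoding

# Assumed mistake: the UTF-8 bytes are interpreted as Latin-1 text ...
misread = utf8_once.decode("latin-1")
# ... and that text is encoded as UTF-8 a second time.
utf8_twice = misread.encode("utf-8")

# What a UTF-8 terminal sees when it decodes the doubly encoded bytes.
code_points = [f"U+{ord(ch):04X}" for ch in utf8_twice.decode("utf-8")]

print(utf8_once.hex(" "))    # c3 9f
print(utf8_twice.hex(" "))   # c3 83 c2 9f
print(code_points)           # ['U+00C3', 'U+009F']
```

The second code point is the APC control character, which is why the terminal stops printing after the Ã.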