Summary: | If I cut -b1 arabic_file in konsole, the output displayed in screen is not correct | ||
---|---|---|---|
Product: | [Applications] konsole | Reporter: | Munzir Taha <munzirtaha> |
Component: | general | Assignee: | Konsole Developer <konsole-devel> |
Status: | RESOLVED FIXED | ||
Severity: | normal | ||
Priority: | NOR | ||
Version: | unspecified | ||
Target Milestone: | --- | ||
Platform: | Unlisted Binaries | ||
OS: | Linux | ||
Attachments: | a sample arabic file |
Description
Munzir Taha
2004-06-12 21:42:54 UTC
Which encoding is this?

The original arabic_file is UTF-8:

$ file arabic_file
arabic_file: UTF-8 Unicode text

But since an Arabic character in UTF-8 is two bytes, cutting the first byte won't generate anything useful. Squares may indicate an invalid UTF-8 sequence, but no leading spaces should be embedded anyway.

Just curious, does this behavior occur using xterm or another terminal program? Also, could you post the file you're using?

I don't have the skills to fix it, but I was wondering if it still does this in KDE 3.3. Thanks.

Created attachment 8739
a sample arabic file

Yes, it also happens in xterm. I've just attached a sample arabic_file.

Fixed in Konsole for KDE 4 as a side effect of changing the way in which the incoming character stream is decoded for display. To clarify, the new behaviour when running cut -b1 on the above file is to print 4 blank lines (i.e. invalid character sequences produce nothing at the output).
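
For anyone who wants to reproduce the byte-level effect without a terminal, here is a minimal Python sketch of what cut -b1 does to a line of UTF-8 Arabic text. The sample words and the helper names (cut_first_byte, render) are hypothetical and are not taken from the attached file. The errors="ignore" mode only approximates the "invalid sequences produce nothing" behaviour described above; it is not Konsole's actual decoder.

```python
# Sketch: emulate `cut -b1` on UTF-8 Arabic text and decode the result.
# The sample lines are hypothetical, not the contents of the attached file.
SAMPLE_LINES = [
    "\u0633\u0644\u0627\u0645",  # an Arabic word, 4 letters
    "\u0639\u0627\u0644\u0645",  # another Arabic word, 4 letters
]


def cut_first_byte(line: str) -> bytes:
    """Return only the first byte of the line's UTF-8 encoding."""
    return line.encode("utf-8")[:1]


def render(raw: bytes, errors: str) -> str:
    """Decode raw bytes as UTF-8 using the given error-handling mode."""
    return raw.decode("utf-8", errors=errors)


if __name__ == "__main__":
    for line in SAMPLE_LINES:
        raw = cut_first_byte(line)
        # Each letter in the basic Arabic block takes two bytes in UTF-8,
        # so a single leading byte is an incomplete sequence.
        print("bytes per first letter:", len(line[0].encode("utf-8")))
        print("after cut -b1:", raw)
        # "replace" substitutes U+FFFD, which many fonts draw as a square.
        print("replace mode:", repr(render(raw, "replace")))
        # "ignore" drops the invalid byte, leaving an empty line.
        print("ignore mode:", repr(render(raw, "ignore")))
```

The "replace" mode corresponds roughly to the squares reported originally, while the "ignore" mode corresponds roughly to the blank lines printed after the fix.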