(*** This bug was imported into bugs.kde.org ***)

On Thursday 12 October 2000 05:01 yangbt@legend.com.cn wrote:
> When I selected an iso10646-1 font which includes CJK glyphs, the CJK
> characters are shown correctly in konsole, but the ASCII characters are
> shown at double the width they should be. (So "root" is shown as "r o o t".)

Yang,

konsole uses a fixed-width grid so that characters are placed properly under each other. A likely problem appears when one uses a variable-width font. Now that you are trying to use CJK in konsole, I'd love to learn how it should behave. I'd be glad if you could explain it to me or put me in contact with anyone who uses CJK regularly on an X terminal.

Regards,
Lars
I wouldn't say I'm a regular CJK user, and I'm not sure what the "proper behavior" is, if it is even defined, but I've noticed this on konsole as well. My LANG=en_US.UTF-8. Konsole works great with Latin-X characters, and even Katakana, but when I try to display CJK ideographs or Tibetan characters, extra spacing is inserted horizontally between the characters. I really REALLY am no expert in this, but since KWord doesn't insert spacing, I don't believe Konsole should (okay, I know Konsole has the grid to contend with).

Anyway, the function wcwidth() can be used to determine how many columns a wide character takes up when displayed on the console. I figure it should go like this:

ABCD??EFG

where "??" is a single CJK ideograph that has a wcwidth of 2. Right now, it looks more like

ABCD ?? EFG

And this is the interesting part: when you SELECT CJK text, the highlighted characters are left-shifted so that they take up what I would consider to be the correct number of columns. So if you can figure out what's going on there, it may already be coded! I'll try to create an attachment of some random wide characters for you to play with.
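As a rough illustration of the column counting described above, here is a minimal Python sketch. It is not Konsole's code and it does not call the C wcwidth() function; it approximates wcwidth() using the East Asian Width property from Python's standard unicodedata module. The function name display_columns and the sample ideograph U+4E2D are assumptions chosen for the example, and control characters (for which wcwidth() returns -1) are not handled.

```python
import unicodedata


def display_columns(text: str) -> int:
    """Approximate how many terminal columns `text` occupies.

    Hypothetical helper: a stand-in for summing wcwidth() over the string.
    """
    columns = 0
    for ch in text:
        if unicodedata.combining(ch):
            # Combining marks are drawn over the previous character.
            continue
        if unicodedata.east_asian_width(ch) in ("W", "F"):
            # Wide and fullwidth characters take two grid cells.
            columns += 2
        else:
            columns += 1
    return columns


# "ABCD" + one CJK ideograph + "EFG": 4 + 2 + 3 columns.
line = "ABCD\u4e2dEFG"
print(display_columns(line))  # prints 9
```

With this counting, the ideograph takes exactly two cells and the letters that follow it start in the next cell, with no extra gap on either side.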
Created attachment 1968
Some random Tibetan characters (UTF-8)

These Tibetan characters should take only one or two columns, but they appear to take four apiece. Also of note: KWrite also displays the extra spaces; KWord does not. So this may be an issue with all monospacing. It doesn't explain why konsole behaves differently when selecting text, however. Selecting text in KWrite does not behave this way.
Which version of KDE are you using?
Current released version (I'm at work now--can't check). I've seen this on RedHat and SuSE, both out-of-the-box and updated-to-current. I thought about this last night, and I think I have an explanation of why this is happening: when KDE determines how much room a character takes up in a monospaced font, it sets aside width based on the number of BYTES that make up the character. So since I'm using UTF-8, one-byte single-width characters look okay, and two-byte double-width characters also look okay. But four-byte characters are always wrong. The original reporter was using an encoding where all characters are two bytes (I think), and that's why all characters took two columns. If this is right, this is actually a pretty serious bug in our Unicode support. Search bugs.kde.org for comments containing "wcwidth" and you'll see another bug in KMail, where KMail wraps Japanese messages incorrectly because it's not getting the character width right. If you consider that, with combining diacritical marks, a single character can actually be three or four wide characters, each of which can be up to four bytes long, this could get ugly! Then again, maybe I'm out of my depth. Frankly, I'm very new to Unicode programming.
konsole reserves space based on the result of wcwidth for the given Unicode character. The original encoding of the character plays no role whatsoever. Konsole's CJK handling was broken in KDE 3.1.2 but should be better again in the upcoming KDE 3.1.3. So the problems that you experience may be due to that, or they may be caused by some other, probably font-related, problem. How many characters are there supposed to be in the attachment that you created?
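To make the point about encoding concrete, the following Python sketch compares encoded byte length with column width for a few sample characters. It is an illustration only, not Konsole's implementation: the columns helper is hypothetical and approximates wcwidth() via the East Asian Width property, and the sample characters are assumptions picked for the example.

```python
import unicodedata


def columns(ch: str) -> int:
    """Hypothetical stand-in for wcwidth(): 2 for wide/fullwidth, else 1."""
    return 2 if unicodedata.east_asian_width(ch) in ("W", "F") else 1


samples = {
    "A": "Latin letter",
    "\u00e9": "Latin letter with acute",
    "\u30ab": "Katakana KA",
    "\u0f40": "Tibetan KA",
    "\u4e2d": "CJK ideograph",
}

for ch, label in samples.items():
    utf8_len = len(ch.encode("utf-8"))
    utf16_len = len(ch.encode("utf-16-le"))
    print(
        f"U+{ord(ch):04X} {label}: {utf8_len} UTF-8 bytes, "
        f"{utf16_len} UTF-16 bytes, {columns(ch)} column(s)"
    )
```

Katakana KA and Tibetan KA are both three bytes in UTF-8, yet under this approximation the first takes two columns and the second takes one, so the width has to come from the character itself, not from its byte count.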
The attachment should contain 14 Tibetan characters. I'd be interested to see how KWrite displays the file in KDE 3.1.3-CVS. If it looks okay on your end (and if you'd prefer if I chose a slightly more common language, let me know what multibyte languages you can display) I'll consider that good enough to wait and see how KDE 3.1.3 works out for me. Also, if the text displays fine for me in KWord (even using the same fixed-width font), is it safe to assume the font is okay?
Created attachment 1973
Snapshot of the Tibetan characters in KWrite

Both in Konsole and in KWrite (which use fixed-width fonts), I see 14 single-spaced characters.
Beautiful! Thank you all, I will consider this issue fixed in CVS and will stop bugging you (no pun intended).