Version: (using KDE KDE 3.2.0) Installed from: RedHat RPMs OS: Linux I just upgraded to KDE 3.2 and thought I'd re-test the Unicode support in Konsole. It looks like Konsole cannot display characters from the following Unicode ranges, even if the system supports them: 0x0d80-0x0dff (Sinhala), 0x10000-0x1007f (Linear B Syllabary), and 0xf0000-0xffffd (Supplementary Private Use Area A). It looks like 0x0600-0x06ff (Arabic) may also be broken, but I'm not sure, since I don't know the language and it IS displaying SOMETHING. It doesn't look right, though. My testing was not exhaustive, so there may be more bad ranges. The following free TrueType fonts support these scripts: Free Sans supports Sinhala and Arabic (http://savannah.nongnu.org/projects/freefont/); Penuturesu supports Linear B Syllabary (http://www.i18nguy.com/unicode/unicode-font.html); Code2001 uses the Supplementary PUA for Tengwar and Cirth support (http://home.att.net/~jameskass/code2001.htm). On a positive note, Braille, Katakana, Tibetan, Syriac, and Runic all seem to work quite well. I will attach some text files (UTF-8 encoded) along with some images of how I would expect them to look (more or less), using Mozilla as the reference.
Created attachment 4517 [details] Collection of Unicode test files For each script, there is an HTML file, a UTF-8 encoded TXT file, and a reference PNG image. The PNG images were created by pointing Mozilla at the HTML files, and the HTML files are nothing more than the TXT files with some extra code to tell the browser that it's UTF-8.
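For anyone who wants to rebuild similar test files, here is a minimal Python 3 sketch, not the script used for the attachment. It assumes a simple layout of 16 code points per row; the function name `make_test_files` and the row format are hypothetical. The HTML page is just the text wrapped in a `<pre>` with a UTF-8 charset declaration, matching the description above.

```python
import html


def make_test_files(name, first, last, per_line=16):
    """Return (txt, html_page) strings covering code points first..last.

    Hypothetical helper: the row layout is an assumption, not the format
    of the attached files. Unassigned code points are included as-is.
    """
    rows = []
    for start in range(first, last + 1, per_line):
        end = min(start + per_line - 1, last)
        chars = " ".join(chr(cp) for cp in range(start, end + 1))
        rows.append(f"U+{start:04X}: {chars}")
    txt = "\n".join(rows) + "\n"

    # The HTML file is the same text plus a declaration that it is UTF-8.
    page = (
        "<!DOCTYPE html>\n<html><head>\n"
        '<meta http-equiv="Content-Type" content="text/html; charset=UTF-8">\n'
        f"<title>{html.escape(name)}</title>\n</head><body>\n"
        f"<pre>{html.escape(txt)}</pre>\n</body></html>\n"
    )
    return txt, page
```

Both strings should be written with `encoding="utf-8"`, for example `open("sinhala.txt", "w", encoding="utf-8")`, so that `cat sinhala.txt` in Konsole can be compared against the browser rendering of the HTML file.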
Okay, last comment for now: that's a ZIP archive I attached.
This might be Qt related. Do these files work with things like kate or kedit?
The Bash 2.05b release has 7 pending patches, some of them for wide character input/output. Is your bash up to date?
Dang, mid-air collision! Kate and Kedit also fail. There's also a Konqueror bug filed here: http://bugs.kde.org/show_bug.cgi?id=77348. Could very well be a common Qt bug. Lack of support for the Supplemental Private Use Area could indicate that Qt is limited to a two-byte datatype for wide characters, or it could just be a bug like the others. Konsole/KWrite tend to function a little differently than Konqui because they force everything into monospace, so I filed separate bugs. I'm using Bash 2.05b.0(1)-release (i386-redhat-linux-gnu).
Qt does not support characters outside the Basic Multilingual Plane. That means you're restricted to U+0000 to U+FFFF.
Agreed, lack of support for anything above U+FFFF is a Qt bug. I've already filed a bug with Trolltech to make QChar support 32 bits of data instead of 16. This bug still applies to konsole for Sinhala and Arabic, however. Could also be Qt's font substitution bug http://bugs.kde.org/show_bug.cgi?id=47682 but I doubt it because it still fails when you set Konsole's font to a font which contains these characters.
QString can support 21-bit Unicode chars via UTF-16 surrogate pairs. The problem with that is that QChar won't be able to handle single codepoints. And QString-to-UTF32 conversion will be more difficult.
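To make the surrogate-pair point concrete, here is a small Python 3 sketch of the UTF-16 arithmetic. It does not use or model any Qt API; the function names are made up for illustration. It shows why a single 16-bit unit cannot hold a code point above U+FFFF: such a code point is split into a high and a low surrogate.

```python
def to_surrogate_pair(cp):
    """Split a supplementary-plane code point into (high, low) surrogates."""
    if not 0x10000 <= cp <= 0x10FFFF:
        raise ValueError("only code points above U+FFFF need a surrogate pair")
    offset = cp - 0x10000              # 20 bits remain after subtracting
    high = 0xD800 + (offset >> 10)     # top 10 bits
    low = 0xDC00 + (offset & 0x3FF)    # bottom 10 bits
    return high, low


def from_surrogate_pair(high, low):
    """Recombine a (high, low) surrogate pair into one code point."""
    return 0x10000 + ((high - 0xD800) << 10) + (low - 0xDC00)


def utf16_units(text):
    """Return the 16-bit code units Python produces when encoding text."""
    data = text.encode("utf-16-be")
    return [int.from_bytes(data[i:i + 2], "big") for i in range(0, len(data), 2)]


# First code point of the Linear B Syllabary range from this report.
print([hex(u) for u in to_surrogate_pair(0x10000)])
```

Any code that indexes a string one 16-bit unit at a time sees two units here, neither of which is a character on its own.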
Hello, Display of Unicode characters has improved quite a bit in Konsole for KDE 4, mainly because a major bug in character conversion was inadvertently fixed :) Unfortunately I cannot test the Sinhala test text file because I cannot get Sinhalese to display correctly in any Qt application. Advice on how to do this would be helpful.
All I know is that the Free Sans font has the right glyphs. You can get it here: http://savannah.nongnu.org/projects/freefont/ Anything beyond that is beyond me!
Okay, now testing with KDE 4.0.0 (openSUSE 10.3). This is not a complete test of all Unicode ranges, just my little test ranges. This range displays nothing: 0x0f00-0x0fff (Tibetan); KDE3 does NOT have a problem with this. The following ranges appear to be fixed in KDE4: 0x10000-0x1007f (Linear B Syllabary) and 0x0600-0x06ff (Arabic). Either these ranges display with occasional problems or there's a problem with my test files: 0xf0000-0xffffd (Supplementary Private Use Area A) and 0x0d80-0x0dff (Sinhala). I'll try to generate new test files to demonstrate the remaining problems. My previous test Arabic text file appears screwed up, so it's possible I've got bigger problems on my end.
Okay, my "occasional problems" were caused by fonts that were missing a glyph here and there. So the status AFAIK is that everything I've tested works great, with the notable exception of Tibetan, which appears to be a regression from KDE3. For testing purposes, a free font (utibetan.ttf) containing Tibetan glyphs can be downloaded here: http://www.wazu.jp/gallery/Fonts_Tibetan.html. Tibetan is also in the Arial Unicode (arialuni) font, included with Microsoft Office.
I'm using the Konsole that ships with Kubuntu 9.04, and I can't get the Unicode combining diacritic "combining dot above" (U+0307) to work. For example, if I go into Python, I get:
luke@DELL-E1505:~/lib/python/sympy(master)$ python
Python 2.6.2 (release26-maint, Apr 19 2009, 01:56:41)
[GCC 4.3.3] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> print(u'a\u0307')
a
>>>
Which, at least on my computer, is an a with no dot above it. Other diacritics work, though:
>>> print(u'a\u0352')
a͒
>>>
I'm trying to use the dot above some letters for symbolic mathematics, where things like dx/dt would be represented more compactly using the Newton notation for the derivative, i.e., an x with a dot above it. I don't think it is an issue with the font I'm using, DejaVu Sans Mono, which is supposed to support this character. xterm prints this just fine on my computer, with the exact same commands.
Regarding comment 13 - see bug 96536. Konsole (as of 4.3.3) does not support display of NFD (decomposed) text, i.e. a base character followed by combining characters.
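A possible workaround for the dot-above case, sketched in Python 3 (the session in comment 13 used Python 2.6): normalize to NFC before printing, so that sequences with a precomposed equivalent become a single code point. Whether Konsole then renders the precomposed character correctly is an assumption that nobody in this report has tested, and sequences without a precomposed form, such as a + U+0352, are left unchanged. The helper names are hypothetical.

```python
import unicodedata


def compose(text):
    """Return text in NFC, i.e. with precomposed characters where they exist."""
    return unicodedata.normalize("NFC", text)


def describe(text):
    """List the code points in text with their Unicode names."""
    return [f"U+{ord(ch):04X} {unicodedata.name(ch, '<unnamed>')}" for ch in text]


print(describe(compose("x\u0307")))   # x + COMBINING DOT ABOVE
print(describe(compose("a\u0352")))   # a + COMBINING FERMATA, no precomposed form
```

For Newton notation this covers letters that have a dot-above form in Unicode (x and a among them); for other letters the output stays decomposed and still depends on the terminal.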
Some box drawing symbols are drawn as nothing, even with different fonts. (I've only tried the monospace ones that konsole shows in the profile dialog.) The glyphs it has trouble with are "╭╮╯╰╱╲╳". Konsole Version 2.4.2. All other KDE applications are able to draw those symbols. Might be related, might be a different bug.
(In reply to comment #15) > Some box drawing symbols are drawn as nothing, even with different fonts. That problem would be bug #210329.
Created attachment 66388 [details] tibetan characters displayed in konsole (In reply to comment #12) > So the status AFAIK is that everything I've tested works great, with the > notable exception of Tibetan, which appears to be a regression from KDE3. I just tested konsole 2.8 using a snippet from http://www.alanwood.net/unicode/tibetan.html. The result seems fine to me, although I think the width of some characters is calculated and displayed incorrectly. Maybe that width problem is related to bug 41744 and bug 186826.
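As a rough way to reason about the width issue, here is a simplified Python 3 sketch that estimates how many terminal columns a string should occupy. This is not Konsole's algorithm and the function name is hypothetical. It assumes nonspacing and enclosing marks (categories Mn and Me) take zero columns, East Asian Wide and Fullwidth characters take two, and everything else takes one.

```python
import unicodedata


def estimated_columns(text):
    """Estimate terminal columns for text (simplified, see assumptions above)."""
    cols = 0
    for ch in text:
        if unicodedata.category(ch) in ("Mn", "Me"):
            continue  # combining mark: drawn on top of the previous cell
        cols += 2 if unicodedata.east_asian_width(ch) in ("W", "F") else 1
    return cols


# Tibetan letter KA followed by vowel sign I: two code points, one cell.
print(estimated_columns("\u0f40\u0f72"))
```

Tibetan text contains many such nonspacing vowel signs and subjoined letters, so a terminal that gives each code point its own cell would lay it out too wide.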
Closing as FIXED, in the sense that all the specific problems mentioned in this report seem fixed now, not in the sense that Konsole now provides 100% Unicode support.