Summary: | Unicode decomposed text gets garbled in Konsole (NFD mode) | ||
---|---|---|---|
Product: | [Applications] konsole | Reporter: | Thiago Macieira <thiago> |
Component: | general | Assignee: | Konsole Developer <konsole-devel> |
Status: | RESOLVED FIXED | ||
Severity: | minor | CC: | aacid, ach, albbas, chrislb, dahalaishraj, ellingsw+20759, jnelson-kde, kde, maarizwan, mfabian, ott, praveen, sieburgh, yann |
Priority: | NOR | ||
Version: | 1.5 | ||
Target Milestone: | --- | ||
Platform: | unspecified | ||
OS: | Linux | ||
Latest Commit: | Version Fixed In: | 4.8.0 | |
Sentry Crash Report: | |||
Attachments: |
WorkInProgress patch
RB patch w/o const changes and whitespaces changes. |
Description
Thiago Macieira
2005-01-07 17:26:46 UTC
Konsole's internal representation stores one character per screen position and has no room to store the NFD form; for that reason it tries to convert to NFC before storing the characters. When you copy&paste you will get the stored characters. Which is wrong, because it's the wrong representation. Also, it makes some glyphs unreadable, because it will discard some combinations when doing NFC. It's also lacking in the sense that d + acute does not render as ḋ in Konsole.

Changing priority. I have a couple of ideas I may try in the future for KDE 4.

*** Bug 104691 has been marked as a duplicate of this bug. ***

*** Bug 221508 has been marked as a duplicate of this bug. ***

Well it's the future and this problem doesn't appear to be fixed. I'm running KDE 4.3.1 and decomposed characters with diacritical marks are only represented as the base character. For example, in my case ë (that's the decomposed form of e with a umlaut) is just displayed as e. Please fix!

gnome-terminal and xterm both display NFD unicode properly. Due to konsole's inability to display NFD characters I chased a particular bug around in other software for HOURS. Here is an easy way to test (needs python):

#! /usr/bin/python
import unicodedata
u = u'Ha\u0308mikon'
u1 = unicodedata.normalize('NFC', u)
u2 = unicodedata.normalize('NFD', u)
u3 = unicodedata.normalize('NFKD', u)
u4 = unicodedata.normalize('NFKC', u)
print u1, u2, u3, u4

The above should print what appears to be the *same* word 4 times. konsole is the only console tested that does not work. I'm running KDE 4.4.3.

Now on KDE 4.5.0. This is not a "NEW" bug. It's been around for *5 years* and konsole *still* can't display unicode characters properly! Is anybody working on a fix for this? I certainly wouldn't consider this a minor issue.

I'm having a look, but I'm a complete newbie in the konsole codebase so can't promise anything.

Created attachment 60067 [details]
WorkInProgress patch
This is a work-in-progress patch; with it I can show files that contain e + combining ring. If anyone is bored and gives it a try, I'd be happy to hear about the experience.
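For anyone testing, the following Python 3 sketch (not part of the patch; the helper names are made up for illustration) shows why converting to NFC cannot rescue a one-character-per-cell representation: some base + combining-mark sequences have a precomposed form, while others, such as e + combining ring or d + combining acute, stay as two code points even after NFC.

```python
import unicodedata


def code_points(text):
    """Return the code points of text as a list of 'U+XXXX' strings."""
    return [f"U+{ord(ch):04X}" for ch in text]


def nfc_fits_one_cell(decomposed):
    """True if NFC collapses the sequence into a single code point."""
    return len(unicodedata.normalize("NFC", decomposed)) == 1


# Hypothetical sample set: base letter followed by one combining mark.
SAMPLES = {
    "a + combining diaeresis": "a\u0308",
    "e + combining ring above": "e\u030A",
    "d + combining acute": "d\u0301",
}

for label, decomposed in SAMPLES.items():
    composed = unicodedata.normalize("NFC", decomposed)
    print(f"{label}: {code_points(decomposed)} -> NFC {code_points(composed)}")
```

Sequences for which nfc_fits_one_cell returns False are the ones worth checking in a patched Konsole, since they cannot be stored as a single precomposed character.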
Full working patch (as far as my testing goes) at https://git.reviewboard.kde.org/r/101721/

I would really like people testing this since, as far as I know, it works perfectly.

Created attachment 61327 [details]
RB patch w/o const changes and whitespaces changes.
I just removed the const and whitespace changes to get a cleaner diff.
This patch is big and I'm not familiar w/ all the code.
It does appear to fix the given issue.

Kurt, I'm pretty confident about the code (though it has a limitation of only being able to show 65534 different composed characters at a time (a reasonable limitation if you ask me and a sure improvement from not working :D)). Since it is quite a "big-ish" patch and you don't seem comfortable with the code, I propose we commit it to master (that will be KDE 4.8) so we have time to fix stuff if it breaks before the next release. What do you say?

Yes, I should have mentioned that I had planned to commit to master so people could test it.

Any reason you want to do it yourself? I mean it makes more sense if I do it so people can correctly find who to blame from the log :D

Go ahead although I'd prefer if you'd split/pdh the patch into the const, whitespace and the real patch.

Committed to master.

*** Bug 255862 has been marked as a duplicate of this bug. ***

*** Bug 276301 has been marked as a duplicate of this bug. ***

*** Bug 279978 has been marked as a duplicate of this bug. ***

*** Bug 217684 has been marked as a duplicate of this bug. ***

*** Bug 226024 has been marked as a duplicate of this bug. ***

*** Bug 149777 has been marked as a duplicate of this bug. ***

*** Bug 116251 has been marked as a duplicate of this bug. ***

*** Bug 156071 has been marked as a duplicate of this bug. ***

Albert, can you double-check that this patch causes a 'cat tests/9x15.repertoire-utf8' to crash konsole?

Fixed that crash. I had an incorrect assumption.

Removed my votes for this bug as I quit using KDE a while ago due to its bloat.

This problem still exists in Konsole 20.08.3. Running the python program below printing the string Hämikon illustrates the problem.

The program below? where?

(In reply to Albert Astals Cid from comment #30)
> The program below? where?
Sorry for being unclear, I'm talking about the script in comment 7.

Looks good to me https://i.imgur.com/ZQZU3xm.png
Anything I'm missing? The first is Konsole, the second is gnome-terminal.

xterm and uxterm have the same result as gnome-terminal https://imgur.com/a/6SxxG3j

System info:
Operating System: KDE neon 5.20
KDE Plasma Version: 5.20.3
KDE Frameworks Version: 5.76.0
Qt Version: 5.15.1
Kernel Version: 5.4.0-54-generic
OS Type: 64-bit
Processors: 4 × Intel® Core™ i7-5600U CPU @ 2.60GHz
Memory: 15.5 GiB of RAM
Graphics Processor: Mesa Intel® HD Graphics 5500

Locale and such on my machine:
❯ echo $LC_ALL $LANGUAGE $LANG
se_NO.UTF-8 se:nb:en_US nb_NO.UTF-8

Running the script this way:
LC_ALL=C python3 bla.py
gave the same result.

I don't see the problem. Or is the problem "the font I am using in konsole is different to the one I'm using in gnome-terminal and the ä looks weird"?

It has to do with fonts, yes. I found a font without the problem in Konsole, and tested with the same font in gnome-terminal; both show the expected result. I use Hack in konsole, so I tested Hack on gnome-terminal. gnome-terminal is not affected, while konsole is. gnome-terminal on the left, konsole on the right. https://imgur.com/a/ovQA7z2

I'm going to say there's a bug either in Hack or in Qt; not much we can do if it works fine with other fonts.
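
When comparing terminals or fonts as in the last few comments, it can help to confirm which code points are actually being sent, so that a rendering difference is not mistaken for a normalization difference. The following is a Python 3 adaptation of the script from comment 7, offered as a sketch only; the function names are illustrative and the file name bla.py is simply the one used above.

```python
import unicodedata

# Same test word as the script in comment 7: "a" followed by a combining diaeresis.
WORD = "Ha\u0308mikon"


def normalized_forms(text):
    """Map each Unicode normalization form name to the normalized text."""
    return {
        form: unicodedata.normalize(form, text)
        for form in ("NFC", "NFD", "NFKD", "NFKC")
    }


def code_points(text):
    """Render text as a space-separated list of 'U+XXXX' code points."""
    return " ".join(f"U+{ord(ch):04X}" for ch in text)


if __name__ == "__main__":
    for form, value in normalized_forms(WORD).items():
        # The word is what the terminal has to draw; the code points
        # show which form (composed or decomposed) it was given.
        print(f"{form:<4} {value}  [{code_points(value)}]")
```

All four lines should look like the same word. If only the NFD and NFKD lines (the ones listing U+0308) look wrong with a particular font, that points at how the font or the text shaping places the combining mark, not at the terminal's storage of the characters.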