Bug 405272 - KTextEditor: column count wrong for UTF32
Status: RESOLVED NOT A BUG
Alias: None
Product: frameworks-ktexteditor
Classification: Frameworks and Libraries
Component: general
Version: unspecified
Platform: Other All
Importance: NOR normal
Target Milestone: ---
Assignee: KWrite Developers
Reported: 2019-03-09 17:38 UTC by RJVB
Modified: 2019-03-29 10:12 UTC

Attachments
Sample document (1.12 KB, text/plain)
2019-03-09 17:38 UTC, RJVB

Description RJVB 2019-03-09 17:38:07 UTC
Created attachment 118671
Sample document

SUMMARY
Certain emojis have a UTF32 representation and when these appear in text they are counted as 2 characters, leading to invalid column counts.

STEPS TO REPRODUCE
1. Open the attached document in kwrite or kate, advance the cursor over individual lines

OBSERVED RESULT
The column counter in the status bar jumps from 1 to 3 when the cursor advances one character to the right from the beginning of a line that starts with a UTF32 glyph.

EXPECTED RESULT
Such glyphs show as a single glyph and the cursor behaves that way too (= a single left/right press is sufficient to move over them). The counter should behave the same way.

Tested on Mac OS X but I have no reason to suspect this would be any different on other platforms.
Comment 1 Dominik Haumann 2019-03-10 07:25:39 UTC
How does Qt Creator behave here?
Comment 2 RJVB 2019-03-10 08:54:27 UTC
>How does Qt Creator behave here?

Hah, the same.  And re-hah, even less 458 recognises the glyphs as a single 32bit entity...

Are you implying this is a Qt bug? I downloaded the kate appimage thinking it would have a more recent Qt version than the one I use (5.9.7) but it's actually 2 versions older so behaves the same.
Comment 3 Christoph Feck 2019-03-24 11:29:14 UTC
> Are you implying this is a Qt bug?

No. Unicode code points that are not in BMP need two UTF-16 symbols, and when counting characters, code needs to account for these surrogate pairs.

Even more correct would be to use something like "QFontMetrics.width() / characterWidth" to account for combining diacritics or other code points that get merged into a single cell.
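
A minimal sketch of the surrogate-pair accounting described in this comment, written against the C++ standard library rather than Qt so that it stands alone. The names `isHighSurrogateUnit`, `isLowSurrogateUnit` and `countCodePoints` are hypothetical and are not part of Qt or KTextEditor. It counts code points only, so combining diacritics would still be counted separately.

```cpp
#include <cstddef>
#include <string>

// High (leading) surrogates occupy U+D800..U+DBFF.
inline bool isHighSurrogateUnit(char16_t u) { return u >= 0xD800 && u <= 0xDBFF; }

// Low (trailing) surrogates occupy U+DC00..U+DFFF.
inline bool isLowSurrogateUnit(char16_t u) { return u >= 0xDC00 && u <= 0xDFFF; }

// Counts Unicode code points in a UTF-16 string.
// A well-formed surrogate pair counts once; a lone surrogate also counts once.
std::size_t countCodePoints(const std::u16string &text)
{
    std::size_t count = 0;
    for (std::size_t i = 0; i < text.size(); ++i) {
        if (isHighSurrogateUnit(text[i]) && i + 1 < text.size()
            && isLowSurrogateUnit(text[i + 1])) {
            ++i; // skip the low half of the pair
        }
        ++count;
    }
    return count;
}
```

For a line consisting of U+1F600 followed by "ab", `size()` reports 4 UTF-16 code units while `countCodePoints` reports 3, which is the difference between the observed and the expected column in the report.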
Comment 4 Dominik Haumann 2019-03-24 19:01:07 UTC
No, in fact, I was asking since if Qt behaves the same, maybe there is no bug at all?
Comment 5 RJVB 2019-03-24 19:14:27 UTC
Dominik Haumann wrote on 20190324::19:01:07 re: "[frameworks-ktexteditor] [Bug 405272] KTextEditor: column count wrong for UTF32"

> maybe there is no bug at all?

Well, it's understandable that it happens, but does that mean it should happen? If not ... doesn't that make it a bug?
Comment 6 Dominik Haumann 2019-03-25 19:11:15 UTC
This functions as designed. We even have API for this:

1. bool KTextEditor::Document::isValidTextPosition(Cursor) const
https://api.kde.org/frameworks/ktexteditor/html/classKTextEditor_1_1Document.html#a31201a07310caab1246886d06e9b8559

2. bool KTextEditor::DocumentCursor::isValidTextPosition() const
https://api.kde.org/frameworks/ktexteditor/html/classKTextEditor_1_1DocumentCursor.html#adc1e646dd9432cbf7aff21d456550ca8

There is no bug. And it's also correct that this is counted as multiple characters. With input methods, these characters can probably even be edited.
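
As a rough illustration of what a "valid text position" could mean in this context, the sketch below assumes that a column is invalid when it lies past the end of the line or between the two halves of a surrogate pair. It is a standalone approximation using the C++ standard library, not KTextEditor's implementation; `isValidColumn` and its helpers are hypothetical names.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helpers: classify a single UTF-16 code unit.
inline bool isLeadingSurrogate(char16_t u) { return u >= 0xD800 && u <= 0xDBFF; }
inline bool isTrailingSurrogate(char16_t u) { return u >= 0xDC00 && u <= 0xDFFF; }

// Returns true if a cursor may sit at `column` (counted in UTF-16 code units).
// Assumption: a position is invalid only if it is past the end of the line
// or splits a well-formed surrogate pair.
bool isValidColumn(const std::u16string &line, std::size_t column)
{
    if (column > line.size()) {
        return false; // beyond the end of the line
    }
    if (column == 0 || column == line.size()) {
        return true; // line start and line end never split a pair
    }
    return !(isLeadingSurrogate(line[column - 1]) && isTrailingSurrogate(line[column]));
}
```

Under this assumption, a line that starts with U+1F600 has valid columns 0 and 2 but not 1, which matches the status bar jumping over one column when the cursor moves across the glyph.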

If you really think there is an issue, then please discuss with Qt developers on the Qt developer mailing lists. Kate won't change.
Comment 7 RJVB 2019-03-29 10:12:08 UTC
Dominik Haumann wrote on 20190325::19:11:15 re: "[frameworks-ktexteditor] [Bug 405272] KTextEditor: column count wrong for UTF32"

>There is no bug. And it's also correct that this is counted as multiple
>characters. With input methods, these characters can probably even be edited.
>
>If you really think there is an issue, then please discuss with Qt developers
>on the Qt developer mailing lists. Kate won't change.

https://bugreports.qt.io/browse/QTBUG-74725?focusedCommentId=453873&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-453873