Bug 105538 - KHTMLPart misbehaves when rendering block UTF-8 R-T-L strings
Status: RESOLVED WORKSFORME
Alias: None
Product: konqueror
Classification: Applications
Component: khtml
Version: unspecified
Platform: openSUSE Linux
Importance: NOR normal
Target Milestone: ---
Assignee: Konqueror Developers
URL:
Keywords: rtl
Depends on:
Blocks:
 
Reported: 2005-05-12 16:39 UTC by Abdalla Alothman
Modified: 2020-11-08 06:11 UTC

See Also:
Latest Commit:
Version Fixed In:


Attachments
the rendering of the text on kubuntu 6.10 (32.19 KB, image/png)
2007-04-28 00:50 UTC, Diego Iastrubni

Description Abdalla Alothman 2005-05-12 16:39:58 UTC
Version:            (using KDE 3.4.0)
Installed from:    SuSE RPMs
OS:                Linux

When rendering right-to-left Unicode characters with
diacritical marks, both KHTMLPart and Konqueror fail to
display lines properly. If the last word in the line contains
a UTF-8 character with a diacritical mark, that character is
displaced: it is placed at the beginning of the line.

Example:

this is a test

becomes:

t this is a tes

This only happens when the line above is a UTF-8 RTL string and
the last character contains a diacritical mark.

NOTE: Since this bug has been experienced within Arabic lines
only, it is worth mentioning that an Arabic Unicode character
with a diacritical mark = character + diacritical mark
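
The "character + diacritical mark" structure can be inspected directly. The following is a minimal sketch, assuming Python 3 and its standard unicodedata module; the helper names code_points and ends_with_mark and the sample strings are hypothetical and only illustrate the condition described above (a line whose last code point is a diacritical mark).

```python
import unicodedata

def code_points(text):
    """List each code point with its Unicode name and general category."""
    return [(f"U+{ord(ch):04X}", unicodedata.name(ch), unicodedata.category(ch))
            for ch in text]

def ends_with_mark(line):
    """True if the last code point of the line is a nonspacing mark (category Mn)."""
    stripped = line.rstrip()
    return bool(stripped) and unicodedata.category(stripped[-1]) == "Mn"

# Hypothetical samples: ARABIC LETTER LAM with and without a following ARABIC FATHA.
with_mark = "\u0644\u064E"
without_mark = "\u0644"

for row in code_points(with_mark):
    print(row)
print(ends_with_mark(with_mark), ends_with_mark(without_mark))
```

The letter and its mark show up as two separate code points, so a line ending in a marked letter ends in a nonspacing mark rather than in a letter.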

This problem did not exist before KDE 3.4; there the text
was rendered without any problems.

TIA.
Comment 1 Abdalla Alothman 2005-05-12 16:55:59 UTC
I forgot to mention the following test case:

When writing the RTL UTF-8 into the KHTMLPart,
the problem will remain unless the HTML refers
to a CSS stylesheet that would include:

direction:rtl

So instead of relying on the KHTMLPart to take
care of the text flow problem, the problem becomes
a mixed responsibility between the KHTMLPart and
the CSS stylesheet.

Comment 2 Philip Rodrigues 2006-09-02 18:23:13 UTC
Can you attach a testcase that shows the problem please?
Comment 3 Abdalla Alothman 2006-09-03 04:20:35 UTC
Philip,

That was a long time ago.  I was just beginning to use the KDE API libraries,
and KDE in general. It appears handling RTL characters is totally different
from Gnome. I will give you the following text:

إن هذا لمكر مكرتموه
If you can't see the text properly, you need to set the Encoding of this
message to UTF-8 in your Email client.

You should test the following text in

1. Konqueror:
  [a] by just surrounding it with <p></p> tags:
  <p>إن هذا لمكر مكرتموه</p>
  [b] by adding <p style="direction:rtl;">إن هذا لمكر مكرتموه</p> (this is the way I fix it).

2. Kate or Kwrite: It becomes a real mess if you try to move the cursor on the
    text. A challenge: try to copy the text with the tags in kate, and try deleting
    the first occurrence of م (the eighth character in the string) in a natural way.

3. Konsole: a real mess. You have to set BIDI in Konsole if you wish to do some
    testing. Testing ideas in Konsole:
    [a] Paste the text on the command line, and just observe.
    [b] Hit enter.
    [c] Either refresh the window or start a new tab, go to it, and then
        come back to the first tab where you pressed enter.

4. To see how it should really appear, paste it in "gedit," the Gnome editor.

Now, try to add diacritical marks onto the sample text:

إِنً هَذَاَ لَمْكرٍ مَكْرتموُهُ

and see if you get any changes. I do get changes. The application I worked on
at that time was a KHTML part that received text from PostgreSQL. Everything
is fine if I add the style=direction:rtl;
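
A test page for this comparison could be generated along the following lines. This is only a sketch, assuming Python 3; the function name build_test_page and the page title are hypothetical, and the two sample strings are the ones quoted above. Each sample is emitted twice, once in a plain <p> and once with style="direction:rtl;", so the two renderings can be compared side by side.

```python
from html import escape

# Sample strings as quoted in this report: without and with diacritical marks.
PLAIN = "إن هذا لمكر مكرتموه"
MARKED = "إِنً هَذَاَ لَمْكرٍ مَكْرتموُهُ"

def build_test_page(samples):
    """Return an HTML page showing each sample with and without direction:rtl."""
    rows = []
    for text in samples:
        safe = escape(text)
        rows.append(f"<p>{safe}</p>")
        rows.append(f'<p style="direction:rtl;">{safe}</p>')
    body = "\n".join(rows)
    return (
        "<!DOCTYPE html>\n<html>\n<head>\n"
        '<meta charset="utf-8">\n<title>RTL test</title>\n</head>\n'
        f"<body>\n{body}\n</body>\n</html>\n"
    )

page = build_test_page([PLAIN, MARKED])
print(page)
```

Redirecting the output to a file and opening that file in Konqueror gives four paragraphs to compare.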

It shouldn't really misbehave when adding diacritical marks because a
diacritical mark is nothing but an additional character that the user
adds after the letter. It is totally different from European characters
with diacritical marks which are usually typed in a single key stroke.
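
The difference from European accented letters can be illustrated with Unicode normalization. The following is a minimal sketch, assuming Python 3 and its standard unicodedata module; the helper name nfc_length and both sample strings are hypothetical. Under NFC, "e" plus a combining acute accent composes into a single precomposed code point, while an Arabic letter followed by a fatha has no precomposed form and stays as two code points.

```python
import unicodedata

# Hypothetical samples, written as escapes so the code points are explicit.
latin = "e\u0301"        # LATIN SMALL LETTER E + COMBINING ACUTE ACCENT
arabic = "\u0644\u064E"  # ARABIC LETTER LAM + ARABIC FATHA

def nfc_length(text):
    """Number of code points left after NFC (canonical composition)."""
    return len(unicodedata.normalize("NFC", text))

print(nfc_length(latin))
print(nfc_length(arabic))
```

A renderer therefore always receives the Arabic mark as a separate code point after its letter, whichever normalization form the text is in.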

The interesting part is that we don't find this problem in a textfield
(lineedit) or a ComboBox's LineEdit.

Since there seems to be a problem with RTL not just in KHTMLPart, I
assume this problem is in KDE in general.

Thank you,
Abdalla Alothman

Comment 4 Philip Rodrigues 2006-09-03 22:10:16 UTC
Thanks for the details, Abdalla. I'll have to leave this to someone familiar with RTL text handling.
Comment 5 Martin Fitzpatrick 2007-01-07 20:45:30 UTC
I can confirm (KDE 3.5.5 / Kubuntu 6.10) the weird behaviour.

Copying the text given by the submitter ( <p style="direction:rtl;">إن هذا لمكر مكرتموه</p> ) and then attempting to delete any character from the RTL text results in a different character being deleted.  For example, if you delete the 2nd character from the left, the 2nd from the right is removed instead.

When using the cursor keys in a Konqueror text box to move through the text, the cursor jumps to the right-hand side of the block of RTL text; right-cursor then moves left through the text until the end (left-hand side), after which it jumps to the end and continues.  It is difficult to decide whether this is intended behaviour: in Kate the cursors move correctly through the text.
Comment 6 Diego Iastrubni 2007-04-28 00:49:17 UTC
Kate does not support BIDI in the KDE3 branch. In KDE4 I see an improvement, but I would not speak about it yet.

We (the Arabic-speaking developer/user and the Hebrew-speaking developers/users) disagree about the usage of BIDI in the console. Let's ignore this as well.

I am probably missing something, but this does look OK on my system (KDE 3.5.5 from Kubuntu 6.10).

Abdalla, I am attaching a screenshot of my system. If this looks OK to you (my Arabic really stinks, sorry...) we can close this bug.
Comment 7 Diego Iastrubni 2007-04-28 00:50:21 UTC
Created attachment 20433
the rendering of the text on kubuntu 6.10
Comment 8 Yaron Shahrabani 2020-11-08 06:11:40 UTC
Falkon handles RTL pretty well in KDE 5.20.2, works for me.