Bug 47171 - Mixed Hebrew & Latin text in the same paragraph - no user control over ordering
Summary: Mixed Hebrew & Latin text in the same paragraph - no user control over ordering
Alias: None
Product: kword
Classification: Unclassified
Component: general
Version: unspecified
Platform: RedHat RPMs Linux
Importance: NOR wishlist with 99 votes
Target Milestone: ---
Assignee: Thomas Zander
Depends on:
Reported: 2002-08-29 14:48 UTC by Levy, Chen
Modified: 2007-07-28 13:56 UTC
CC List: 2 users

See Also:
Latest Commit:
Version Fixed In:


Description Bugzilla Maintainers 2002-08-29 14:41:04 UTC
(*** This bug was imported into bugs.kde.org ***)

Package:           kword
Version:           1.2rc1 (using KDE 3.0.3 )
Severity:          wishlist
Installed from:    RedHat RPMs
Compiler:          gcc-2.96
OS:                Linux
OS/Compiler notes: RedHat 7.3 KDE 3.0.3 (official RPMS) KOffice 1.2rc1 (from RPM)

To denote Hebrew text I will use the two Hebrew words "SHALOM OLAM", which mean "hello world".
When indicating text entry I will use the Latin (logical) ordering, i.e. SHALOM OLAM.
When indicating text display I will use the Hebrew (visual) ordering, i.e. MALO MOLAHS.

General Description:
The paragraph's direction follows the general paradigm of paragraph order, i.e. the first non-neutral character determines the paragraph order, and there is no way for the user to override this behavior.
This introduces the following problems:

How to Reproduce:
Using Israeli keyboard layout and Hebrew fonts (iso8859-8-i) with kword:

The simple case (no punctuation):
so writing:  hello world SHALOM OLAM
produces:    hello world MALO MOLAHS

and writing: SHALOM OLAM hello world
produces:    hello world MALO MOLAHS
as well (only the default justification is different).
This problem is compounded if we use punctuation:
Writing:     hello world! SHALOM OLAM!
produces:    hello world! MALO MOLAHS!
(note that the only thing wrong with this picture is the right-most '!' sign.)

Writing:     SHALOM OLAM! hello world!
Produces:    !hello world !MALO MOLAHS
(note that now the only thing wrong with this picture is the left-most '!' sign.)
Justifying the text to the right or left does not alter this text ordering.

This behavior seems perfectly good for a simple text editor, but it just won't do for a full-fledged word processor.
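
For reference, the first-non-neutral-character rule described above can be sketched in a few lines of Python. This is only an illustration of the rule, not KWord's actual code: the function name `paragraph_direction` and the LTR fallback for a paragraph with no strong character are assumptions made for the example. The Hebrew sample is the real "shalom olam", written as escapes so that the listing itself is not reordered on display.

```python
import unicodedata

# "shalom olam" in real Hebrew letters, written as escapes.
SHALOM_OLAM = "\u05e9\u05dc\u05d5\u05dd \u05e2\u05d5\u05dc\u05dd"


def paragraph_direction(text, default="ltr"):
    """Guess the paragraph direction from the first strong character."""
    for ch in text:
        bidi_class = unicodedata.bidirectional(ch)
        if bidi_class == "L":            # strong left-to-right (Latin, ...)
            return "ltr"
        if bidi_class in ("R", "AL"):    # strong right-to-left (Hebrew, Arabic)
            return "rtl"
    return default                       # only neutrals found: assumed fallback


print(paragraph_direction("hello world " + SHALOM_OLAM))   # ltr
print(paragraph_direction(SHALOM_OLAM + " hello world"))   # rtl
```

Punctuation and spaces are neutral, so they never influence the result; only the first letter does, which is exactly why the user has no say in the matter.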

Expected Behavior:
Let's introduce a new notation:
<H>   denotes an invisible RTL character.
<E>   denotes an invisible LTR character.

The simple case (no punctuation):
Writing:     <H>hello world SHALOM OLAM
Produces:    MALO MOLAHS hello world

Writing:     <E>SHALOM OLAM hello world
Produces:    MALO MOLAHS hello world

The complex case (with punctuation):
Neutral characters following an LTR character should be LTR.
Neutral characters following an RTL character should be RTL,
regardless of the paragraph ordering, and so:

Writing:     hello world! SHALOM OLAM!
or writing:  SHALOM OLAM! hello world!
Produces:    hello world! !MALO MOLAHS
(note that the default justification will be different)

Writing:     hello world! SHALOM OLAM<E>!
Produces:    hello world! MALO MOLAHS!

Writing:     SHALOM OLAM! hello world<H>!
Produces:    !hello world !MALO MOLAHS
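
The complex-case rule above can be checked with a toy model in Python. This is a sketch under loud assumptions, not a bidi implementation: it uses this report's notation (UPPERCASE stands in for Hebrew, lowercase for Latin, everything else is neutral), it represents <E>/<H> by U+200E/U+200F, and the names `strong_direction` and `proposed_display` are made up for the example. It only models the punctuation cases listed above; it does not attempt the full Unicode algorithm.

```python
LRM = "\u200e"  # stands for <E> in this report
RLM = "\u200f"  # stands for <H> in this report


def strong_direction(ch):
    """Return 'L', 'R', or None (neutral) in the report's toy notation."""
    if ch == LRM or ch.islower():
        return "L"
    if ch == RLM or ch.isupper():
        return "R"
    return None


def proposed_display(text):
    """Visual order under the rule: a neutral follows the preceding strong char."""
    # Paragraph direction: first strong character, LTR if there is none.
    base = next((d for d in map(strong_direction, text) if d), "L")
    runs = []           # list of [direction, characters] in logical order
    current = base
    for ch in text:
        current = strong_direction(ch) or current
        if runs and runs[-1][0] == current:
            runs[-1][1] += ch
        else:
            runs.append([current, ch])
    if base == "R":     # in an RTL paragraph the runs are laid out right to left
        runs.reverse()
    visual = "".join(s[::-1] if d == "R" else s for d, s in runs)
    return visual.replace(LRM, "").replace(RLM, "")   # the marks are invisible


print(proposed_display("hello world! SHALOM OLAM!"))
print(proposed_display("SHALOM OLAM! hello world" + RLM + "!"))
```

Run on the four inputs of the complex case, this model gives the four "Produces:" lines listed above.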

The user should have the following controls (via keyboard and toolbar):
* Insert <E>/<H> at beginning of line. (see simple case)
* Insert <E>/<H> at current position. (see complex case)

This mode of operation will be familiar to the Israeli user because MS-Word works similarly.

Related bugs:
#41829  bidi bullets and numbering placement error
#42556  incorrect justification of enumerated/bullet lists in Hebrew
#46018  incorrect hebrew layout of numbered or bulleted lists in kword

(Submitted via bugs.kde.org)
Comment 1 Levy, Chen 2004-05-11 13:09:18 UTC
This bug (wishlist) is still current in KOffice 1.3.1 (on KDE 3.2.2)

Some insight can be gleaned from the discussion on the OpenOffice.org's issuezilla, namely on:

[Issue 18024]  Direction of weak characters: A new method for dealing with text direction without using keyboard layout


[Issue 27174]  Immitating eLaTex L2R/R2L parenthesis, to bound L2R text

It is worth noting that OpenOffice.org deals better than KOffice with right-to-left languages such as Arabic and Hebrew, thanks to a UI control that lets the user select the paragraph directionality (R2L/L2R). Note also that while this solves most of the BiDi problems, it does not solve all of them, hence the above issues.

(This note is a response to the "Please review your entries at bugs.kde.org to help us" notification I got by e-mail)
Comment 2 Behdad Esfahbod 2004-06-05 03:25:09 UTC
Well, such invisible LTR and RTL characters do exist in Unicode as U+200E LEFT-TO-RIGHT MARK and U+200F RIGHT-TO-LEFT MARK.  So you need a mechanism to insert them.  In Windows and GTK+, you can select them from the popup menu.
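
As a quick illustration of those two code points, the sketch below looks them up in Python's Unicode database and inserts one at a cursor position. The helper name `insert_mark` is hypothetical and only stands in for whatever insertion mechanism the application would provide.

```python
import unicodedata

LRM = "\u200e"
RLM = "\u200f"

for mark in (LRM, RLM):
    # Both are invisible format characters (category Cf) with a strong direction.
    print(
        "U+%04X" % ord(mark),
        unicodedata.name(mark),
        unicodedata.category(mark),
        unicodedata.bidirectional(mark),
    )


def insert_mark(text, position, mark):
    """Insert an LRM or RLM into text at the given cursor position."""
    if mark not in (LRM, RLM):
        raise ValueError("expected U+200E or U+200F")
    return text[:position] + mark + text[position:]


# Put an RLM just before the final '!' of a Latin phrase.
marked = insert_mark("hello world!", len("hello world"), RLM)
print(len("hello world!"), len(marked))
```

The marked string is one character longer, although nothing extra is visible when it is displayed.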
Comment 3 Levy, Chen 2007-04-30 14:34:31 UTC
This bug is nominally solved.

Using the Hebrew keyboard with the LyX variant:
* Shift+Tet (the Hebrew character that on a QWERTY keyboard shares a key with Y) is mapped to the RLM Unicode code point
* Shift+Aleph (on QWERTY with T) is mapped to the LRM Unicode code point.

So nominally, the user has all the control needed over directionality.

However, this implementation is far from perfect:
* This knowledge is obscure. Virtually no one knows of it, and it is not discoverable by the user. Even a determined user will not learn of this feature by simply poking around in the UI.
* It is not convenient even once you know of it. To insert an LRM/RLM character you first need to switch the keyboard to Hebrew mode, then enter the LRM/RLM mark, and then switch to the language you wish to write in.
* Once an RLM/LRM resides in the document, the user must remember where it is located. There is no way to make these characters visible so that the user can manipulate them easily.

Proposed changes to fix these points:
* Add the menu entries: `Insert -> Left-Right-Mark` and `Insert -> Right-Left-Mark`
* Add toolbar buttons [<] and [>] that do the same as the menu entries above. (perhaps in a BiDi specific toolbar)
* Assign key bindings to the RLM and LRM that do not depend on the Hebrew LyX keyboard layout, but are provided by kword (or koffice). Use key strokes that do not change between the Hebrew layout and the English layout (e.g. `Ctrl+[` and `Ctrl+]` or `Left-Shift+Ctrl` and `Right-Shift+Ctrl`). Please check that the selected mappings are also available for other right-to-left languages (Arabic and Farsi).
* When invoking the `View -> Formatting Characters` menu entry, the RLM/LRM characters should be considered formatting characters.
* Add a `Formatting Characters` button to the same toolbar as [<] and [>].
* Document this in the help system / startup tip.
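
To make the `Formatting Characters` point concrete, here is a minimal Python sketch of what "showing" the marks could mean. It is not how kword renders anything: the function names and the choice of `<E>`/`<H>` as visible stand-ins (borrowed from the notation in the description) are assumptions made for the example.

```python
LRM = "\u200e"
RLM = "\u200f"

# Visible stand-ins for the invisible marks (assumed, borrowed from the description).
VISIBLE = {LRM: "<E>", RLM: "<H>"}


def show_formatting_marks(text):
    """Return text with every LRM/RLM replaced by a visible placeholder."""
    return "".join(VISIBLE.get(ch, ch) for ch in text)


def find_marks(text):
    """List (index, placeholder) for every LRM/RLM, so that it can be located."""
    return [(i, VISIBLE[ch]) for i, ch in enumerate(text) if ch in VISIBLE]


sample = LRM + "hello world" + RLM + "!"
print(show_formatting_marks(sample))
print(find_marks(sample))
```

With something like this behind the menu entry, the user no longer has to remember where a mark was placed in order to select or delete it.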
Comment 4 Thomas Zander 2007-07-28 13:56:57 UTC
Fixed in 2.0