Bug 83236 - Deleting Arabic text in terminals in response to an interactive command using BACKSPACE will delete the question itself
Status: RESOLVED FIXED
Alias: None
Product: konsole
Classification: Applications
Component: general
Version: unspecified
Platform: Unlisted Binaries Linux
Importance: NOR normal
Target Milestone: ---
Assignee: Konsole Developer
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2004-06-12 00:42 UTC by Munzir Taha
Modified: 2008-05-06 16:34 UTC
Description Munzir Taha 2004-06-12 00:42:33 UTC
Version:            (using KDE KDE 3.2.3)
Installed from:    Unspecified Linux
OS:                Linux

When you run any interactive command in the console (e.g. rm), you are asked
a question such as:

rm: remove regular file `filename'?

If you now switch to Arabic, type some Arabic letters, and press BACKSPACE to
delete them, letters from the question itself are deleted as well. The number of
letters lost from the question equals the number of Arabic letters typed. It looks
as if this is related to Arabic letters being double-byte in the UTF-8 encoding.
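
As a rough illustration of the byte/character mismatch suspected above, the following C++ sketch counts bytes and code points in a short Arabic string. The names utf8CodePoints and kThreeNoons are made up for this example and do not come from konsole or any terminal source.

```cpp
#include <cstddef>
#include <string>

// Count UTF-8 code points by skipping continuation bytes (bit pattern 10xxxxxx).
std::size_t utf8CodePoints(const std::string& s) {
    std::size_t n = 0;
    for (unsigned char c : s) {
        if ((c & 0xC0) != 0x80) ++n;  // lead byte or ASCII byte starts a new character
    }
    return n;
}

// Three Arabic letters NOON (U+0646); each one is encoded as the two bytes D9 86.
const std::string kThreeNoons = "\xD9\x86\xD9\x86\xD9\x86";
```

Here kThreeNoons.size() is the number of bytes while utf8CodePoints(kThreeNoons) is the number of letters, so anything that counts one BACKSPACE per byte sees twice as many "characters" as were typed.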
Comment 1 Waldo Bastian 2004-06-13 15:39:44 UTC
Does the Arabic text run from right to left? That's not handled very well at the moment.
Comment 2 Munzir Taha 2004-06-13 23:29:30 UTC
Yes, it runs from right to left. This problem is also in konsole, xterm, mlterm, gnome-terminal.

Pablo Saratxaga of Mandrake on http://qa.mandrakesoft.com/show_bug.cgi?id=5645
mentioned: "It is probably tied to libreadline (it isn't related to terminals)"

Can anyone confirm? Where do we need to file this bug then?
Comment 4 Munzir Taha 2004-06-20 12:36:18 UTC
I received a response from Mr. Chet Ramey, the maintainer of the GNU Readline library.
He told me:
"The message is coming from `rm'.  I don't have anything to do with the
distributions in question, or the `rm' source.  It seems to me that the
burden of proving that readline is at fault is the distributor -- maybe
by demonstrating that `rm' is linked with libreadline and that it is
calling readline() when it is getting its response from the terminal.
Simply asserting that readline is responsible is insufficient."

Please change the status of this bug to REOPENED
Comment 5 Waldo Bastian 2004-06-23 12:29:48 UTC
KDE does not make distributions and rm is not a KDE application, so there is little point in reopening this bug report here.
Comment 6 Munzir Taha Obeid 2004-06-23 17:17:11 UTC
Mr. Waldo, it's not a matter of the rm command only. It's a matter of rm, cp, mv, ... and all commands that ask interactive questions in konsole. This is why I believe it's a konsole bug, not a command bug. Am I still wrong?
Comment 7 Ken Deeter 2004-06-23 18:10:12 UTC
Could this be related to wcwidth again? I don't know how Arabic terminals are supposed to work, but the symptoms described in the original report remind me of the double-width/single-width CJK bug.

Does it happen also if say, at a prompt you type some characters, and then delete them.. does the prompt string get partially deleted as well?
Comment 8 Waldo Bastian 2004-06-23 18:14:01 UTC
It's unlikely to be a konsole bug if the problem also happens with xterm.
Comment 9 Ken Deeter 2004-06-23 18:51:53 UTC
oh right, didn't see that, sorry
Comment 10 Munzir Taha 2004-06-24 11:10:11 UTC
>Does it happen also if say, at a prompt you type some characters, and then
>delete them.. does the prompt string get partially deleted as well?
No.
 
I also want to add that launching konsole without UTF-8 support:
LC_ALL=en_US.ISO-8859-1 konsole
won't produce this bug. Doesn't this mean the bug is in konsole's handling of Unicode characters? Can't konsole and xterm have the same bug, though?
Comment 11 Munzir Taha 2004-06-24 16:27:24 UTC
Another example: if I want to install a package with urpmi and it has a dependency, I will be asked

Is this OK? (Y/n)
If I type Arabic text and press BACKSPACE, part or all of the question is erased. I hope you can, at least, tell me which library this bug belongs to.
Comment 12 Munzir Taha 2004-06-24 21:53:53 UTC
>Does it happen also if say, at a prompt you type some characters, and then 
>delete them.. does the prompt string get partially deleted as well?
Yes! it happens sometimes in konsole (but not xterm ;) )
It happens if I keep pressing an English letter (say k) while switching the keyboard layout between Arabic and English many times (using Alt+Shift, for example). The prompt gets deleted. Actually, the cursor goes to the left and then up in a very strange way. Try something like this:

$kkkkkkنننننننننننننkkkkkkkkkkنننننننننننننkkkkkkkkk
and see this easter-egg in konsole only today ;)
Comment 13 Munzir Taha 2004-06-24 22:07:16 UTC
I will open it again since it's now a konsole-specific bug.
Comment 14 Munzir Taha 2004-06-25 04:01:59 UTC
Another way to reproduce this strange bug is:
- Open konsole
- Press Enter many times (just to make things look clear)
- Switch to Arabic
- Hold down any Arabic letter (say the one on the J key), then press Alt and release it.
- Keep holding the Arabic letter until the text reaches the edge of the screen.

You will see the cursor jump around wildly.


Comment 15 Ken Deeter 2004-06-26 09:11:54 UTC
Hi, I don't understand Arabic at all, but I was able to reproduce your behaviour..

Waldo, maybe you can try this:

* Enable arabic keyboard in control center
* run a konsole in en_US.UTF-8 locale
* set locale to en_US.UTF-8 in case your shell overrides it
* turn on arabic keyboard
* start hitting "alt+j".. actually alt and any key.

If you keep hitting that combination, the cursor starts jumping backwards.

What I don't know is: does the alt+key combination mean something specific for Arabic input? If you try it on the English keyboard, it doesn't do anything.
Comment 16 Munzir Taha 2004-06-26 18:26:30 UTC
No, Alt+ArabicKey doesn't mean anything specific for Arabic input unless it's defined in the xorg keymap to mean something, which is not the case.
Comment 17 Ken Deeter 2004-06-26 21:20:48 UTC
> ------- Additional Comments From munzirtaha newhorizons com sa 
> 2004-06-26 18:26 ------- no, Alt+ArabicKey doesn't mean any thing
> specific for Arabic input unless it's defined in xorg keymap to mean
> something which is not the case.
> 

Well, just to make sure, could you switch to the English keyboard, type
alt+j, and verify that nothing happens?

Comment 18 Munzir Taha 2004-06-26 21:45:12 UTC
Ken, first, thanks for the way you reproduced the bug. I just checked it now. It's easier and more direct than my steps.

Second, you are right, of course: alt+j on an English keyboard doesn't cause any problem.
Comment 19 Egmont Koblinger 2004-11-12 19:28:33 UTC
I guess this has nothing to do with Arabic. I could even reproduce it with
Hungarian, which is a Latin left-to-right language.

The bug arises in cooked tty mode, where the kernel(?) has to be told that
UTF-8 is in use so that the terminal driver knows to remove more than one byte
from the buffer when backspace is pressed.

How to reproduce No1:
bash$ cat
type for example xyzáé, press one or two backspaces and then Enter. The line
echoed back clearly shows that each backspace removes only one byte and not
a whole multibyte character.

How to reproduce No2:
bash$ echo -n 'someprompt>' ; cat
Now type some non-ASCII characters (accented letters, euro sign etc.) and then
press backspace many times. Backspace erases one character on screen for every
byte (not character!) that was typed, so the erasing runs on into the prompt.
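
The two behaviours described in these reproductions can be sketched in C++ as follows. This is only a model of what the tty line editing appears to do, not kernel code; eraseByte and eraseUtf8Char are hypothetical names chosen for the illustration.

```cpp
#include <string>

// Byte-wise erase: models a tty that is unaware of UTF-8, so each
// BACKSPACE removes exactly one byte from the pending input line.
void eraseByte(std::string& line) {
    if (!line.empty()) line.pop_back();
}

// Character-wise erase: first drop any trailing continuation bytes
// (bit pattern 10xxxxxx), then drop the lead byte of the character.
void eraseUtf8Char(std::string& line) {
    while (!line.empty() &&
           (static_cast<unsigned char>(line.back()) & 0xC0) == 0x80) {
        line.pop_back();
    }
    if (!line.empty()) line.pop_back();
}
```

With the input "xyzáé" (the last two letters are two bytes each), one call to eraseByte leaves a dangling lead byte of "é" in the buffer, while one call to eraseUtf8Char removes the whole "é".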

AFAIK the kernel maintains a bit for each terminal telling whether that one
is in UTF-8 mode. The problem is that konsole doesn't set this bit.

There's a patch to stty here:
ftp://ftp.ilog.fr/pub/Users/haible/utf8/
and it's also there inside SUSE 9.2 coreutils source rpm.

If you apply this patch then "stty -a" shows "-iutf8" which means input utf8
mode is turned off. Type "stty iutf8" to turn it on. The bugs discussed above
will disappear then.

Please check http://www.cl.cam.ac.uk/~mgk25/unicode.html (Markus Kuhn's famous
UTF-8 FAQ) and read the paragraph starting with "Any Unix-style kernel". It says
only Linux kernels >= 2.6 have this iutf8 feature, so there may be problems with
other kernels and other Unix systems. So IMHO konsole should, under
#ifdef IUTF8, try to perform the relevant tcsetattr() (explicitly setting or
explicitly clearing this flag, since no one knows when the kernel's default
will change to UTF-8 mode). Also please note that this iutf8 flag should also
be updated if UTF-8 mode is changed at run time in konsole
(e.g. start an 8-bit terminal and then print \e%G and \e%@).
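
A minimal sketch of the tcsetattr() call suggested here, assuming a POSIX system and a file descriptor that refers to the tty or pty. The function name setTtyUtf8Mode is hypothetical and is not taken from konsole; only tcgetattr(), tcsetattr(), TCSANOW and the IUTF8 flag are real termios names.

```cpp
#include <termios.h>

// Explicitly set or clear the IUTF8 input flag on the terminal behind fd.
// Returns true if the flag was applied; false if IUTF8 is not available
// on this platform or if a termios call failed (e.g. fd is not a tty).
bool setTtyUtf8Mode(int fd, bool on) {
#ifdef IUTF8
    struct termios tio;
    if (tcgetattr(fd, &tio) != 0) return false;
    if (on) {
        tio.c_iflag |= static_cast<tcflag_t>(IUTF8);
    } else {
        tio.c_iflag &= ~static_cast<tcflag_t>(IUTF8);
    }
    return tcsetattr(fd, TCSANOW, &tio) == 0;
#else
    (void)fd;  // unused when the platform has no IUTF8
    (void)on;
    return false;
#endif
}
```

A terminal emulator would presumably call something like this once when the session starts and again whenever its encoding changes at run time.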
Comment 20 Egmont Koblinger 2004-11-12 19:43:28 UTC
Or, another possible workaround if you cannot set the terminal to UTF-8 mode
is to try to set it to non-UTF-8 mode and then work around the whole problem
assuming non-UTF-8 mode. That is, when I press backspace, move only one
character to the left on the screen, but pretend to the real terminal driver
that backspace was hit several times. However, I'm afraid this solution might
lead to unpredictable problems over remote ssh or telnet connections, so I'd
not recommend it for systems that natively support UTF-8 input in their tty
drivers, such as Linux 2.6.
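
For illustration only, the workaround could be sketched in C++ like this. It assumes the emulator remembers the UTF-8 bytes typed on the current line and that the tty's erase character is DEL (0x7F); both are assumptions, and the function names are hypothetical.

```cpp
#include <cstddef>
#include <string>

// Number of bytes occupied by the last UTF-8 character of `typed`
// (0 if the string is empty).
std::size_t lastCharByteLength(const std::string& typed) {
    std::size_t n = 0;
    for (auto it = typed.rbegin(); it != typed.rend(); ++it) {
        ++n;
        // Stop once a non-continuation byte (the lead byte) is reached.
        if ((static_cast<unsigned char>(*it) & 0xC0) != 0x80) break;
    }
    return n;
}

// Bytes to send to a byte-wise tty for ONE backspace keypress:
// one erase byte per byte of the last typed character.
// 0x7F (DEL) as the erase character is an assumption.
std::string eraseBytesForOneKeypress(const std::string& typed,
                                     char eraseChar = '\x7F') {
    return std::string(lastCharByteLength(typed), eraseChar);
}
```

For a euro sign (three bytes in UTF-8) this yields three erase bytes for a single keypress, which also shows why the approach is fragile: it only works if the far end really is in byte-wise mode.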
Comment 21 Egmont Koblinger 2004-11-12 19:44:25 UTC
xterm suffers from the same problem. I've reported this to the x.org folks too:
http://freedesktop.org/bugzilla/show_bug.cgi?id=1841
Comment 22 Waldo Bastian 2005-01-23 18:30:52 UTC
CVS commit by waba: 

Add support for IUTF8
BUG: 83236


  M +5 -0      TEPty.cpp   1.88
  M +1 -0      TEPty.h   1.29
  M +3 -5      TEmulation.cpp   1.61
  M +3 -0      TEmulation.h   1.35
  M +13 -5     konsole.cpp   1.502
  M +2 -0      session.cpp   1.100


--- kdebase/konsole/konsole/TEPty.cpp  #1.87:1.88
@@ -94,4 +94,9 @@ void TEPty::setXonXoff(bool on)
 }
 
+void TEPty::useUtf8(bool on)
+{
+  pty()->setUtf8Mode(on);
+}
+
 /*!
     start the client program.

--- kdebase/konsole/konsole/TEPty.h  #1.28:1.29
@@ -49,4 +49,5 @@ Q_OBJECT
 
   public slots:
+    void useUtf8(bool on);
     void lockPty(bool lock);
     void send_bytes(const char* s, int len);

--- kdebase/konsole/konsole/TEmulation.cpp  #1.60:1.61
@@ -206,13 +206,11 @@ void TEmulation::setCodec(const QTextCod
   delete decoder;
   decoder = m_codec->makeDecoder();
+  emit useUtf8(utf8());
 }
 
 void TEmulation::setCodec(int c)
 {
-  //FIXME: check whether we have to free m_codec
-  m_codec = c ? QTextCodec::codecForName("utf8")
-            : QTextCodec::codecForLocale();
-  delete decoder;
-  decoder = m_codec->makeDecoder();
+  setCodec(c ? QTextCodec::codecForName("utf8")
+           : QTextCodec::codecForLocale());
 }
 

--- kdebase/konsole/konsole/TEmulation.h  #1.34:1.35
@@ -65,4 +65,5 @@ signals:
 
   void lockPty(bool);
+  void useUtf8(bool);
   void sndBlock(const char* txt,int len);
   void ImageSizeChanged(int lines, int columns);
@@ -84,4 +85,6 @@ public:
   bool isConnected() { return connected; }
 
+  bool utf8() { return m_codec->mibEnum() == 106; }
+
   virtual void setListenToKeyPress(bool l);
   void setColumns(int columns);

--- kdebase/konsole/konsole/konsole.cpp  #1.501:1.502
@@ -843,7 +843,14 @@ void Konsole::slotSetEncoding()
   if (!se) return;
 
+  QTextCodec * qtc;
+  if (selectSetEncoding->currentItem() == 0)
+  {
+    qtc = QTextCodec::codecForLocale();
+  }
+  else
+  {
   bool found;
   QString enc = KGlobal::charsets()->encodingForName(selectSetEncoding->currentText());
-  QTextCodec * qtc = KGlobal::charsets()->codecForName(enc, found);
+    qtc = KGlobal::charsets()->codecForName(enc, found);
   if(!found)
   {
@@ -851,4 +858,5 @@ void Konsole::slotSetEncoding()
     qtc = QTextCodec::codecForLocale();
   }
+  }
 
   se->setEncodingNo(selectSetEncoding->currentItem());

--- kdebase/konsole/konsole/session.cpp  #1.99:1.100
@@ -71,4 +71,5 @@ TESession::TESession(TEWidget* _te, cons
   //kdDebug(1211)<<"TESession ctor() sh->setSize()"<<endl;
   sh->setSize(te->Lines(),te->Columns()); // not absolutely nessesary
+  sh->useUtf8(em->utf8());
   //kdDebug(1211)<<"TESession ctor() connecting"<<endl;
   connect( sh,SIGNAL(block_in(const char*,int)),this,SLOT(onRcvBlock(const char*,int)) );
@@ -76,4 +77,5 @@ TESession::TESession(TEWidget* _te, cons
   connect( em,SIGNAL(sndBlock(const char*,int)),sh,SLOT(send_bytes(const char*,int)) );
   connect( em,SIGNAL(lockPty(bool)),sh,SLOT(lockPty(bool)) );
+  connect( em,SIGNAL(useUtf8(bool)),sh,SLOT(useUtf8(bool)) );
 
   connect( em, SIGNAL( changeTitle( int, const QString & ) ),


Comment 23 Munzir Taha 2005-01-24 17:44:07 UTC
Waldo Bastian,
Lots of thanks and keep up the good work!
I am dreaming of the day when all the bugs at http://wiki.arabeyes.org/OpenBugs
are fixed.
Comment 24 Munzir Taha 2005-01-24 17:47:25 UTC
Especially since the GNOME bugs seem to be far fewer. Would this mean something? ;)