Bug 87481 - Konsole freezes on startup when unable to allocate PTY
Status: RESOLVED FIXED
Alias: None
Product: konsole
Classification: Unclassified
Component: general
Version: unspecified
Platform: Gentoo Packages Linux
Importance: NOR normal
Target Milestone: ---
Assignee: Konsole Developer
URL:
Keywords:
Duplicates: 99618
Depends on:
Blocks:
 
Reported: 2004-08-19 02:53 UTC by Adam
Modified: 2005-04-11 06:45 UTC

See Also:
Latest Commit:
Version Fixed In:


Attachments
Listing of pty devices (11.25 KB, text/plain)
2004-08-19 16:37 UTC, Adam

Description Adam 2004-08-19 02:53:56 UTC
Version:            (using KDE 3.2.3)
Installed from:    Gentoo Packages
Compiler:          gcc (GCC) 3.3.3 20040412 (Gentoo Linux 3.3.3-r6, ssp-3.3.2-2, pie-8.7.6) 
OS:                Linux

Sometimes, when I start Konsole, the window comes up, the background theme loads properly, the cursor appears, but my prompt doesn't appear.  Pushing keys has no effect.  The menubar at the top still works.

Note that this only happens sometimes - maybe about 25% of the time.

I also have my titlebar showing extra information by setting $PS1.  That information doesn't appear - the titlebar only says "Shell - Konsole".

This has only started recently.  It may have to do with me upgrading software (I use Gentoo and just update everything that has updates available).  My kdebase is now 3.2.3-r1 and was previously 3.2.3; I'm not exactly sure what the -r1 means but I instinctively guess the code is the same and it's the package that's different.  I also recently switched from devfs to udev, although I have no clue why that would affect Konsole.
Comment 1 Waldo Bastian 2004-08-19 11:45:00 UTC
Konsole needs to allocate a pty device to communicate with the command shell; from the symptoms you describe, it seems that this allocation fails. Most probably related to udev indeed.
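
For readers unfamiliar with PTY allocation, here is a minimal sketch of the idea using the POSIX calls posix_openpt(), grantpt(), unlockpt() and ptsname(). It is not Konsole's own code; the PtyResult type and the openPty() function are hypothetical names chosen for illustration. The point is that every step can fail, and the caller gets a reason back instead of a silent hang.

```cpp
#include <cassert>
#include <cerrno>
#include <cstdlib>     // posix_openpt, grantpt, unlockpt, ptsname
#include <cstring>     // strerror
#include <fcntl.h>     // O_RDWR, O_NOCTTY
#include <string>
#include <unistd.h>    // close

// Outcome of one allocation attempt (hypothetical type, illustration only).
struct PtyResult {
    int masterFd = -1;       // valid only when error is empty
    std::string slaveName;   // path of the slave device reported by ptsname()
    std::string error;       // human-readable reason when allocation failed
};

// Try to allocate a PTY master and report why it failed, if it did.
PtyResult openPty()
{
    PtyResult r;

    int fd = posix_openpt(O_RDWR | O_NOCTTY);
    if (fd < 0) {
        r.error = std::string("posix_openpt failed: ") + std::strerror(errno);
        return r;
    }

    // Fix up ownership/permissions of the slave and unlock it for opening.
    if (grantpt(fd) != 0 || unlockpt(fd) != 0) {
        r.error = std::string("grantpt/unlockpt failed: ") + std::strerror(errno);
        close(fd);
        return r;
    }

    const char* name = ptsname(fd);
    if (name == nullptr) {
        r.error = "ptsname failed";
        close(fd);
        return r;
    }

    r.masterFd = fd;
    r.slaveName = name;
    assert(r.error.empty());  // postcondition: success carries no error text
    return r;
}
```

A caller would check whether error is non-empty, show that text to the user and stop, rather than continuing with a session that has no shell attached.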
Comment 2 Adam 2004-08-19 16:37:54 UTC
Created attachment 7180
Listing of pty devices
Comment 3 Adam 2004-08-19 16:42:34 UTC
When using udev in Gentoo, it creates tons of devices in /dev, and there are 256 pty devices, which makes me wonder how it couldn't allocate one - or maybe the fact that there are so many is part of the problem.  I've attached the list of pty devices that Gentoo creates in case it's useful.  I disabled that functionality of Gentoo, and now there are no pty devices in /dev - and I haven't been able to reproduce the problem since.

I wonder - if the problem is really that it can't allocate a pty device, shouldn't it print an error message and exit rather than freeze?
Comment 4 Kurt Hindenburg 2004-08-19 17:35:32 UTC
Strange, I've been running Gentoo with udev for several months w/o any problems with konsole.  Note that I run KDE CVS, not the ebuilds.  I also don't know where you got that list of ptys... I wonder if we have different kernel versions/configs???

% ls /dev/
adsp@     hda7      loop0@  ram10@  sndstat@  tty20@  tty38@  tty55@   vcs2@
agpgart@  hdb       loop1@  ram11@  sound/    tty21@  tty39@  tty56@   vcs3@
audio@    hdb1      loop2@  ram12@  stderr@   tty22@  tty4@   tty57@   vcs4@
cdroms/   hdb2      loop3@  ram13@  stdin@    tty23@  tty40@  tty58@   vcs5@
console   hdc       loop4@  ram14@  stdout@   tty24@  tty41@  tty59@   vcs6@
core@     hdd       loop5@  ram15@  tty       tty25@  tty42@  tty6@    vcs7@
discs/    i2c/      loop6@  ram2@   tty0@     tty26@  tty43@  tty60@   vcsa@
dsp@      i2c-0@    loop7@  ram3@   tty1@     tty27@  tty44@  tty61@   vcsa1@
fd@       i2c-1@    lp0     ram4@   tty10@    tty28@  tty45@  tty62@   vcsa2@
fd0@      i2c-2@    mem     ram5@   tty11@    tty29@  tty46@  tty63@   vcsa3@
floppy/   i2c-3@    misc/   ram6@   tty12@    tty3@   tty47@  tty7@    vcsa4@
full      i2c-4@    mixer@  ram7@   tty13@    tty30@  tty48@  tty8@    vcsa5@
hda       ide/      null    ram8@   tty14@    tty31@  tty49@  tty9@    vcsa6@
hda1      initctl|  port    ram9@   tty15@    tty32@  tty5@   urandom  vcsa7@
hda2      input/    psaux@  random  tty16@    tty33@  tty50@  v4l/     zero
hda3      kmem      ptmx    rd/     tty17@    tty34@  tty51@  vc/
hda4      kmsg      pts/    rtc@    tty18@    tty35@  tty52@  vcc/
hda5      log=      ram0@   shm/    tty19@    tty36@  tty53@  vcs@
hda6      loop/     ram1@   snd/    tty2@     tty37@  tty54@  vcs1@

% ls /dev/pts
0  1  2  3  4

A possible idea is to start konsole from another terminal (xterm) and see what the output from konsole is.
Comment 5 Adam 2004-08-19 17:45:02 UTC
Kurt - it looks like you're using udev in Gentoo without using the default of creating the extra nodes.  You must have RC_DEVICE_TARBALL in /etc/conf.d/rc set to "no", whereas I had it set to "yes" until recently.

> A possible idea is to start konsole from another terminal (xterm) and see what the output from konsole is. 

That's a good idea - unfortunately, as I said, the bug only happens sometimes, so I'd have to keep trying to get it to happen.
Comment 6 Waldo Bastian 2004-08-19 17:53:50 UTC
Adam: Yes, I agree, it should not freeze.

Can you attach gdb from another konsole or xterm when it freezes and post the backtrace? What does "ps aux" have to say about the process when it freezes?
Comment 7 Kurt Hindenburg 2004-08-19 18:10:33 UTC
Yea, RC_DEVICE_TARBALL="no"

Do you still have the problem when this is set to NO?
Comment 8 Adam 2004-08-19 21:30:03 UTC
> Yea, RC_DEVICE_TARBALL="no"

> Do you still have the problem when this is set to NO?

No, as I said previously - at least I haven't been able to reproduce it.

> Can you attach gdb from another konsole or xterm when it freezes and post the backtrace? What does "ps aux" have to say about the process when it freezes?

I suppose I could try these; I'll have to reboot before I can make the bug happen again.
Comment 9 Adam 2004-08-19 21:51:11 UTC
Sorry, even though I set RC_DEVICE_TARBALL back to "yes" and restarted my computer, the tons of devices didn't re-appear.  It seems like there's no going back once you set it to "no", since that setting saves and restores the devices in /dev.  I don't know of an easy way to get them back, so I can't make the bug happen anymore.

Anyway, to debug it with gdb would require that debugging information be in the konsole executable, wouldn't it?  I doubt it's there by default.
Comment 10 Kurt Hindenburg 2005-03-29 06:58:51 UTC
Do you still have problems?
Comment 11 Adam 2005-03-29 15:16:02 UTC
No, as I said I am unable to reproduce it - but it might still be there for people who chose RC_DEVICE_TARBALL as "yes" from the beginning.
Comment 12 Kurt Hindenburg 2005-03-29 21:45:48 UTC
I agree that Konsole should just quit with an error.
Comment 13 Kurt Hindenburg 2005-04-06 17:38:13 UTC
CVS commit by hindenburg: 

Exit somewhat gracefully when unable to allocate a PTY.

BUG: 87481


  M +1 -1      TEPty.cpp   1.89
  M +11 -2     session.cpp   1.102


--- kdebase/konsole/konsole/TEPty.cpp  #1.88:1.89
@@ -123,5 +123,5 @@ int TEPty::run(const char* _pgm, QStrLis
   setUsePty(All, _addutmp);
 
-  if (!start(NotifyOnExit, (Communication) (Stdin | Stdout)))
+  if ( start(NotifyOnExit, (Communication) (Stdin | Stdout)) == false )
      return -1;
 

--- kdebase/konsole/konsole/session.cpp  #1.101:1.102
@@ -97,4 +97,7 @@ TESession::TESession(TEWidget* _te, cons
 void TESession::ptyError()
 {
+  if ( sh->error().isEmpty() )
+    KMessageBox::error(te->topLevelWidget(), i18n("Unable to allocate a pseudo teletype!"));
+  else
   KMessageBox::error(te->topLevelWidget(), sh->error());
   emit done(this);
@@ -131,7 +134,13 @@ void TESession::run()
      QDir::setCurrent(initial_cwd);
   sh->setXonXoff(xon_xoff);
-  sh->run(QFile::encodeName(pgm), args, term.latin1(), winId, add_to_utmp,
+
+  int result = sh->run(QFile::encodeName(pgm), args, term.latin1(), 
+          winId, add_to_utmp,
           ("DCOPRef("+appId+",konsole)").latin1(),
           ("DCOPRef("+appId+","+sessionId+")").latin1());
+  if (result < 0) {     // Error in creating pseudo teletype
+    kdWarning()<<"Unable to allocate a pseudo teletype!"<<endl;
+    QTimer::singleShot(0, this, SLOT(ptyError()));
+  }
   if (!initial_cwd.isEmpty())
      QDir::setCurrent(cwd_save);
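
The pattern in this patch is to check the return value of run() and to schedule the error handler with a zero-delay timer, so that it fires only after run() has returned to the event loop. Below is a minimal sketch of that pattern without Qt. EventQueue and Session are hypothetical stand-ins for the event loop and TESession, and the startShell callback is an assumption standing in for the call that may fail.

```cpp
#include <cassert>
#include <functional>
#include <queue>
#include <string>
#include <utility>
#include <vector>

// Stand-in for an event loop: posted callbacks run later, in posting order.
class EventQueue {
public:
    void post(std::function<void()> fn) { pending_.push(std::move(fn)); }

    void drain() {
        while (!pending_.empty()) {
            std::function<void()> fn = std::move(pending_.front());
            pending_.pop();
            fn();
        }
    }

private:
    std::queue<std::function<void()>> pending_;
};

// Hypothetical session object; startShell returns a negative value on failure.
class Session {
public:
    Session(EventQueue& q, std::function<int()> startShell)
        : queue_(q), startShell_(std::move(startShell))
    {
        assert(startShell_ != nullptr);  // a callable must be supplied
    }

    void run() {
        log.push_back("run: begin");
        if (startShell_() < 0) {
            // Defer the handler so it executes after run() has finished,
            // in the same spirit as a zero-delay single-shot timer.
            queue_.post([this] { onPtyError(); });
        }
        log.push_back("run: end");
    }

    std::vector<std::string> log;  // records the order of events
    bool finished = false;         // set once the error has been handled

private:
    void onPtyError() {
        log.push_back("error: unable to open a PTY");
        finished = true;
    }

    EventQueue& queue_;
    std::function<int()> startShell_;
};
```

With a startShell that returns -1, run() completes normally and the error is handled only when the queue is drained, so the session is never torn down while run() is still on the stack.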
Comment 14 Adam 2005-04-06 21:27:39 UTC
Great!  That certainly looks like an improvement.

However, a suggestion - you may want to try to make that error message more understandable.  Many people aren't going to know what the heck a "pseudo teletype" is, or even what "allocate" means.
Comment 15 Kurt Hindenburg 2005-04-07 06:39:36 UTC
What would you suggest?  When it happened to you, what error message would you have understood?  I can understand people not recognizing 'pseudo teletype' == PTY
Comment 16 Kurt Hindenburg 2005-04-07 06:45:16 UTC
*** Bug 99618 has been marked as a duplicate of this bug. ***
Comment 17 Kurt Hindenburg 2005-04-08 20:41:42 UTC
CVS commit by hindenburg: 

Try to be more helpful when unable to open a PTY.

CCBUGS: 87481


  M +6 -3      session.cpp   1.103


--- kdebase/konsole/konsole/session.cpp  #1.102:1.103
@@ -98,5 +98,8 @@ void TESession::ptyError()
 {
   if ( sh->error().isEmpty() )
-    KMessageBox::error(te->topLevelWidget(), i18n("Unable to allocate a pseudo teletype!"));
+    KMessageBox::detailedError( te->topLevelWidget(),
+       i18n("Konsole is unable to open a pseudo teletype!"),
+       i18n("Konsole needs to have read/write access to the system's PTYs.  The exact location of the PTYs depends on the Linux kernel version (2.4.x or 2.6.x), the kernel configuration and which dev-filesystem (devfs or udev) is being used.  \nUsing Linux kernel 2.6.x and udev this is typically /dev/pts/"), 
+       i18n("A fatal error has occurred!") );
   else
     KMessageBox::error(te->topLevelWidget(), sh->error());
@@ -139,6 +142,6 @@ void TESession::run()
           ("DCOPRef("+appId+",konsole)").latin1(),
           ("DCOPRef("+appId+","+sessionId+")").latin1());
-  if (result < 0) {     // Error in creating pseudo teletype
-    kdWarning()<<"Unable to allocate a pseudo teletype!"<<endl;
+  if (result < 0) {     // Error in opening pseudo teletype
+    kdWarning()<<"Unable to open a pseudo teletype!"<<endl;
     QTimer::singleShot(0, this, SLOT(ptyError()));
   }
Comment 18 Adam 2005-04-08 23:44:14 UTC
Sorry for the delay.  That error message seems much better - though frankly, I was never completely sure what was causing it to be unable to allocate a pty (which means you probably know better than me what the error message should be).  So you believe it's the read/write permissions on the devices in /dev/pts?

I see now that what's happening on my system is that when Konsole opens, a new device is created in /dev/pts, with permissions being denied to everyone except the owner of the new Konsole process.  So when I had RC_DEVICE_TARBALL set to "yes", there were already the maximum number of ptys created, and perhaps also only root had permission to use any of them.  It seems to me that this is a fundamental limitation at a lower level; i.e. there should really be no limit on how many ptys the system can create.  If that's the case, it's true that I don't think Konsole can do much better than to give an error message that mentions ptys and is incomprehensible to some users; though in the long run I'd say someone should fix that limitation.
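
One way to check the permission theory on a given machine is to look at the owner and mode of a PTY device node and ask whether the current user may open it read/write. The sketch below does that with stat() and access(). DeviceAccess and checkDevice() are hypothetical names for illustration, and which path to pass (for example an entry under /dev/pts) depends on the system.

```cpp
#include <cassert>
#include <string>
#include <sys/stat.h>
#include <unistd.h>

// Summary of one device node (hypothetical type, illustration only).
struct DeviceAccess {
    bool exists = false;       // did stat() succeed?
    bool readWrite = false;    // may the calling user read and write it?
    unsigned permissions = 0;  // low nine mode bits (owner/group/other)
    unsigned owner = 0;        // numeric uid of the owner
};

// Inspect a device path without opening it.
DeviceAccess checkDevice(const std::string& path)
{
    assert(!path.empty());  // an empty path is a caller error

    DeviceAccess d;
    struct stat st;
    if (stat(path.c_str(), &st) != 0)
        return d;  // missing or unreachable: all fields keep their defaults

    d.exists = true;
    d.permissions = st.st_mode & 0777;
    d.owner = st.st_uid;
    // access() checks against the real (not effective) user and group IDs.
    d.readWrite = (access(path.c_str(), R_OK | W_OK) == 0);
    return d;
}
```

If the node exists but readWrite is false, the problem is one of permissions; if no usable node exists at all, it is one of device creation.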

Some things you might want to add to the error message, though - a notification that "Konsole could not be started because...", as well as "This is probably due to incorrect configuration of the pty devices".
Comment 19 Kurt Hindenburg 2005-04-11 06:45:41 UTC
CVS commit by hindenburg: 

Simplify PTY error message.

CCBUGS: 87481


  M +3 -3      session.cpp   1.104


--- kdebase/konsole/konsole/session.cpp  #1.103:1.104
@@ -97,8 +97,8 @@ TESession::TESession(TEWidget* _te, cons
 void TESession::ptyError()
 {
+  // FIXME:  sh->error() is always empty
   if ( sh->error().isEmpty() )
-    KMessageBox::detailedError( te->topLevelWidget(),
-       i18n("Konsole is unable to open a pseudo teletype!"),
-       i18n("Konsole needs to have read/write access to the system's PTYs.  The exact location of the PTYs depends on the Linux kernel version (2.4.x or 2.6.x), the kernel configuration and which dev-filesystem (devfs or udev) is being used.  \nUsing Linux kernel 2.6.x and udev this is typically /dev/pts/"), 
+    KMessageBox::error( te->topLevelWidget(),
+       i18n("Konsole is unable to open a PTY (pseudo teletype)!  This is likely due to an incorrect configuration of the PTY devices.  Konsole needs to have read/write access to the PTY devices."), 
        i18n("A fatal error has occurred!") );
   else