Bug 259800 - Plasma desktop crashes randomly in 64 bit desktop
Status: RESOLVED NOT A BUG
Alias: None
Product: kwin
Classification: Plasma
Component: general
Version: unspecified
Platform: Ubuntu Linux
Importance: NOR crash
Target Milestone: ---
Assignee: KWin default assignee
Reported: 2010-12-14 02:57 UTC by Leo Milano
Modified: 2010-12-19 00:49 UTC

Description Leo Milano 2010-12-14 02:57:19 UTC
Application: kwin (4.5.4 (KDE 4.5.4))
KDE Platform Version: 4.5.4 (KDE 4.5.4)
Qt Version: 4.7.0
Operating System: Linux 2.6.35-23-server x86_64
Distribution: Ubuntu 10.10

-- Information about the crash:
- What I was doing when the application crashed: a plasma crash happens every once in a while when running Flash-based applications, and sometimes in other contexts. It happens to all users on the machine. This seems to happen with both the fglrx and the open source ATI radeon drivers. Many times plasma settings are overwritten and important configuration files for plasma, kmail and other components are completely lost.

The crash can be reproduced some of the time.

-- Backtrace:
Application: KWin (kwin), signal: Segmentation fault
[Current thread is 1 (Thread 0x7fc8d36ed780 (LWP 4065))]

Thread 2 (Thread 0x7fc8b14d9700 (LWP 4087)):
#0  0x00007fc8d30732c3 in select () at ../sysdeps/unix/syscall-template.S:82
#1  0x00007fc8cf9c076e in qt_safe_select (nfds=17, fdread=0x22a9ce0, fdwrite=0x22a9f78, fdexcept=0x22aa210, orig_timeout=0x0) at kernel/qcore_unix.cpp:82
#2  0x00007fc8cf9c5beb in QEventDispatcherUNIXPrivate::doSelect (this=0x22a9b20, flags=<value optimized out>, timeout=<value optimized out>) at kernel/qeventdispatcher_unix.cpp:219
#3  0x00007fc8cf9c681b in QEventDispatcherUNIX::processEvents (this=0x21a5ba0, flags=) at kernel/qeventdispatcher_unix.cpp:919
#4  0x00007fc8cf995a02 in QEventLoop::processEvents (this=<value optimized out>, flags=) at kernel/qeventloop.cpp:149
#5  0x00007fc8cf995dec in QEventLoop::exec (this=0x7fc8b14d8d90, flags=) at kernel/qeventloop.cpp:201
#6  0x00007fc8cf8a02fd in QThread::exec (this=<value optimized out>) at thread/qthread.cpp:490
#7  0x00007fc8cf9755f8 in QInotifyFileSystemWatcherEngine::run (this=0x22a8b10) at io/qfilesystemwatcher_inotify.cpp:248
#8  0x00007fc8cf8a327e in QThreadPrivate::start (arg=0x22a8b10) at thread/qthread_unix.cpp:266
#9  0x00007fc8cf618971 in start_thread (arg=<value optimized out>) at pthread_create.c:304
#10 0x00007fc8d307a92d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:112
#11 0x0000000000000000 in ?? ()

Thread 1 (Thread 0x7fc8d36ed780 (LWP 4065)):
[KCrash Handler]
#6  KAboutData::appName (this=0x1ea2b38) at ../../kdecore/kernel/kaboutdata.cpp:664
#7  0x00007fc8d0085c61 in KComponentData::componentName (this=<value optimized out>) at ../../kdecore/kernel/kcomponentdata.cpp:225
#8  0x00007fc8d2baa16a in KNotification::sendEvent (this=0x1f08830) at ../../kdeui/notifications/knotification.cpp:369
#9  0x00007fc8d2baa47c in KNotification::qt_metacall (this=0x1f08830, _c=QMetaObject::InvokeMetaMethod, _id=<value optimized out>, _a=0x2132eb0) at ./knotification.moc:109
#10 0x00007fc8cf9a8bde in QObject::event (this=0x1f08830, e=0x7fff4c08bea0) at kernel/qobject.cpp:1219
#11 0x00007fc8ceae0fdc in QApplicationPrivate::notify_helper (this=0x1e0f2d0, receiver=0x1f08830, e=0x1f1a740) at kernel/qapplication.cpp:4396
#12 0x00007fc8ceae6aed in QApplication::notify (this=0x7fff4c08c9b0, receiver=0x1f08830, e=0x1f1a740) at kernel/qapplication.cpp:4277
#13 0x00007fc8d2b74576 in KApplication::notify (this=0x7fff4c08c9b0, receiver=0x1f08830, event=0x1f1a740) at ../../kdeui/kernel/kapplication.cpp:310
#14 0x00007fc8cf996cdc in QCoreApplication::notifyInternal (this=0x7fff4c08c9b0, receiver=0x1f08830, event=0x1f1a740) at kernel/qcoreapplication.cpp:732
#15 0x00007fc8cf999c22 in sendEvent (receiver=0x0, event_type=<value optimized out>, data=0x1dec1d0) at ../../include/QtCore/../../src/corelib/kernel/qcoreapplication.h:215
#16 QCoreApplicationPrivate::sendPostedEvents (receiver=0x0, event_type=<value optimized out>, data=0x1dec1d0) at kernel/qcoreapplication.cpp:1373
#17 0x00007fc8ceb94a44 in sendPostedEvents (this=<value optimized out>, flags=) at ../../include/QtCore/../../src/corelib/kernel/qcoreapplication.h:220
#18 QEventDispatcherX11::processEvents (this=<value optimized out>, flags=) at kernel/qeventdispatcher_x11.cpp:75
#19 0x00007fc8cf995a02 in QEventLoop::processEvents (this=<value optimized out>, flags=) at kernel/qeventloop.cpp:149
#20 0x00007fc8cf995dec in QEventLoop::exec (this=0x7fff4c08c8f0, flags=) at kernel/qeventloop.cpp:201
#21 0x00007fc8cf999ebb in QCoreApplication::exec () at kernel/qcoreapplication.cpp:1009
#22 0x00007fc8d3369f9d in kdemain () from /usr/lib/kde4/libkdeinit/libkdeinit4_kwin.so
#23 0x00007fc8d2fb2d8e in __libc_start_main (main=<value optimized out>, argc=<value optimized out>, ubp_av=<value optimized out>, init=<value optimized out>, fini=<value optimized out>, rtld_fini=<value optimized out>, stack_end=0x7fff4c08cfb8) at libc-start.c:226
#24 0x0000000000400669 in _start ()

Reported using DrKonqi
Comment 1 Thomas Lübking 2010-12-14 16:17:31 UTC
errr..
"Many times plasma settings are overwritten and important configuration files for plasma, kmail and other components are completely lost."

Are you saying this is the outcome of the segfault in the window manager (kwin), or do random components of your KDE desktop (e.g. plasma-desktop or kmail) crash and then lose their temporary settings because of the unclean exit?

Flash + GL compositing in kwin does indeed cause segfaults in some drivers or Mesa, but that would produce different backtraces. This one looks related to KNotification (information popups, audio hints, i.e. "ching"), and a bug in this library would actually affect the entire desktop.
Comment 2 Leo Milano 2010-12-15 03:00:03 UTC
Hi Thomas,

I disabled desktop effects long ago, just in case, because, as you said, they are a possible source of issues.

Typically, different components crash: sometimes the network management applet, sometimes plasma-desktop (and I can actually press [alt-f2] and re-run it), and sometimes the system freezes (which I think means kwin is segfaulting). Sometimes a component crashes and I get the window to report a bug, and before I can do that, or just close that pop-up, kwin crashes.

So, whatever the course of events, it seems the config files are left in an inconsistent state whenever the desktop freezes and you need to power-cycle the computer.

Many thanks!
Leo
Comment 3 Thomas Lübking 2010-12-15 13:03:36 UTC
First off: whenever you get a backtrace, does it always look like this? (if in doubt, attach some ;-)
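
To compare several backtraces at a glance, here is a minimal sketch (not part of DrKonqi or any KDE tooling; the function name crash_frame is made up for illustration). It assumes the report was saved as plain text and prints the first stack frame that follows the "[KCrash Handler]" marker, which is the frame where the crash happened:

```shell
#!/bin/sh
# Sketch only: crash_frame is a hypothetical helper, not a KDE tool.
# Reads a saved backtrace from the given file(s), or from stdin if none,
# and prints the first "#N ..." frame after the "[KCrash Handler]" line.
crash_frame() {
    awk '
        found && /^#[0-9]+/    { print; exit }   # first frame after the marker
        /\[KCrash Handler\]/   { found = 1 }     # remember that the marker was seen
    ' "$@"
}
```

For the backtrace in this report it prints the "#6  KAboutData::appName ..." line. If other saved traces yield a different frame, they are probably different crashes.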

Whenever the "desktop freezes", try ctrl+alt+F1, which should take you to a VT. If it does not, there is a problem in X11 (or the kernel has halted) and you will have to use the "reisub" SysRq sequence. (Since you are not using compositing, the lack of visual updates can NOT be caused by kwin, unless it grabbed the server, but that is unlikely.)

If there are visual updates (clock?) but you cannot change the active window, and you can navigate to a VT, log in there, run "export DISPLAY=:0" and then "kwin --replace&".
Then use "ctrl+alt+F7" to switch back.
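
The VT steps above can be sketched as a small shell function. This is only an illustration under assumptions: restart_kwin is a hypothetical helper name, and :0 is assumed to be the display of the stuck session (it can be passed explicitly if it differs).

```shell
#!/bin/sh
# Sketch only: restart_kwin is a hypothetical helper name.
# Run it from a text console (VT) after logging in as the desktop user.
restart_kwin() {
    display="${1:-:0}"    # assumed display of the X session; defaults to :0
    if ! command -v kwin >/dev/null 2>&1; then
        echo "kwin not found in PATH" >&2
        return 1
    fi
    # Point kwin at the running X server and let it replace the current
    # window manager; run it in the background so the VT stays usable.
    DISPLAY="$display" kwin --replace >/dev/null 2>&1 &
    echo "started kwin --replace on $display (pid $!)"
}
```

Calling "restart_kwin" with no argument does the same as the two commands quoted above.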

Last one: depending on your GPU/driver, you might wish to turn off KMS (add "nomodeset radeon.modeset=0" to the kernel parameters in GRUB), but without further information it is impossible to say where this bug originates; it could even be caused by a broken HDD or broken RAM :-\
(run memtest and check SMART reports to rule this out)
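
A rough sketch of how these checks could look from a shell. The helper name kms_disabled is made up for illustration, the device name /dev/sda is an assumption, and smartctl comes from the smartmontools package, which may not be installed:

```shell
#!/bin/sh
# Sketch only: kms_disabled is a hypothetical helper name.
# Succeeds (exit status 0) if the given kernel command line contains
# one of the two KMS-disabling parameters mentioned above.
kms_disabled() {
    case " $1 " in
        *" nomodeset "*|*" radeon.modeset=0 "*) return 0 ;;
        *) return 1 ;;
    esac
}

# Usage on the running system (after rebooting with the new parameters):
#   kms_disabled "$(cat /proc/cmdline)" && echo "KMS is off" || echo "KMS is on"
#
# Overall SMART health of the first disk (assumed here to be /dev/sda):
#   sudo smartctl -H /dev/sda
```

The RAM test is not covered here because memtest is normally started from the boot menu rather than from a running system.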

Final note: even when X11 (the GUI) is completely out of order, try to shut the box down cleanly from a VT. If you cannot even access the VT anymore, and you happen to run an sshd and have a second computer, try to log in via ssh and shut it down from there.
Pressing the reset button is about the last thing you should do - ever.
Comment 4 Leo Milano 2010-12-16 14:07:01 UTC
Thanks so much, Thomas, that's a good idea (attaching more backtraces). Sorry for not mentioning it, but yes, I have tried switching to another VT (ctrl+alt+f1 and such). This works when a KDE component crashes, but not once the desktop freezes. I can't ssh into my machines because of a network setup issue I have no time to look at :) Overall, I always try to avoid power cycling, and all your suggestions are great, but these seem like real kernel-level crashes.

I will keep looking at this. Thanks for all the ideas. Yesterday I uninstalled some older KDE packages that needed to be kept installed to run ksensors. I'll see if that helps. I'll also try with and without KMS, though these crashes happen both with fglrx and radeon, so I am thinking it is something else. And you are right, it could be a hardware issue. If we conclude this is not a KDE issue I'll close this bug report right away (I am really trying to identify the bug, if any)

Cheers!
Leo
Comment 5 Leo Milano 2010-12-17 04:37:36 UTC
Hi,

A little update. Yesterday I removed the following OLD packages (they were being held back because I had ksensors installed from previous versions of Kubuntu):

lmilano@grisell:~$ sudo dpkg -l |grep kdelibs4c2a
rc  kdelibs4c2a                                       4:3.5.10.dfsg.1-3ubuntu2.10.10.1                  core libraries and binaries for all KDE applications
lmilano@grisell:~$ sudo dpkg -l |grep kdelibs-data
rc  kdelibs-data                                      4:3.5.10.dfsg.1-3ubuntu2.10.10.1                  core shared data for all KDE applications

The system has been stable so far. I wonder if somehow these old packages were providing shared libraries that were being picked up by newer KDE packages (I noticed you guys use a pimpl pattern in some parts of the code), and probably reading uninitialized memory.
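
One way to look for similar leftovers, as a rough sketch: the function below (list_kde3_leftovers is a made-up name) filters "dpkg -l" output for packages whose version string starts with 3.5, optionally preceded by an epoch such as "4:", and prints their status and name. A status of "rc", as in the listing above, means the package was removed but its configuration files remain.

```shell
#!/bin/sh
# Sketch only: list_kde3_leftovers is a hypothetical helper name.
# Expects `dpkg -l` output on stdin (columns: status, name, version, ...)
# and prints "status name" for every package with a 3.5.x version.
list_kde3_leftovers() {
    awk '$3 ~ /^([0-9]+:)?3\.5\./ { print $1, $2 }'
}

# Usage:
#   dpkg -l | list_kde3_leftovers
#
# To see which shared libraries the kwin module from the backtrace
# actually resolves to (path taken from frame #22 above):
#   ldd /usr/lib/kde4/libkdeinit/libkdeinit4_kwin.so
```

Note that the version match is only a heuristic: any package that happens to be at version 3.5.x will be listed, not just old KDE libraries.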

Anyways, let's see if it keeps being stable, I'll check back in a while. 

Thank you!
Comment 6 Martin Flöser 2010-12-18 11:30:39 UTC
Seems to be caused by a library conflict. If the removal did not help, feel free to reopen.
Comment 7 Leo Milano 2010-12-19 00:49:19 UTC
Yes, Martin, this makes sense. So far, the machine has been rock solid. Thank you, Thomas, for all the debugging ideas, you rock :)

Cheers!
Leo