Bug 382283 - Restart of KWin fails due to OpenGL safepoint thread running
Status: RESOLVED FIXED
Alias: None
Product: kwin
Classification: Plasma
Component: platform-x11-standalone
Version: 5.10.3
Platform: Arch Linux, Linux
Importance: NOR crash
Target Milestone: ---
Assignee: KWin default assignee
URL: https://phabricator.kde.org/D6735
Keywords: drkonqi
Duplicates: 382385
Depends on:
Blocks:
 
Reported: 2017-07-12 17:24 UTC by tesfabpel
Modified: 2017-07-17 15:07 UTC

See Also:
Latest Commit:
Version Fixed In: 5.10.4
mgraesslin: Wayland-
mgraesslin: X11+
mgraesslin: ReviewRequest+


Description tesfabpel 2017-07-12 17:24:16 UTC
Application: kwin_x11 (5.10.3)

Qt Version: 5.9.1
Frameworks Version: 5.35.0
Operating System: Linux 4.11.7-1-ARCH x86_64
Distribution: "Arch Linux"

-- Information about the crash:
Sometimes when opening some apps (mostly Chrome, Steam, and others) or doing some other things, kwin crashes.
I'm using ArchLinux x64 and NVIDIA proprietary graphics drivers 381.22 (sorry about that :()

The crash can be reproduced sometimes.

-- Backtrace:
Application: KWin (kwin_x11), signal: Aborted
Using host libthread_db library "/usr/lib/libthread_db.so.1".
[Current thread is 1 (Thread 0x7f0f1de7d840 (LWP 1189))]

Thread 5 (Thread 0x7f0f008fe700 (LWP 2280)):
#0  0x00007f0f1d928326 in ppoll () at /usr/lib/libc.so.6
#1  0x00007f0f1af471a1 in qt_safe_poll(pollfd*, unsigned long, timespec const*) () at /usr/lib/libQt5Core.so.5
#2  0x00007f0f1af488be in QEventDispatcherUNIX::processEvents(QFlags<QEventLoop::ProcessEventsFlag>) () at /usr/lib/libQt5Core.so.5
#3  0x00007f0f1aef0efa in QEventLoop::exec(QFlags<QEventLoop::ProcessEventsFlag>) () at /usr/lib/libQt5Core.so.5
#4  0x00007f0f1ad1079a in QThread::exec() () at /usr/lib/libQt5Core.so.5
#5  0x00007f0f1ad1537d in  () at /usr/lib/libQt5Core.so.5
#6  0x00007f0f16ad9297 in start_thread () at /usr/lib/libpthread.so.0
#7  0x00007f0f1d9321ef in clone () at /usr/lib/libc.so.6

Thread 4 (Thread 0x7f0ee9986700 (LWP 1411)):
#0  0x00007f0f16adf39d in pthread_cond_wait@@GLIBC_2.3.2 () at /usr/lib/libpthread.so.0
#1  0x00007f0f19e6db04 in  () at /usr/lib/libQt5Script.so.5
#2  0x00007f0f19e6db49 in  () at /usr/lib/libQt5Script.so.5
#3  0x00007f0f16ad9297 in start_thread () at /usr/lib/libpthread.so.0
#4  0x00007f0f1d9321ef in clone () at /usr/lib/libc.so.6

Thread 3 (Thread 0x7f0f0215d700 (LWP 1304)):
#0  0x00007f0f1d928326 in ppoll () at /usr/lib/libc.so.6
#1  0x00007f0f1af471a1 in qt_safe_poll(pollfd*, unsigned long, timespec const*) () at /usr/lib/libQt5Core.so.5
#2  0x00007f0f1af488be in QEventDispatcherUNIX::processEvents(QFlags<QEventLoop::ProcessEventsFlag>) () at /usr/lib/libQt5Core.so.5
#3  0x00007f0f1aef0efa in QEventLoop::exec(QFlags<QEventLoop::ProcessEventsFlag>) () at /usr/lib/libQt5Core.so.5
#4  0x00007f0f1ad1079a in QThread::exec() () at /usr/lib/libQt5Core.so.5
#5  0x00007f0f1499ad45 in  () at /usr/lib/libQt5DBus.so.5
#6  0x00007f0f1ad1537d in  () at /usr/lib/libQt5Core.so.5
#7  0x00007f0f16ad9297 in start_thread () at /usr/lib/libpthread.so.0
#8  0x00007f0f1d9321ef in clone () at /usr/lib/libc.so.6

Thread 2 (Thread 0x7f0f04389700 (LWP 1201)):
#0  0x00007f0f1d92824d in poll () at /usr/lib/libc.so.6
#1  0x00007f0f1ca118e0 in  () at /usr/lib/libxcb.so.1
#2  0x00007f0f1ca13679 in xcb_wait_for_event () at /usr/lib/libxcb.so.1
#3  0x00007f0f0546de99 in  () at /usr/lib/libQt5XcbQpa.so.5
#4  0x00007f0f1ad1537d in  () at /usr/lib/libQt5Core.so.5
#5  0x00007f0f16ad9297 in start_thread () at /usr/lib/libpthread.so.0
#6  0x00007f0f1d9321ef in clone () at /usr/lib/libc.so.6

Thread 1 (Thread 0x7f0f1de7d840 (LWP 1189)):
[KCrash Handler]
#5  0x00007f0f1d878670 in raise () at /usr/lib/libc.so.6
#6  0x00007f0f1d879d00 in abort () at /usr/lib/libc.so.6
#7  0x00007f0f1ad002c7 in  () at /usr/lib/libQt5Core.so.5
#8  0x00007f0f1ad0f6ed in QThread::~QThread() () at /usr/lib/libQt5Core.so.5
#9  0x00007f0f1ad0f749 in QThread::~QThread() () at /usr/lib/libQt5Core.so.5
#10 0x00007f0f1af1f40b in QObjectPrivate::deleteChildren() () at /usr/lib/libQt5Core.so.5
#11 0x00007f0f1af28d9b in QObject::~QObject() () at /usr/lib/libQt5Core.so.5
#12 0x00007f0f01738309 in KWin::X11StandalonePlatform::~X11StandalonePlatform() () at /usr/lib/qt/plugins/org.kde.kwin.platforms/KWinX11Platform.so
#13 0x00007f0f1af1f40b in QObjectPrivate::deleteChildren() () at /usr/lib/libQt5Core.so.5
#14 0x00007f0f1af28d9b in QObject::~QObject() () at /usr/lib/libQt5Core.so.5
#15 0x00007f0f1aef4b76 in QCoreApplication::~QCoreApplication() () at /usr/lib/libQt5Core.so.5
#16 0x00007f0f1bc18ec9 in QApplication::~QApplication() () at /usr/lib/libQt5Widgets.so.5
#17 0x00007f0f1dbf1743 in  () at /usr/lib/libkdeinit5_kwin_x11.so
#18 0x00007f0f1dbf3029 in kdemain () at /usr/lib/libkdeinit5_kwin_x11.so
#19 0x00007f0f1d86543a in __libc_start_main () at /usr/lib/libc.so.6
#20 0x000000000040069a in _start ()

Reported using DrKonqi
Comment 1 Martin Flöser 2017-07-12 18:59:39 UTC
As always backtraces from Arch are useless.
Comment 2 tesfabpel 2017-07-12 19:04:22 UTC
I know but this time drkonqi said that backtraces were useful and gave 3 stars so I tried... :(
I will try again after recompiling with debug symbols...
Comment 3 tesfabpel 2017-07-13 10:39:17 UTC
Here's the new backtrace (this time when opening Unreal Tournament 4), I hope it's useful:


Application: KWin (kwin_x11), signal: Aborted
Using host libthread_db library "/usr/lib/libthread_db.so.1".
[Current thread is 1 (Thread 0x7f66b8ba4840 (LWP 12181))]

Thread 4 (Thread 0x7f66975ef700 (LWP 13859)):
#0  0x00007f66b864f326 in ppoll () at /usr/lib/libc.so.6
#1  0x00007f66b5c753f1 in qt_ppoll (timeout_ts=0x0, nfds=1, fds=0x7f668c015338) at kernel/qcore_unix.cpp:81
#2  0x00007f66b5c753f1 in qt_safe_poll(pollfd*, unsigned long, timespec const*) (fds=0x7f668c015338, nfds=1, timeout_ts=timeout_ts@entry=0x0) at kernel/qcore_unix.cpp:102
#3  0x00007f66b5c76b0e in QEventDispatcherUNIX::processEvents(QFlags<QEventLoop::ProcessEventsFlag>) (this=<optimized out>, flags=...) at kernel/qeventdispatcher_unix.cpp:500
#4  0x00007f66b5c1e34a in QEventLoop::exec(QFlags<QEventLoop::ProcessEventsFlag>) (this=this@entry=0x7f66975eed70, flags=..., flags@entry=...) at kernel/qeventloop.cpp:212
#5  0x00007f66b5a39bea in QThread::exec() (this=<optimized out>) at thread/qthread.cpp:515
#6  0x00007f66b5a3e82d in QThreadPrivate::start(void*) (arg=0x1ac7640) at thread/qthread_unix.cpp:368
#7  0x00007f66b1802297 in start_thread () at /usr/lib/libpthread.so.0
#8  0x00007f66b86591ef in clone () at /usr/lib/libc.so.6

Thread 3 (Thread 0x7f66892db700 (LWP 12447)):
#0  0x00007f66b180839d in pthread_cond_wait@@GLIBC_2.3.2 () at /usr/lib/libpthread.so.0
#1  0x00007f66b4b96b04 in  () at /usr/lib/libQt5Script.so.5
#2  0x00007f66b4b96b49 in  () at /usr/lib/libQt5Script.so.5
#3  0x00007f66b1802297 in start_thread () at /usr/lib/libpthread.so.0
#4  0x00007f66b86591ef in clone () at /usr/lib/libc.so.6

Thread 2 (Thread 0x7f669ce86700 (LWP 12288)):
#0  0x00007f66b864f326 in ppoll () at /usr/lib/libc.so.6
#1  0x00007f66b5c753f1 in qt_ppoll (timeout_ts=0x0, nfds=1, fds=0x7f669000ad68) at kernel/qcore_unix.cpp:81
#2  0x00007f66b5c753f1 in qt_safe_poll(pollfd*, unsigned long, timespec const*) (fds=0x7f669000ad68, nfds=1, timeout_ts=timeout_ts@entry=0x0) at kernel/qcore_unix.cpp:102
#3  0x00007f66b5c76b0e in QEventDispatcherUNIX::processEvents(QFlags<QEventLoop::ProcessEventsFlag>) (this=<optimized out>, flags=...) at kernel/qeventdispatcher_unix.cpp:500
#4  0x00007f66b5c1e34a in QEventLoop::exec(QFlags<QEventLoop::ProcessEventsFlag>) (this=this@entry=0x7f669ce85d40, flags=..., flags@entry=...) at kernel/qeventloop.cpp:212
#5  0x00007f66b5a39bea in QThread::exec() (this=this@entry=0x7f66af937d60 <(anonymous namespace)::Q_QGS__q_manager::innerFunction()::holder>) at thread/qthread.cpp:515
#6  0x00007f66af6c3d45 in QDBusConnectionManager::run() (this=0x7f66af937d60 <(anonymous namespace)::Q_QGS__q_manager::innerFunction()::holder>) at qdbusconnection.cpp:178
#7  0x00007f66b5a3e82d in QThreadPrivate::start(void*) (arg=0x7f66af937d60 <(anonymous namespace)::Q_QGS__q_manager::innerFunction()::holder>) at thread/qthread_unix.cpp:368
#8  0x00007f66b1802297 in start_thread () at /usr/lib/libpthread.so.0
#9  0x00007f66b86591ef in clone () at /usr/lib/libc.so.6

Thread 1 (Thread 0x7f66b8ba4840 (LWP 12181)):
[KCrash Handler]
#5  0x00007f66b859f670 in raise () at /usr/lib/libc.so.6
#6  0x00007f66b85a0d00 in abort () at /usr/lib/libc.so.6
#7  0x00007f66b5a2959f in qt_message_fatal (context=..., message=<synthetic pointer>...) at global/qlogging.cpp:1690
#8  0x00007f66b5a2959f in QMessageLogger::fatal(char const*, ...) const (this=this@entry=0x7fff0f2db710, msg=msg@entry=0x7f66b5ccd078 "QThread: Destroyed while thread is still running") at global/qlogging.cpp:796
#9  0x00007f66b5a38afd in QThread::~QThread() (this=0x1ac7640, __in_chrg=<optimized out>) at thread/qthread.cpp:429
#10 0x00007f66b5a38b59 in QThread::~QThread() (this=0x1ac7640, __in_chrg=<optimized out>) at thread/qthread.cpp:433
#11 0x00007f66b5c4cf7b in QObjectPrivate::deleteChildren() (this=this@entry=0x1903820) at kernel/qobject.cpp:1992
#12 0x00007f66b5c56b2b in QObject::~QObject() (this=<optimized out>, __in_chrg=<optimized out>) at kernel/qobject.cpp:1022
#13 0x00007f669c4612d9 in KWin::X11StandalonePlatform::~X11StandalonePlatform() (this=0x1904070, __in_chrg=<optimized out>) at /mnt/CommonExtraLinuxFiles/tmp/gnao/kwin/src/kwin-5.10.3.1/plugins/platforms/x11/standalone/x11_platform.h:33
#14 0x00007f66b5c4cf7b in QObjectPrivate::deleteChildren() (this=this@entry=0x18722d0) at kernel/qobject.cpp:1992
#15 0x00007f66b5c56b2b in QObject::~QObject() (this=<optimized out>, __in_chrg=<optimized out>) at kernel/qobject.cpp:1022
#16 0x00007f66b5c21fb6 in QCoreApplication::~QCoreApplication() (this=0x7fff0f2dbae0, __in_chrg=<optimized out>) at kernel/qcoreapplication.cpp:851
#17 0x00007f66b693ee99 in QApplication::~QApplication() (this=0x7fff0f2dbae0, __in_chrg=<optimized out>) at kernel/qapplication.cpp:795
#18 0x00007f66b89187a3 in KWin::ApplicationX11::~ApplicationX11() (this=0x7fff0f2dbae0, __in_chrg=<optimized out>) at /mnt/CommonExtraLinuxFiles/tmp/gnao/kwin/src/kwin-5.10.3.1/main_x11.cpp:187
#19 0x00007f66b891a089 in kdemain(int, char**) (argc=<optimized out>, argv=0x7fff0f2dbc78) at /mnt/CommonExtraLinuxFiles/tmp/gnao/kwin/src/kwin-5.10.3.1/main_x11.cpp:411
#20 0x00007f66b858c43a in __libc_start_main () at /usr/lib/libc.so.6
#21 0x00000000004006ea in _start ()
Comment 4 Martin Flöser 2017-07-13 14:48:01 UTC
It's a crash when KWin exits. Are the games trying to quit KWin?
Comment 5 tesfabpel 2017-07-13 16:20:10 UTC
Seems strange though...

I noticed these things:
- Just after login, if I open Chrome right away, kwin may crash.
- If I open UT4, kwin may crash.
- If I relaunch KWin using ALT+F2 -> `kwin_x11 --replace &`, it crashes less often (or never). Could it be that the first instance of KWin has something weird that new ones don't?
- When KWin crashes, a notification appears and says (more or less): "Desktop effects are broken and they will be restarted". But then nothing happens.
- With a previous version, when this happened, desktop effects restarted correctly (this is why I decided to open this bug).

Could it be caused by the creation of an OpenGL context (don't know about Vulkan)?
By the way, I suspect the culprit here is the NVIDIA proprietary driver, but kwin should still handle it correctly...
Comment 6 Martin Flöser 2017-07-13 17:07:49 UTC
Are you using an NVIDIA card? If yes, are you on KWin 5.10.3.1 or 5.10.3?
Comment 7 tesfabpel 2017-07-13 21:05:36 UTC
Yes, I'm using an NVIDIA EVGA GTX 970 SSC (driver version: 381.22).
kwin 5.10.3.1
Comment 8 Martin Flöser 2017-07-14 04:16:38 UTC
I'm starting to get an idea. The compositor breaks in the NVIDIA driver and we try to restart the complete window manager. This fails because a thread in the X11 platform is still running.
Comment 9 Martin Flöser 2017-07-15 18:53:30 UTC
*** Bug 382385 has been marked as a duplicate of this bug. ***
Comment 10 Martin Flöser 2017-07-16 15:45:28 UTC
I do not really understand why KWin is being shut down. We used to have code which restarted into XRender, but that seems to be gone. So I don't understand why KWin gets closed.
Comment 11 Martin Flöser 2017-07-16 16:05:06 UTC
Patch for the abort at: https://phabricator.kde.org/D6735

If you also have any idea why KWin gets terminated, please let me know. If possible, check .xsession-errors for a related message.
Comment 12 Martin Flöser 2017-07-17 14:52:40 UTC
Git commit 06a558e3de658f300b295beac7c4adc4f08227f5 by Martin Flöser.
Committed on 16/07/2017 at 16:04.
Pushed by graesslin into branch 'Plasma/5.10'.

[platforms/x11] Quit the OpenGL Freeze protection thread on shutdown

Summary:
Weird NVIDIA behavior fixup part 2. Now that we do no longer freeze when
NVIDIA decides to create an OpenGL error on startup
(aefb5f4dd9d41aa7377d56ece203089c73aefe07), we experience a new issue.
KWin is terminating (no idea why, [1]) and at the same time the OpenGL freeze
protection thread is still running. So far we did not terminate the
thread on shutdown and thus we hit an abort in Qt.

This change ensures that we properly terminate the thread on shutdown.

[1] My current theory is that games terminate KWin, common pattern of
bug reports is "steam".
FIXED-IN: 5.10.4

Test Plan:
Tortured KWin by making sure I go through the code path,
saw the abort without the patch, no more abort with the patch

Reviewers: #kwin, #plasma

Subscribers: plasma-devel, kwin

Tags: #kwin

Differential Revision: https://phabricator.kde.org/D6735

M  +8    -1    plugins/platforms/x11/standalone/x11_platform.cpp

https://commits.kde.org/kwin/06a558e3de658f300b295beac7c4adc4f08227f5
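
For readers unfamiliar with this failure mode: the abort in the backtrace ("QThread: Destroyed while thread is still running") comes from destroying a thread object whose thread has not finished, and the commit above avoids it by stopping the thread during shutdown. The following is a minimal sketch of that pattern, not the actual KWin patch. It uses std::thread from the C++ standard library instead of QThread so that it is self-contained; the class WatchdogOwner and all of its members are hypothetical names chosen for illustration.

```cpp
#include <atomic>
#include <cassert>
#include <condition_variable>
#include <mutex>
#include <thread>

// Hypothetical stand-in for an object that owns a long-running helper thread.
// None of these names come from KWin; they only illustrate the pattern.
class WatchdogOwner {
public:
    WatchdogOwner() : m_worker([this] { run(); }) {}

    // Stop and join the worker before the std::thread member is destroyed.
    // Destroying a still-joinable std::thread calls std::terminate(), which is
    // comparable to Qt aborting when a running QThread is destroyed.
    ~WatchdogOwner() { shutdown(); }

    // Safe to call more than once.
    void shutdown() {
        {
            std::lock_guard<std::mutex> lock(m_mutex);
            m_quit = true;               // ask the worker loop to leave
        }
        m_cv.notify_all();               // wake the worker if it is waiting
        if (m_worker.joinable()) {
            m_worker.join();             // block until the worker has returned
        }
        assert(!m_worker.joinable());
    }

    bool isRunning() const { return m_running.load(); }

private:
    void run() {
        m_running = true;
        std::unique_lock<std::mutex> lock(m_mutex);
        // Sleep until shutdown() sets the quit flag.
        m_cv.wait(lock, [this] { return m_quit; });
        m_running = false;
    }

    std::mutex m_mutex;
    std::condition_variable m_cv;
    bool m_quit = false;
    std::atomic<bool> m_running{false};
    std::thread m_worker; // declared last so the members above exist before run() starts
};
```

With this shape, letting a WatchdogOwner go out of scope while its worker is idle no longer terminates the process, because the destructor first requests the stop and then waits for the thread. In Qt terms, the corresponding step would presumably be calling QThread::quit() followed by QThread::wait() before the QThread is deleted.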
Comment 13 tesfabpel 2017-07-17 15:07:40 UTC
Thanks a lot!
I've applied the patch and tested it for a bit, and KWin is still running without a crash.