Bug 402827 - kwin_wayland segfault on monitor wakeup
Status: RESOLVED FIXED
Alias: None
Product: kwin
Classification: Plasma
Component: platform-drm
Version: 5.18.0
Platform: Manjaro Linux
Importance: NOR crash
Target Milestone: ---
Assignee: KWin default assignee
URL:
Keywords:
Duplicates: 402933
Depends on:
Blocks:
 
Reported: 2019-01-03 16:23 UTC by JordanL
Modified: 2021-11-05 09:39 UTC
CC List: 6 users

See Also:
Latest Commit:
Version Fixed In:
mgraesslin: Wayland+
mgraesslin: X11-


Attachments
System log from crash (28.47 KB, text/plain)
2019-01-03 16:23 UTC, JordanL
New log after rebuilding kwin and mesa with debug symbols (262.91 KB, text/plain)
2019-01-04 14:13 UTC, JordanL
Log of all threads before raise is called (27.05 KB, text/plain)
2019-01-04 16:51 UTC, JordanL

Description JordanL 2019-01-03 16:23:49 UTC
Created attachment 117266
System log from crash

SUMMARY
In a Wayland session, after the monitors go to sleep due to inactivity, kwin_wayland segfaults when I wake them up again.

STEPS TO REPRODUCE
1. Log into Wayland session
2. Wait for inactivity to cause screens to go to sleep
3. Move mouse / press key

OBSERVED RESULT
Black screen(s); the system does not respond to Ctrl+Alt+F1 and I have to hard reboot.

EXPECTED RESULT
Monitors wake up, can carry on using session.

SOFTWARE/OS VERSIONS
KDE Plasma Version: 5.14.4
KDE Frameworks Version: 5.53.0
Qt Version: 5.12.0

ADDITIONAL INFORMATION
Distro: Manjaro Linux
Kernel: 4.19.13-1-MANJARO x86_64
CPU: Intel Core i7-5820K
GPU: AMD RX Vega 64, using AMDGPU driver
Mesa: 18.3.1
Monitors: 2 * 3840x2160@60Hz, connected with DP 1.2


I attached a log from the crash. The log starts when the monitors go to sleep; I pressed a key about 15 seconds later. The relevant part seems to be:

Jan 03 14:43:49 rupert kernel: kwin_wayland[15576]: segfault at 1c0 ip 00007fd1e312f853 sp 00007ffc3f2c1038 error 4 in libgbm.so.1.0.0[7fd1e312f000+6000]
Jan 03 14:43:49 rupert kernel: Code: 00 00 00 31 c0 48 83 c4 08 c3 66 66 2e 0f 1f 84 00 00 00 00 00 66 90 48 8b 07 ff a0 48 01 00 00 0f 1f 80 00 00 00 00 48 8b 07 <ff> a0 30 01 00 00 0f 1f 80 00 00 00 00 48 8b 07 ff a0 38 01 00 00
Jan 03 14:43:49 rupert kernel: audit: type=1701 audit(1546526629.764:79): auid=1000 uid=1000 gid=1001 ses=4 pid=15576 comm="kwin_wayland" exe="/usr/bin/kwin_wayland" sig=11 res=1
Jan 03 14:43:49 rupert systemd[1]: Started Process Core Dump (PID 16033/UID 0). 
Jan 03 14:43:49 rupert audit[1]: SERVICE_START pid=1 uid=0 auid=4294967295 ses=4294967295 msg='unit=systemd-coredump@1-16033-0 comm="systemd" exe="/usr/lib/systemd/systemd" hostname=? addr=? terminal=? res=success'
Jan 03 14:43:49 rupert kernel: audit: type=1130 audit(1546526629.781:80): pid=1 uid=0 auid=4294967295 ses=4294967295 msg='unit=systemd-coredump@1-16033-0 comm="systemd" exe="/usr/lib/systemd/systemd" hostname=? addr=? terminal=? res=success'
(...)
Jan 03 14:43:52 rupert systemd-coredump[16034]: Core file was truncated to 2147483648 bytes.
Jan 03 14:43:53 rupert systemd-coredump[16034]: Process 15576 (kwin_wayland) of user 1000 dumped core.
    
                                                Stack trace of thread 15576:
                                                #0  0x00007fd1e312f853 n/a (n/a)
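Because the stack trace above has no symbols, the only usable location is the kernel's "ip ... in libgbm.so.1.0.0[base+size]" line. A minimal sketch of the arithmetic follows, using the values from that log line; the helper names are hypothetical. It assumes the bracketed base is the start of the mapped segment, so the result is an offset within that mapping and not necessarily a file offset.

```cpp
#include <cstdint>

// Values copied from the kernel log line in the description.
constexpr std::uint64_t kFaultIp     = 0x00007fd1e312f853ULL; // "ip"
constexpr std::uint64_t kMappingBase = 0x00007fd1e312f000ULL; // libgbm.so.1.0.0[base+...]
constexpr std::uint64_t kMappingSize = 0x6000ULL;             // ...[...+size]

// Offset of the faulting instruction from the start of the mapping
// (hypothetical helper, for illustration only).
constexpr std::uint64_t offsetInMapping(std::uint64_t ip, std::uint64_t base)
{
    return ip - base;
}

// True if ip lies inside the half-open range [base, base + size).
constexpr bool insideMapping(std::uint64_t ip, std::uint64_t base, std::uint64_t size)
{
    return ip >= base && ip - base < size;
}

// The faulting instruction is 0x853 bytes into the libgbm mapping.
static_assert(insideMapping(kFaultIp, kMappingBase, kMappingSize),
              "ip should fall inside the libgbm mapping");
static_assert(offsetInMapping(kFaultIp, kMappingBase) == 0x853,
              "offset of the faulting instruction");
```

Once debug symbols are installed, an offset like this can help locate the function in a disassembly of the library, provided the mapping assumption holds.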
Comment 1 Martin Flöser 2019-01-03 20:00:00 UTC
Unfortunately the backtrace is lacking debug symbols. If you are able to reproduce the crash, please install debug packages and try to get a backtrace. Your best chance is to attach to the running KWin through ssh.
Comment 2 JordanL 2019-01-04 11:08:50 UTC
Any idea which (if any) packages I need to rebuild with debug symbols? It looks like I need to rebuild mesa but I'm not 100% sure.

$ pacman -Qo /usr/lib/libgbm.so.1.0.0
/usr/lib/libgbm.so.1.0.0 is owned by mesa 18.3.1-1
Comment 3 JordanL 2019-01-04 13:06:03 UTC
Seems I also need to rebuild kwin with debug symbols (makes sense!). Think I'm close to having the backtrace now.
Comment 4 JordanL 2019-01-04 14:11:58 UTC
Sorry, even after installing kwin and mesa built with debug symbols, I get no backtrace and have no idea where to go next. I've attached the latest log I get (log4.txt).
Comment 5 JordanL 2019-01-04 14:13:10 UTC
Created attachment 117281
New log after rebuilding kwin and mesa with debug symbols
Comment 6 JordanL 2019-01-04 14:47:04 UTC
I worked out how to attach the debugger while kwin_wayland is still running. It throws SIGABRT when calling raise() in /usr/lib/libc.so.6 - so I will now rebuild glibc with debug symbols and see if that gets me anywhere.
Comment 7 JordanL 2019-01-04 16:28:51 UTC
Debugger log after rebuilding glibc with debug:


Continuing.
[Detaching after fork from child process 1548]

Thread 1 "kwin_wayland" received signal SIGABRT, Aborted.
__GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:50
50      ../sysdeps/unix/sysv/linux/raise.c: No such file or directory.
Detaching from program: /usr/bin/kwin_wayland, process 1158
[Inferior 1 (process 1158) detached]
Comment 8 JordanL 2019-01-04 16:51:28 UTC
Created attachment 117283
Log of all threads before raise is called
Comment 9 JordanL 2019-01-04 16:53:29 UTC
Attached a new log, "Log of all threads before raise is called".

This is a backtrace of all kwin threads at the point it called raise(). I believe thread 1 is the thread that is relevant here:


Thread 1 (Thread 0x7f6809ac7440 (LWP 3559)):
#0  0x00007f6811888c70 in __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:28
#1  0x00007f6811873672 in __GI_abort () at abort.c:79
#2  0x00007f6811c547fc in  () at /usr/lib/libQt5Core.so.5
#3  0x00007f6811c53c83 in  () at /usr/lib/libQt5Core.so.5
#4  0x00007f6807b5a70e in QVector<KWin::EglGbmBackend::Output>::at(int) const () at /usr/lib/qt/plugins/org.kde.kwin.waylandbackends/KWinWaylandDrmBackend.so
#5  0x00007f6807b5a70e in KWin::EglGbmBackend::prepareRenderingForScreen(int) (this=<optimized out>, screenId=<optimized out>)
    at /usr/src/debug/kwin-5.14.4/plugins/platforms/drm/egl_gbm_backend.cpp:342
#6  0x00007f68072f370e in KWin::SceneOpenGL::paint(QRegion, QList<KWin::Toplevel*>) (this=this@entry=0x55f28420b560, damage=..., toplevels=...)
    at /usr/src/debug/kwin-5.14.4/plugins/scenes/opengl/scene_opengl.cpp:663
#7  0x00007f6812c327de in KWin::Compositor::performCompositing() (this=0x7f67f800ae50) at /usr/src/debug/kwin-5.14.4/composite.cpp:745
#8  0x00007f68087db705 in drmHandleEvent () at /usr/lib/libdrm.so.2
#9  0x00007f6807b5b09a in KWin::DrmBackend::<lambda()>::operator() (__closure=<optimized out>) at /usr/src/debug/kwin-5.14.4/plugins/platforms/drm/drm_backend.cpp:270
#10 0x00007f6807b5b09a in QtPrivate::FunctorCall<QtPrivate::IndexesList<>, QtPrivate::List<>, void, KWin::DrmBackend::openDrm()::<lambda()> >::call
    (arg=<optimized out>, f=...) at /usr/include/qt/QtCore/qobjectdefs_impl.h:146
#11 0x00007f6807b5b09a in QtPrivate::Functor<KWin::DrmBackend::openDrm()::<lambda()>, 0>::call<QtPrivate::List<>, void> (arg=<optimized out>, f=...)
    at /usr/include/qt/QtCore/qobjectdefs_impl.h:256
#12 0x00007f6807b5b09a in QtPrivate::QFunctorSlotObject<KWin::DrmBackend::openDrm()::<lambda()>, 0, QtPrivate::List<>, void>::impl(int, QtPrivate::QSlotObjectBase *, QObject *, void **, bool *) (which=<optimized out>, this_=<optimized out>, r=<optimized out>, a=<optimized out>, ret=<optimized out>)
    at /usr/include/qt/QtCore/qobjectdefs_impl.h:439
#13 0x00007f6811e753e0 in QMetaObject::activate(QObject*, int, int, void**) () at /usr/lib/libQt5Core.so.5
#14 0x00007f6811e80eea in QSocketNotifier::activated(int, QSocketNotifier::QPrivateSignal) () at /usr/lib/libQt5Core.so.5
#15 0x00007f6811e81242 in QSocketNotifier::event(QEvent*) () at /usr/lib/libQt5Core.so.5
#16 0x00007f6812256e34 in QApplicationPrivate::notify_helper(QObject*, QEvent*) () at /usr/lib/libQt5Widgets.so.5
#17 0x00007f681225e671 in QApplication::notify(QObject*, QEvent*) () at /usr/lib/libQt5Widgets.so.5
#18 0x00007f6811e4a8f9 in QCoreApplication::notifyInternal2(QObject*, QEvent*) () at /usr/lib/libQt5Core.so.5
#19 0x00007f6811e9d710 in QEventDispatcherUNIXPrivate::activateSocketNotifiers() () at /usr/lib/libQt5Core.so.5
#20 0x00007f6811e9da19 in QEventDispatcherUNIX::processEvents(QFlags<QEventLoop::ProcessEventsFlag>) () at /usr/lib/libQt5Core.so.5
#21 0x00007f68097c28be in QUnixEventDispatcherQPA::processEvents(QFlags<QEventLoop::ProcessEventsFlag>) () at /usr/lib/qt/plugins/platforms/KWinQpaPlugin.so
#22 0x00007f6811e4958c in QEventLoop::exec(QFlags<QEventLoop::ProcessEventsFlag>) () at /usr/lib/libQt5Core.so.5
#23 0x00007f6811e51896 in QCoreApplication::exec() () at /usr/lib/libQt5Core.so.5
#24 0x000055f282fb8822 in main(int, char**) (argc=<optimized out>, argv=<optimized out>) at /usr/src/debug/kwin-5.14.4/main_wayland.cpp:807
Comment 10 Martin Flöser 2019-01-04 19:01:31 UTC
Thanks, that is a really good backtrace.
Comment 11 Martin Flöser 2019-01-05 09:00:36 UTC
My theory is that temporarily no screens are connected and we still try to render. From what I can see in the code, we possibly don't handle this situation correctly. It looks like we don't allow the screen count to go to 0, even though there are 0 screens.
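The abort in frames #0 to #4 of the backtrace is consistent with a failed bounds check in QVector::at() when the output list is empty. The following is a minimal sketch of the kind of guard that would avoid it. This is not the change made in KWin: Output, outputForScreen and the free function prepareRenderingForScreen are hypothetical stand-ins, and std::vector is used in place of QVector so the example is self-contained.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical stand-in for KWin::EglGbmBackend::Output; the real type
// carries GBM/EGL state that is irrelevant to the bounds problem.
struct Output {
    int id;
};

// Returns the output for screenId, or nullptr if the index is out of range.
// An empty list (no screen connected) makes every index out of range.
const Output *outputForScreen(const std::vector<Output> &outputs, int screenId)
{
    if (screenId < 0 || static_cast<std::size_t>(screenId) >= outputs.size()) {
        return nullptr;
    }
    return &outputs[static_cast<std::size_t>(screenId)];
}

// Sketch of a render entry point that reports "nothing to do" instead of
// indexing into an empty list and aborting.
bool prepareRenderingForScreen(const std::vector<Output> &outputs, int screenId)
{
    const Output *output = outputForScreen(outputs, screenId);
    if (!output) {
        return false; // no such output; the caller should skip this frame
    }
    // ... make the output's rendering context current here ...
    return true;
}
```

With an empty list, prepareRenderingForScreen(outputs, 0) returns false, so a caller can skip compositing until an output reappears. Whether skipping the frame is the right behaviour for the real compositor is an assumption of this sketch.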
Comment 12 Martin Flöser 2019-01-05 11:55:11 UTC
A shot-in-the-dark patch: https://phabricator.kde.org/D17985
Comment 13 JordanL 2019-01-05 12:10:34 UTC
I'll attempt to test this patch. Is there any info you can point me to that would help me rebuild kwin with it? Worst case, I could manually apply the changes in the source and rebuild, but I'm sure there's a quicker way!
Comment 14 JordanL 2019-01-05 18:02:21 UTC
"svn patch" wouldn't play ball, so I manually applied the patch to 5.14.4 and rebuilt. The issue still occurs. The backtrace is different though, so I think this is progress. The backtrace differs from #8 onwards as far as I can see.


Thread 1 (Thread 0x7fe8a331e440 (LWP 1160)):
#0  0x00007fe8ab0dfc70 in __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:28
#1  0x00007fe8ab0ca672 in __GI_abort () at abort.c:79
#2  0x00007fe8ab4ab7fc in  () at /usr/lib/libQt5Core.so.5
#3  0x00007fe8ab4aac83 in  () at /usr/lib/libQt5Core.so.5
#4  0x00007fe8a13b170e in QVector<KWin::EglGbmBackend::Output>::at(int) const () at /usr/lib/qt/plugins/org.kde.kwin.waylandbackends/KWinWaylandDrmBackend.so
#5  0x00007fe8a13b170e in KWin::EglGbmBackend::prepareRenderingForScreen(int) (this=<optimized out>, screenId=<optimized out>) at /usr/src/debug/kwin-5.14.4/plugins/platforms/drm/egl_gbm_backend.cpp:342
#6  0x00007fe8a034970e in KWin::SceneOpenGL::paint(QRegion, QList<KWin::Toplevel*>) (this=this@entry=0x55db40fa39a0, damage=..., toplevels=...) at /usr/src/debug/kwin-5.14.4/plugins/scenes/opengl/scene_opengl.cpp:663
#7  0x00007fe8ac4897de in KWin::Compositor::performCompositing() (this=0x55db40989f30) at /usr/src/debug/kwin-5.14.4/composite.cpp:745
#8  0x00007fe8ab6ccb1b in QObject::event(QEvent*) () at /usr/lib/libQt5Core.so.5
#9  0x00007fe8abaade34 in QApplicationPrivate::notify_helper(QObject*, QEvent*) () at /usr/lib/libQt5Widgets.so.5
#10 0x00007fe8abab5671 in QApplication::notify(QObject*, QEvent*) () at /usr/lib/libQt5Widgets.so.5
#11 0x00007fe8ab6a18f9 in QCoreApplication::notifyInternal2(QObject*, QEvent*) () at /usr/lib/libQt5Core.so.5
#12 0x00007fe8ab6f6955 in QTimerInfoList::activateTimers() () at /usr/lib/libQt5Core.so.5
#13 0x00007fe8ab6f4a9e in QEventDispatcherUNIX::processEvents(QFlags<QEventLoop::ProcessEventsFlag>) () at /usr/lib/libQt5Core.so.5
#14 0x00007fe8a30198be in QUnixEventDispatcherQPA::processEvents(QFlags<QEventLoop::ProcessEventsFlag>) () at /usr/lib/qt/plugins/platforms/KWinQpaPlugin.so
#15 0x00007fe8ab6a058c in QEventLoop::exec(QFlags<QEventLoop::ProcessEventsFlag>) () at /usr/lib/libQt5Core.so.5
#16 0x00007fe8ab6a8896 in QCoreApplication::exec() () at /usr/lib/libQt5Core.so.5
#17 0x000055db3f62d852 in main(int, char**) (argc=<optimized out>, argv=<optimized out>) at /usr/src/debug/kwin-5.14.4/main_wayland.cpp:807
Comment 15 Martin Flöser 2019-01-05 18:32:24 UTC
It's still the same crash. Backtraces are read top to bottom: frames #0 to #7 are identical, only the caller of performCompositing() differs.
Comment 16 Martin Flöser 2019-01-06 17:15:27 UTC
*** Bug 402933 has been marked as a duplicate of this bug. ***
Comment 17 Rainer Finke 2019-01-20 11:33:15 UTC
After the upgrade to Plasma 5.15 beta, kwin doesn't segfault anymore when the monitor is turned off via the power settings after e.g. 5 minutes. I can then just continue to work with the Plasma session on Wayland after pressing a key on the keyboard.

But it still doesn't work (with the external AMD GPU) if I turn the monitor off and on with the power button; kwin still segfaults then.
Comment 18 JordanL 2019-01-21 13:46:52 UTC
Well that's progress at least, I might be able to use the Wayland session full time if so. Could you get the backtrace from the segfault when you turn the monitor off? In case it's different?
Comment 19 Rainer Finke 2019-01-23 07:07:16 UTC
The segfault is happening only when turning the monitor on. It seems like kwin doesn't get the monitor, at least not fast enough. The referenced bug report contains my segfault, but I will test again at the weekend.
Comment 20 JordanL 2019-02-16 18:16:05 UTC
Just retested on Arch with Plasma 5.15.0, KDE frameworks 5.55.0, Qt 5.12.1, same hardware as before.

The bug is still present when the monitors turn off due to power saving: kwin_wayland dumps core.
Comment 21 Rainer Finke 2019-05-16 21:52:21 UTC
Kwin is still crashing when turning on the monitor.

Operating System: Arch Linux 
KDE Plasma Version: 5.15.90
KDE Frameworks Version: 5.58.0
Qt Version: 5.13.0
Kernel Version: 5.1.2-arch1-1-ARCH
OS Type: 64-bit
Processors: 16 × AMD Ryzen 7 1700 Eight-Core Processor
Memory: 31,4 GiB of RAM
Comment 22 Rainer Finke 2019-10-12 07:53:11 UTC
Still crashing on Plasma 5.17 Beta when turning on the monitor with the power button.

Operating System: Arch Linux 
KDE Plasma Version: 5.16.90
KDE Frameworks Version: 5.62.0
Qt Version: 5.14.0
Kernel Version: 5.3.6-arch1-1-ARCH
OS Type: 64-bit
AMD GPU
Comment 23 Roman Gilg 2020-01-08 09:12:01 UTC
Git commit 2632e4182c658178af82be175575b094002468af by Roman Gilg.
Committed on 08/01/2020 at 09:12.
Pushed by romangilg into branch 'master'.

[platforms/drm] Allow running without outputs

Summary:
Set outputs enablement also when none outputs are present. This patch is
similar to earlier attempt at D17985.
Related: bug 389551, bug 398680, bug 413758

Test Plan:
Starting without outputs, manual disconnects and DPMS changes. There is still
an issue when an output gets disconnected while the DPMS is off. But it's an
improvement already.

Reviewers: #kwin, davidedmundson

Reviewed By: #kwin, davidedmundson

Subscribers: kwin

Tags: #kwin

Maniphest Tasks: T10016

Differential Revision: https://phabricator.kde.org/D26511

M  +2    -6    plugins/platforms/drm/drm_backend.cpp

https://commits.kde.org/kwin/2632e4182c658178af82be175575b094002468af
Comment 24 Stijn Tintel 2020-02-11 11:16:59 UTC
This is not fixed, it is even written in the commit message:

> There is still an issue when an output gets disconnected while the DPMS is off.

I have an iiyama b2888uhsu where I seem to hit that.
Comment 25 Rainer Finke 2020-02-18 13:15:31 UTC
Kwin crashed today after disconnecting and reconnecting the monitor and I was thrown back to SDDM.

Operating System: Arch Linux 
KDE Plasma Version: 5.18.0
KDE Frameworks Version: 5.67.0
Qt Version: 5.14.1
Kernel Version: 5.5.4-arch1-1
Comment 26 Rainer Finke 2020-05-29 10:34:45 UTC
Since Plasma 5.19 beta I can turn off and on the monitor with the power button without any crash of kwin. Seems like there is a fix that helped at least on my system.

Operating System: Arch Linux 
KDE Plasma Version: 5.18.90
KDE Frameworks Version: 5.70.0
Qt Version: 5.15.0
Kernel Version: 5.6.15-arch1-1
Comment 27 kde.org 2021-11-04 22:34:38 UTC
This bug report is quite old and Rainer Finke reported the issue resolved. Can anyone still reproduce this issue with KDE 5.23? If so, can you please install debugging packages following the info provided in https://community.kde.org/Guidelines_and_HOWTOs/Debugging/How_to_create_useful_crash_reports, try to reproduce the bug and submit a backtrace with debugging information?
Comment 28 JordanL 2021-11-05 09:34:27 UTC
I can't reproduce it (since returning to Plasma Wayland with 5.22.5); however, I did replace my monitors with new ones which have far fewer issues with this sort of thing than my previous ones.
Comment 29 kde.org 2021-11-05 09:39:04 UTC
Thank you for reporting that the issue cannot be reproduced anymore. Will close this report, assuming the bug has been fixed. Should the problem arise again, either this bug report can be reopened or a new one can be created (preferably).