Bug 461874

Summary: Kwin_x11 randomly dies in KWin::Application::OperationMode
Product: [Plasma] kwin
Reporter: Mal Haak <insanemal>
Component: general
Assignee: KWin default assignee <kwin-bugs-null>
Status: RESOLVED WORKSFORME
Severity: crash
CC: nate, sigxcpu
Priority: NOR
Version: 5.26.3
Target Milestone: ---
Platform: Arch Linux
OS: Linux
Latest Commit:
Version Fixed In:
Sentry Crash Report:

Description Mal Haak 2022-11-15 15:11:47 UTC
SUMMARY
***
After a recent update (Possibly the 5.26 release) Kwin_x11 has started crashing after an indeterminate length of time.
Attempts to restart kwin_x11 fail with file not found. 
***


STEPS TO REPRODUCE
1. Log in
2. Use machine as normal

OBSERVED RESULT
kwin_x11 will die

EXPECTED RESULT
kwin_x11 does not die

SOFTWARE/OS VERSIONS
Operating System: Arch Linux
KDE Plasma Version: 5.26.3
KDE Frameworks Version: 5.99.0
Qt Version: 5.15.7
Kernel Version: 6.0.8-arch1-1 (64-bit)
Graphics Platform: X11

ADDITIONAL INFORMATION
Processors: 8 × 11th Gen Intel® Core™ i7-1165G7 @ 2.80GHz
Memory: 38.9 GiB of RAM
Graphics Processor: Mesa Intel® Xe Graphics
Manufacturer: LENOVO
Product Name: 20W400G4AU
System Version: ThinkPad T15 Gen 2i

I'll attach the log entries I think are related. I will also update the ticket with the exact errors kwin_x11 --replace& throws the next time it happens (at this point that is a daily, or sometimes more frequent, event).

Nov 15 11:29:54 mallenovo kwin_x11[1021]: kwin_core: XCB error: 152 (BadDamage), sequence: 58189, resource id: 17407832, major code: 143 (DAMAGE), minor code: 3 (Subtract)
Nov 15 11:30:09 mallenovo kwin_x11[1021]: kwin_core: XCB error: 152 (BadDamage), sequence: 1058, resource id: 17408701, major code: 143 (DAMAGE), minor code: 3 (Subtract)
Nov 15 11:31:29 mallenovo kded5[1020]: Service  ":1.476" unregistered
Nov 15 11:31:57 mallenovo kwin_x11[1021]: kwin_core: XCB error: 152 (BadDamage), sequence: 43279, resource id: 17413666, major code: 143 (DAMAGE), minor code: 3 (Subtract)
Nov 15 11:34:13 mallenovo kwin_x11[1021]: kwin_core: XCB error: 152 (BadDamage), sequence: 59254, resource id: 17415424, major code: 143 (DAMAGE), minor code: 3 (Subtract)
Nov 15 11:34:54 mallenovo kwin_x11[1021]: The X11 connection broke: I/O error (code 1)
Nov 15 11:34:54 mallenovo kwin_x11[1021]: XIO:  fatal IO error 22 (Invalid argument) on X server ":0"
Nov 15 11:34:54 mallenovo kwin_x11[1021]:       after 5961990 requests (5961990 known processed) with 0 events remaining.
Nov 15 11:34:54 mallenovo kwin_x11[1021]: file:///usr/share/kwin/outline/plasma/outline.qml:14: TypeError: Cannot read property 'longDuration' of null
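
The journal excerpt above can be summarised mechanically. What follows is a minimal Python sketch, not part of the original report, that counts the XCB errors by error and request name and decodes the errno number in the XIO line. The regular expressions and the function name summarise are assumptions based only on the line shapes quoted here.

```python
import errno
import re
from collections import Counter

# Patterns are assumptions derived from the line shapes in the excerpt above.
XCB_RE = re.compile(r"XCB error: (\d+) \((\w+)\).*?minor code: (\d+) \((\w+)\)")
XIO_RE = re.compile(r"XIO:\s+fatal IO error (\d+) \(([^)]*)\)")

def summarise(lines):
    """Return (Counter of (error, request) pairs, errno names seen in XIO lines)."""
    xcb = Counter()
    xio = []
    for line in lines:
        m = XCB_RE.search(line)
        if m:
            xcb[(m.group(2), m.group(4))] += 1
            continue
        m = XIO_RE.search(line)
        if m:
            code = int(m.group(1))
            # Fall back to the raw number if the platform has no name for it.
            xio.append(errno.errorcode.get(code, str(code)))
    return xcb, xio

# Three lines copied from the excerpt, with the journal prefix removed.
sample = [
    "kwin_core: XCB error: 152 (BadDamage), sequence: 58189, resource id: 17407832, major code: 143 (DAMAGE), minor code: 3 (Subtract)",
    "kwin_core: XCB error: 152 (BadDamage), sequence: 1058, resource id: 17408701, major code: 143 (DAMAGE), minor code: 3 (Subtract)",
    'XIO:  fatal IO error 22 (Invalid argument) on X server ":0"',
]

xcb_counts, xio_errors = summarise(sample)
print(xcb_counts, xio_errors)
```

Fed with a full journal dump instead of the three sample lines, the counter would show whether BadDamage/Subtract is the only XCB error that precedes the connection loss.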
Comment 1 Nate Graham 2022-11-15 19:04:59 UTC
> After a recent update (Possibly the 5.26 release) Kwin_x11 has started crashing after an indeterminate length of time.
Please provide the backtrace of the crash. See https://community.kde.org/Guidelines_and_HOWTOs/Debugging/How_to_create_useful_crash_reports.

> Attempts to restart kwin_x11 fail with file not found. 
Please paste terminal output when you try to do this.

> Nov 15 11:34:54 mallenovo kwin_x11[1021]: The X11 connection broke: I/O error (code 1)
> Nov 15 11:34:54 mallenovo kwin_x11[1021]: XIO:  fatal IO error 22 (Invalid argument) on X server ":0"
This very much looks like an issue with the X server itself...
Comment 2 Vlad Zahorodnii 2022-11-16 09:49:44 UTC
Nov 15 11:29:54 mallenovo kwin_x11[1021]: kwin_core: XCB error: 152 (BadDamage), sequence: 58189, resource id: 17407832, major code: 143 (DAMAGE), minor code: 3 (Subtract)

Do you know what makes kwin_x11 print so many of these warnings? For example, did you close or resize a window? You shouldn't see this warning in general.
Comment 3 Mal Haak 2022-11-16 13:05:33 UTC
So it's just started to play up. It hasn't crashed yet, but I'm getting lots of BadDamage errors like this one I previously posted:

Nov 15 11:29:54 mallenovo kwin_x11[1021]: kwin_core: XCB error: 152 (BadDamage), sequence: 58189, resource id: 17407832, major code: 143 (DAMAGE), minor code: 3 (Subtract)

Basically, when I move windows at the moment I get that old-school Windows 95 blur effect over the windows behind the one I'm moving. It fixes itself when I stop moving windows.

It could be an X issue, but I'm starting to think it might actually be an issue with the Intel Xe driver stack, as I have NVIDIA-based machines running the same versions of everything else and they appear unaffected.
Comment 4 Mal Haak 2022-11-20 14:27:19 UTC
The X11 connection broke: I/O error (code 1)
XIO:  fatal IO error 2 (No such file or directory) on X server ":0"
      after 5184 requests (5184 known processed) with 0 events remaining.

And I think I found the issue... 

[ 1461.860945] Out of memory: Killed process 829 (kwin_x11) total-vm:54533932kB, anon-rss:449464kB, file-rss:0kB, shmem-rss:4kB, UID:1000 pgtables:102528>

I think kwin_x11 has a memory leak, but I think it sometimes kills X as well.
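
To put the OOM-killer line above into more familiar units, here is a minimal Python sketch (not from the original report). The helper name oom_fields_kib is hypothetical, and reading the kernel's kB as KiB is an assumption. The line is truncated after pgtables in the original, so only the fields that carry a kB suffix are parsed.

```python
import re

# The kernel line from this comment, without the timestamp and truncated tail.
OOM_LINE = (
    "Out of memory: Killed process 829 (kwin_x11) total-vm:54533932kB, "
    "anon-rss:449464kB, file-rss:0kB, shmem-rss:4kB, UID:1000"
)

def oom_fields_kib(line):
    """Extract every 'name:<n>kB' field from an OOM-killer line as integers."""
    return {name: int(value) for name, value in re.findall(r"([\w-]+):(\d+)kB", line)}

fields = oom_fields_kib(OOM_LINE)
total_vm_gib = fields["total-vm"] / 1024**2   # KiB -> GiB
anon_rss_mib = fields["anon-rss"] / 1024      # KiB -> MiB
print(f"total-vm: {total_vm_gib:.1f} GiB, anon-rss: {anon_rss_mib:.1f} MiB")
```

With the values from this line, total-vm comes to about 52.0 GiB of virtual address space and anon-rss to about 438.9 MiB of resident anonymous memory. Those two numbers alone neither confirm nor rule out a leak in kwin_x11.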
Comment 5 Alexandru Pirvulescu 2022-11-22 15:42:48 UTC
I can reproduce it every time I zoom my desktop (Desktop Zoom effect). It zooms and crashes within 1-2 seconds, taking Xorg along with it. I am running KDE Neon (on Ubuntu 22.04).

Build with issue: 4:5.26.3-0xneon+22.04+jammy+release+build21
Build without issue:  4:5.26.3-0xneon+22.04+jammy+release+build20

Stacktrace of kwin_x11 coredump:

                                              Stack trace of thread 38156:
                                                 #0  0x00007f987aa96a7c __pthread_kill_implementation (libc.so.6 + 0x96a7c)
                                                 #1  0x00007f987aa42476 __GI_raise (libc.so.6 + 0x42476)
                                                 #2  0x00007f987aa287f3 __GI_abort (libc.so.6 + 0x287f3)
                                                 #3  0x00007f987bc91ba3 qt_message_fatal (libQt5Core.so.5 + 0x91ba3)
                                                 #4  0x00007f987c339e23 _ZN22QGuiApplicationPrivate25createPlatformIntegrationEv (libQt5Gui.so.5 + 0x139e23)
                                                 #5  0x00007f987c33a308 _ZN22QGuiApplicationPrivate21createEventDispatcherEv (libQt5Gui.so.5 + 0x13a308)
                                                 #6  0x00007f987bec2d77 _ZN23QCoreApplicationPrivate4initEv (libQt5Core.so.5 + 0x2c2d77)
                                                 #7  0x00007f987c33d270 _ZN22QGuiApplicationPrivate4initEv (libQt5Gui.so.5 + 0x13d270)
                                                 #8  0x00007f987b371d2d _ZN19QApplicationPrivate4initEv (libQt5Widgets.so.5 + 0x171d2d)
                                                 #9  0x00007f987d82c9b6 _ZN4KWin11ApplicationC1ENS0_13OperationModeERiPPc (libkwin.so.5 + 0x22c9b6)
                                                 #10 0x000055e9861d22be n/a (kwin_x11 + 0x462be)
                                                 #11 0x00007f987aa29d90 __libc_start_call_main (libc.so.6 + 0x29d90)
                                                 #12 0x00007f987aa29e40 __libc_start_main_impl (libc.so.6 + 0x29e40)
                                                 #13 0x000055e9861d3ae5 n/a (kwin_x11 + 0x47ae5)
                                                 
                                                 Stack trace of thread 38203:
                                                 #0  0x00007f987ab18d7f __GI___poll (libc.so.6 + 0x118d7f)
                                                 #1  0x00007f98796a6696 n/a (libglib-2.0.so.0 + 0xaa696)
                                                 #2  0x00007f987964f3c3 g_main_context_iteration (libglib-2.0.so.0 + 0x533c3)
                                                 #3  0x00007f987bf15af8 _ZN20QEventDispatcherGlib13processEventsE6QFlagsIN10QEventLoop17ProcessEventsFlagEE (libQt5Core.so.5 + 0x315af8)
                                                 #4  0x00007f987beba9bb _ZN10QEventLoop4execE6QFlagsINS_17ProcessEventsFlagEE (libQt5Core.so.5 + 0x2ba9bb)
                                                 #5  0x00007f987bccd4e2 _ZN7QThread4execEv (libQt5Core.so.5 + 0xcd4e2)
                                                 #6  0x00007f987b185f1b n/a (libQt5DBus.so.5 + 0x18f1b)
                                                 #7  0x00007f987bcce703 _ZN14QThreadPrivate5startEPv (libQt5Core.so.5 + 0xce703)
                                                 #8  0x00007f987aa94b43 start_thread (libc.so.6 + 0x94b43)
                                                 #9  0x00007f987ab26a00 __clone3 (libc.so.6 + 0x126a00)
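
The mangled symbols in the kwin_x11 trace above can be partly decoded by hand. Below is a minimal Python sketch, not part of the original comment, that splits a frame line into its fields and decodes only the leading scope of an Itanium-mangled name. The regular expression and the helper name nested_name are assumptions, and parameters, templates and substitutions are deliberately ignored.

```python
import re

# Frame shape assumed from the systemd-coredump output quoted above:
#   #<n>  0x<addr> <symbol> (<module> + 0x<offset>)
FRAME_RE = re.compile(r"#(\d+)\s+0x([0-9a-f]+)\s+(\S+)\s+\((\S+) \+ 0x([0-9a-f]+)\)")

def nested_name(symbol):
    """Decode the leading '_ZN<len><id>...' scope of an Itanium-mangled name.

    Partial on purpose: anything after the scope is dropped. Symbols that do
    not start with '_ZN' are returned unchanged.
    """
    if not symbol.startswith("_ZN"):
        return symbol
    parts, i = [], 3
    while i < len(symbol) and symbol[i].isdigit():
        j = i
        while j < len(symbol) and symbol[j].isdigit():
            j += 1
        length = int(symbol[i:j])          # length prefix of the identifier
        parts.append(symbol[j:j + length])
        i = j + length
    # 'C1'/'C2' right after the scope marks a constructor of the innermost class.
    if symbol[i:i + 2] in ("C1", "C2") and parts:
        parts.append(parts[-1])
    return "::".join(parts)

# Frame #9 of thread 38156, copied from the trace above.
frame = "#9  0x00007f987d82c9b6 _ZN4KWin11ApplicationC1ENS0_13OperationModeERiPPc (libkwin.so.5 + 0x22c9b6)"
m = FRAME_RE.search(frame)
print(m.group(4), nested_name(m.group(3)))
```

Read this way, frame #9 is the KWin::Application constructor, with OperationMode appearing as a parameter type, and frames #3 and #4 show qt_message_fatal being reached from QGuiApplicationPrivate::createPlatformIntegration. A full demangler such as c++filt handles the parts this sketch ignores.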

Stacktrace of Xorg coredump:

                                                 Stack trace of thread 3234:
                                                 #0  0x00007f10bf496a7c __pthread_kill_implementation (libc.so.6 + 0x96a7c)
                                                 #1  0x00007f10bf442476 __GI_raise (libc.so.6 + 0x42476)
                                                 #2  0x00007f10bf4287f3 __GI_abort (libc.so.6 + 0x287f3)
                                                 #3  0x0000560c8d6a6340 OsAbort (Xorg + 0x1d9340)
                                                 #4  0x0000560c8d6abb49 n/a (Xorg + 0x1deb49)
                                                 #5  0x0000560c8d6acb3a FatalError (Xorg + 0x1dfb3a)
                                                 #6  0x0000560c8d6a374d n/a (Xorg + 0x1d674d)
                                                 #7  0x00007f10bf442520 __restore_rt (libc.so.6 + 0x42520)
                                                 #8  0x0000560c8d5381e9 n/a (Xorg + 0x6b1e9)
                                                 #9  0x0000560c8d539c85 n/a (Xorg + 0x6cc85)
                                                 #10 0x0000560c8d53a183 n/a (Xorg + 0x6d183)
                                                 #11 0x0000560c8d53b183 n/a (Xorg + 0x6e183)
                                                 #12 0x0000560c8d619a7d n/a (Xorg + 0x14ca7d)
                                                 #13 0x0000560c8d643e4c XkbHandleActions (Xorg + 0x176e4c)
                                                 #14 0x0000560c8d63cf21 n/a (Xorg + 0x16ff21)
                                                 #15 0x0000560c8d63d11e n/a (Xorg + 0x17011e)
                                                 #16 0x0000560c8d69cc60 n/a (Xorg + 0x1cfc60)
                                                 #17 0x0000560c8d69cee8 WaitForSomething (Xorg + 0x1cfee8)
                                                 #18 0x0000560c8d52d257 n/a (Xorg + 0x60257)
                                                 #19 0x0000560c8d531524 n/a (Xorg + 0x64524)
                                                 #20 0x00007f10bf429d90 __libc_start_call_main (libc.so.6 + 0x29d90)
                                                 #21 0x00007f10bf429e40 __libc_start_main_impl (libc.so.6 + 0x29e40)
                                                 #22 0x0000560c8d51a5f5 _start (Xorg + 0x4d5f5)
Comment 6 Alexandru Pirvulescu 2022-11-22 15:44:12 UTC
Here is another stack trace for kwin_x11; it looks different:

                                                 Stack trace of thread 4851:
                                                 #0  0x00007faf91a96a7c __pthread_kill_implementation (libc.so.6 + 0x96a7c)
                                                 #1  0x00007faf91a42476 __GI_raise (libc.so.6 + 0x42476)
                                                 #2  0x00007faf94df30a5 _ZN6KCrash19defaultCrashHandlerEi (libKF5Crash.so.5 + 0x80a5)
                                                 #3  0x00007faf91a42520 __restore_rt (libc.so.6 + 0x42520)
                                                 #4  0x00007faf94803445 _ZN4KWin9Workspace14workspaceEventEP19xcb_generic_event_t (libkwin.so.5 + 0x203445)
                                                 #5  0x00007faf92eb9467 _ZN24QAbstractEventDispatcher17filterNativeEventERK10QByteArrayPvPl (libQt5Core.so.5 + 0x2b9467)
                                                 #6  0x00007faf8d2cc595 _ZN14QXcbConnection14handleXcbEventEP19xcb_generic_event_t (libQt5XcbQpa.so.5 + 0x4d595)
                                                 #7  0x00007faf8d2cdcd6 _ZN14QXcbConnection16processXcbEventsE6QFlagsIN10QEventLoop17ProcessEventsFlagEE (libQt5XcbQpa.so.5 + 0x4ecd6)
                                                 #8  0x00007faf8d2f4c87 n/a (libQt5XcbQpa.so.5 + 0x75c87)
                                                 #9  0x00007faf90720d1b g_main_context_dispatch (libglib-2.0.so.0 + 0x55d1b)
                                                 #10 0x00007faf907756f8 n/a (libglib-2.0.so.0 + 0xaa6f8)
                                                 #11 0x00007faf9071e3c3 g_main_context_iteration (libglib-2.0.so.0 + 0x533c3)
                                                 #12 0x00007faf92f15af8 _ZN20QEventDispatcherGlib13processEventsE6QFlagsIN10QEventLoop17ProcessEventsFlagEE (libQt5Core.so.5 + 0x315af8)
                                                 #13 0x00007faf92eba9bb _ZN10QEventLoop4execE6QFlagsINS_17ProcessEventsFlagEE (libQt5Core.so.5 + 0x2ba9bb)
                                                 #14 0x00007faf92ec2f54 _ZN16QCoreApplication4execEv (libQt5Core.so.5 + 0x2c2f54)
                                                 #15 0x00005622b70f775b n/a (kwin_x11 + 0x4675b)
                                                 #16 0x00007faf91a29d90 __libc_start_call_main (libc.so.6 + 0x29d90)
                                                 #17 0x00007faf91a29e40 __libc_start_main_impl (libc.so.6 + 0x29e40)
                                                 #18 0x00005622b70f8ae5 n/a (kwin_x11 + 0x47ae5)
                                                 
                                                 Stack trace of thread 4860:
                                                 #0  0x00007faf91b18d7f __GI___poll (libc.so.6 + 0x118d7f)
                                                 #1  0x00007faf90775696 n/a (libglib-2.0.so.0 + 0xaa696)
                                                 #2  0x00007faf9071e3c3 g_main_context_iteration (libglib-2.0.so.0 + 0x533c3)
                                                 #3  0x00007faf92f15b6e _ZN20QEventDispatcherGlib13processEventsE6QFlagsIN10QEventLoop17ProcessEventsFlagEE (libQt5Core.so.5 + 0x315b6e)
                                                 #4  0x00007faf92eba9bb _ZN10QEventLoop4execE6QFlagsINS_17ProcessEventsFlagEE (libQt5Core.so.5 + 0x2ba9bb)
                                                 #5  0x00007faf92ccd4e2 _ZN7QThread4execEv (libQt5Core.so.5 + 0xcd4e2)
                                                 #6  0x00007faf92910f1b n/a (libQt5DBus.so.5 + 0x18f1b)
                                                 #7  0x00007faf92cce703 _ZN14QThreadPrivate5startEPv (libQt5Core.so.5 + 0xce703)
                                                 #8  0x00007faf91a94b43 start_thread (libc.so.6 + 0x94b43)
                                                 #9  0x00007faf91b26a00 __clone3 (libc.so.6 + 0x126a00)
Comment 7 Mal Haak 2022-11-23 09:10:26 UTC
I've also found that some web-based STL viewers can trigger this, but only on my Intel GPUs. Same thing: kwin_x11 crashes and takes out X.

Sometimes I see a GPU HANG in dmesg, and I've lodged a ticket against i915 for this as well. It feels like kwin is issuing an unsupported *something* (a draw call, perhaps; I believe some line thicknesses are an issue on Intel, but I'm unsure what that means as I'm not a GPU coder).
Comment 8 Alexandru Pirvulescu 2022-12-02 21:32:45 UTC
For me, this seems to be solved in the current Neon version: 4:5.26.4-0xneon+22.04+jammy+release+build22
Comment 9 Nate Graham 2022-12-02 21:49:03 UTC
Great! Is it the same for you, Mal?
Comment 10 Bug Janitor Service 2022-12-17 05:14:26 UTC
Dear Bug Submitter,

This bug has been in NEEDSINFO status with no change for at least
15 days. Please provide the requested information as soon as
possible and set the bug status as REPORTED. Due to regular bug
tracker maintenance, if the bug is still in NEEDSINFO status with
no change in 30 days the bug will be closed as RESOLVED > WORKSFORME
due to lack of needed information.

For more information about our bug triaging procedures please read the
wiki located here:
https://community.kde.org/Guidelines_and_HOWTOs/Bug_triaging

If you have already provided the requested information, please
mark the bug as REPORTED so that the KDE team knows that the bug is
ready to be confirmed.

Thank you for helping us make KDE software even better for everyone!
Comment 11 Bug Janitor Service 2023-01-01 05:21:00 UTC
This bug has been in NEEDSINFO status with no change for at least
30 days. The bug is now closed as RESOLVED > WORKSFORME
due to lack of needed information.

For more information about our bug triaging procedures please read the
wiki located here:
https://community.kde.org/Guidelines_and_HOWTOs/Bug_triaging

Thank you for helping us make KDE software even better for everyone!