Bug 460513

Summary: Plasmashell crashes on Wayland
Product: [Plasma] plasmashell
Reporter: space9301
Component: generic-crash
Assignee: Plasma Bugs List <plasma-bugs>
Status: RESOLVED DUPLICATE
Severity: crash
CC: dion, jason, kdebugs.20.orzelf, nate, postix
Priority: NOR
Version: 5.26.0
Target Milestone: 1.0
Platform: Arch Linux
OS: Linux
Latest Commit:
Version Fixed In:
Sentry Crash Report:
Attachments: Output of journalctl /usr/bin/kwin_wayland, including crashes
Output of journalctl /usr/bin/plasmashell, including crashes

Description space9301 2022-10-16 01:11:02 UTC
Created attachment 152873
Output of journalctl /usr/bin/kwin_wayland, including crashes

SUMMARY


STEPS TO REPRODUCE
1. Unknown (I played some Steam games roughly a few hours before this happened)

OBSERVED RESULT
Plasma shell crashes and fails to reload.
The first time, it stayed open as long as I didn't use the application launcher. I relaunched it manually with `kstart plasmashell`. Eventually it stopped working at all and only printed:
```Omitting both --window and --windowclass arguments is not recommended
The Wayland connection broke. Did the Wayland compositor die?```

(tangent/another annoyance: the application launcher seems to remain open whenever I launch anything, needing to be manually closed)

I then killed kwin_wayland. It restarted automatically.
The second time plasmashell crashed, it relaunched automatically but quickly died.

At the time of the crash, `journalctl /usr/bin/kwin_wayland` shows:
```Oct 16 01:05:02 archlinux kwin_wayland_wrapper[1962]: Data too big for buffer (4108 > 4096).
Oct 16 01:05:02 archlinux kwin_wayland_wrapper[1962]: error in client communication (pid 66420)
```
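
For anyone triaging a similar log, the two messages quoted above can be pulled out of a saved journal, along with the PID of the client that kwin_wayland complained about. This is only a rough sketch: it assumes the journal was first written to a file, and the helper names `wayland_buffer_errors` and `failing_client_pids` are made up for illustration, not KDE or systemd tools.

```shell
# Hypothetical helpers, not part of any KDE or systemd tool.

# Print journal lines that report an oversized message or a client
# communication error. Reads the files given as arguments, or stdin.
wayland_buffer_errors() {
    grep -E 'Data too big for buffer|error in client communication' "$@"
}

# Print only the client PIDs named in "error in client communication" lines.
failing_client_pids() {
    sed -n 's/.*error in client communication (pid \([0-9][0-9]*\)).*/\1/p' "$@"
}

# Usage (kwin.log is an example file name):
#   journalctl /usr/bin/kwin_wayland > kwin.log
#   wayland_buffer_errors kwin.log
#   failing_client_pids kwin.log
```

The reported PID can then be compared with `pidof plasmashell` while the session is still alive, to check which client was disconnected.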

EXPECTED RESULT
Plasmashell doesn't die.
And, if it does die, I can restart it manually as on X11.
This is extremely frustrating as I have to restart my entire session to get the taskbar/launcher back.

SOFTWARE/OS VERSIONS
Linux/KDE Plasma: 
(available in About System)
KDE Plasma Version: 
KDE Frameworks Version: 
Qt Version: 

ADDITIONAL INFORMATION
This is my first bug report. It will be incomplete/imperfect due to the difficulty of writing anything. I don't know how to attach a backtrace yet.
While writing, I also seem to have lost alt-tab functionality.
Attaching the output of `journalctl /usr/bin/kwin_wayland`. I also have the output for plasmashell, but I can't seem to upload two attachments.
Comment 1 space9301 2022-10-16 01:12:29 UTC
Created attachment 152874
Output of journalctl /usr/bin/plasmashell, including crashes
Comment 2 space9301 2022-10-16 02:44:25 UTC
for completeness/readability:

Operating System: Arch Linux
KDE Plasma Version: 5.26.0
KDE Frameworks Version: 5.99.0
Qt Version: 5.15.6
Kernel Version: 6.0.1-arch2-1 (64-bit)
Graphics Platform: Wayland
Comment 3 Jason Playne 2022-10-16 11:50:42 UTC
quick question - when kwin_wayland crashes, does it just leave you with a black screen + movable cursor?
Comment 4 space9301 2022-10-16 18:18:15 UTC
(In reply to Jason Playne from comment #3)
> quick question - when kwin_wayland crashes, does it just leave you with a
> black screen + movable cursor?

To clarify, kwin_wayland didn't crash - I killed it manually.
But yes, I just got a black screen + movable cursor. (testing this now after killing plasmashell, I get the same)
Comment 5 space9301 2022-10-17 00:09:57 UTC
I managed to generate a backtrace by recompiling the plasma-workspace package with options=(debug !strip), and using
gdb -p $(pidof plasmashell) --command=debug-script

debug-script:
```set logging file plasmashell.gdb
set logging enabled

break checkWaylandError(wl_display*)
continue
# backtrace
bt

# generate coredump
gcore
```

(I identified the source of the message "The Wayland connection broke. Did the Wayland compositor die?" with `strings --print-file-name *.so* | grep -i 'compositor die?'`, traced it to libQt5WaylandClient.so, and downloaded+searched the source for that string. 
The function is https://github.com/qt/qtwayland/blob/2303ee38ead0a4eafa9f8af629fd8495c45d1442/src/client/qwaylanddisplay.cpp#L66
static void checkWaylandError(struct wl_display *display))


Thread 1 "plasmashell" hit Breakpoint 1, checkWaylandError (display=0x55b3ab934160) at /usr/src/debug/qtwayland/src/client/qwaylanddisplay.cpp:91
91      {
#0  checkWaylandError(wl_display*) (display=0x55b3ab934160) at /usr/src/debug/qtwayland/src/client/qwaylanddisplay.cpp:91
#1  0x00007fb211122b7c in QtWaylandClient::EventThread::readAndDispatchEvents() (this=<optimized out>)
    at /usr/src/debug/qtwayland/src/client/qwaylanddisplay.cpp:141
#2  QtWaylandClient::QWaylandDisplay::flushRequests() (this=<optimized out>) at /usr/src/debug/qtwayland/src/client/qwaylanddisplay.cpp:419
#3  0x00007fb2106b0520 in QObject::event(QEvent*) () at /usr/lib/libQt5Core.so.5
#4  0x00007fb211378b1c in QApplicationPrivate::notify_helper(QObject*, QEvent*) () at /usr/lib/libQt5Widgets.so.5
#5  0x00007fb21068cb88 in QCoreApplication::notifyInternal2(QObject*, QEvent*) () at /usr/lib/libQt5Core.so.5
#6  0x00007fb21068d693 in QCoreApplicationPrivate::sendPostedEvents(QObject*, int, QThreadData*) () at /usr/lib/libQt5Core.so.5
#7  0x00007fb2106d3728 in  () at /usr/lib/libQt5Core.so.5
#8  0x00007fb20ed1981b in g_main_context_dispatch () at /usr/lib/libglib-2.0.so.0
#9  0x00007fb20ed6fec9 in  () at /usr/lib/libglib-2.0.so.0
#10 0x00007fb20ed180d2 in g_main_context_iteration () at /usr/lib/libglib-2.0.so.0
#11 0x00007fb2106d750c in QEventDispatcherGlib::processEvents(QFlags<QEventLoop::ProcessEventsFlag>) () at /usr/lib/libQt5Core.so.5
#12 0x00007fb21068532c in QEventLoop::exec(QFlags<QEventLoop::ProcessEventsFlag>) () at /usr/lib/libQt5Core.so.5
#13 0x00007fb21068fe59 in QCoreApplication::exec() () at /usr/lib/libQt5Core.so.5
#14 0x000055b3ab3521a3 in main(int, char**) (argc=<optimized out>, argv=<optimized out>) at /usr/src/debug/plasma-workspace-5.26.0/shell/main.cpp:233
warning: Memory read failed for corefile section, 4096 bytes at 0xffffffffff600000.
Saved corefile core.158040

I'm not sure how useful this is.

I also recompiled+installed the package qt5-wayland, which may explain the source file in the backtrace
#0  checkWaylandError(wl_display*) (display=0x55b3ab934160) at /usr/src/debug/qtwayland/src/client/qwaylanddisplay.cpp:91
instead of a binary
#3  0x00007fb2106b0520 in QObject::event(QEvent*) () at /usr/lib/libQt5Core.so.5

I don't know which way is preferred for debugging.
Comment 6 Jason Playne 2022-10-17 00:20:58 UTC
*** Bug 460336 has been marked as a duplicate of this bug. ***
Comment 7 space9301 2022-10-17 03:09:57 UTC
(In reply to Jason Playne from comment #6)
> *** Bug 460336 has been marked as a duplicate of this bug. ***

I'm not sure this is a duplicate. For me, my programs remained open -- only plasmashell crashed. They were only lost when I killed kwin_wayland. There were no segfaults in either kwin_wayland or plasmashell.

To reproduce the black screen with movable cursor:
- killall plasmashell    (the screen is now black, as the desktop background is handled by plasmashell, but all the programs are open. In the original bug, plasmashell failed to relaunch)
- ps $(pgrep kwin_wayland)
    PID TTY      STAT   TIME COMMAND
 220965 ?        Ssl    0:00 /usr/bin/kwin_wayland_wrapper --xwayland
 237394 ?        Sl     0:02 /usr/bin/kwin_wayland --wayland-fd 7 --socket wayland-0 --xwayland-fd 8 --xwayland-fd 9
- killall kwin_wayland    (the screen is black, and all the programs are gone, but I can still move my cursor and launch a terminal with ctrl-alt-t)
- ps $(pgrep kwin_wayland)
    PID TTY      STAT   TIME COMMAND
 220965 ?        Ssl    0:00 /usr/bin/kwin_wayland_wrapper --xwayland
 238586 ?        Sl     0:02 /usr/bin/kwin_wayland --wayland-fd 7 --socket wayland-0 --xwayland-fd 8 --xwayland-fd 9 
(kwin_wayland has automatically been restarted. Killing kwin_wayland_wrapper gets me to the login screen)
- kstart plasmashell   (I can now use my session again)

Sorry if the original bug description was misleading. "Output of journalctl /usr/bin/kwin_wayland, including crashes" should have been "including crashes of plasmashell".
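
The PID comparison in the steps above can be scripted. A minimal sketch, assuming the default `ps` column layout shown in the listings (PID TTY STAT TIME COMMAND) and that the compositor binary is `/usr/bin/kwin_wayland`; `compositor_pid` is a made-up helper name.

```shell
# Hypothetical helper: print the PID of the kwin_wayland compositor from
# `ps` output on stdin, skipping the kwin_wayland_wrapper parent process.
# Assumes the column layout PID TTY STAT TIME COMMAND.
compositor_pid() {
    awk '$5 == "/usr/bin/kwin_wayland" { print $1 }'
}

# Usage: record the PID, kill the compositor, then compare.
#   before=$(ps $(pgrep kwin_wayland) | compositor_pid)
#   killall kwin_wayland; sleep 2
#   after=$(ps $(pgrep kwin_wayland) | compositor_pid)
#   [ "$before" != "$after" ] && echo "kwin_wayland was restarted"
```

If the two PIDs differ, the compositor was restarted by its wrapper and did not survive the kill.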
Comment 8 Jason Playne 2022-10-17 06:22:55 UTC
(In reply to space9301 from comment #7)
> (In reply to Jason Playne from comment #6)
> > *** Bug 460336 has been marked as a duplicate of this bug. ***
> 
> I'm not sure this is a duplicate. For me, my programs remained open -- only
> plasmashell crashed. They were only lost when I killed kwin_wayland. There
> were no segfaults in either kwin_wayland or plasmashell.
> 
> To reproduce the black screen with movable cursor:
> - killall plasmashell    (the screen is now black, as the desktop background
> is handled by plasmashell, but all the programs are open. In the original
> bug, plasmashell failed to relaunch)
> - ps $(pgrep kwin_wayland)
>     PID TTY      STAT   TIME COMMAND
>  220965 ?        Ssl    0:00 /usr/bin/kwin_wayland_wrapper --xwayland
>  237394 ?        Sl     0:02 /usr/bin/kwin_wayland --wayland-fd 7 --socket
> wayland-0 --xwayland-fd 8 --xwayland-fd 9
> - killall kwin_wayland    (the screen is black, and all the programs are
> gone, but I can still move my cursor and launch a terminal with ctrl-alt-t)
> - ps $(pgrep kwin_wayland)
>     PID TTY      STAT   TIME COMMAND
>  220965 ?        Ssl    0:00 /usr/bin/kwin_wayland_wrapper --xwayland
>  238586 ?        Sl     0:02 /usr/bin/kwin_wayland --wayland-fd 7 --socket
> wayland-0 --xwayland-fd 8 --xwayland-fd 9 
> (kwin_wayland has automatically been restarted. Killing kwin_wayland_wrapper
> gets me to the login screen)
> - kstart plasmashell   (I can now use my session again)
> 
> Sorry if the original bug description was misleading. "Output of journalctl
> /usr/bin/kwin_wayland, including crashes" should have been "including
> crashes of plasmashell".

Your description matches what I am seeing (better than I am able to put into words).

I think the segfaults and backtraces I have seen are incidental.

I believe what I see is what you are describing.

I suddenly get a black screen with a movable cursor. kwin_wayland is still running, and there are lots of complaints from the apps about losing their connection to the Wayland server.

I *think* the apps are still alive - I work around the problem by restarting sddm (the big hammer approach).
Comment 9 space9301 2022-10-17 08:28:47 UTC
Yes, it does look similar, though the cause is different. A few things:
- you reported a segfault in kwin_wayland, i.e. it tried to access invalid memory, then crashed. This is not incidental to what you experienced; it is the bug.
- kwin_wayland is not still running, it restarted. Note the different process IDs, 237394 vs 238586
- maybe other programs are still running after the crash, but I don't see them either. You would have to take a process tree before the crash with something like `ps axjf` and log to a file, then compare with after.

I'm happy to try to help you to reproduce your bug, though I do have limited time to spend on this now. I'll move further discussion over there.
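
The before/after comparison suggested above could look roughly like this. It is only a sketch: the function names and file names are made up, and it assumes the procps `ps axjf` layout, where the PID is the second column and the first line is a header.

```shell
# Hypothetical helpers for comparing two process-tree snapshots.

# Save the current process tree to the file named by $1.
snapshot_processes() {
    ps axjf > "$1"
}

# Print the lines of snapshot $1 whose PID no longer appears in snapshot $2.
# The "after" file is read first to collect its PIDs, then the "before"
# file is filtered against that set. Header lines are skipped.
gone_processes() {
    awk 'FNR == 1 { next }
         FILENAME == ARGV[1] { seen[$2] = 1; next }
         !($2 in seen)' "$2" "$1"
}

# Usage (file names are examples):
#   snapshot_processes before.txt     # while the session still works
#   snapshot_processes after.txt      # after the black screen appears
#   gone_processes before.txt after.txt
```

Comparing by PID avoids the noise a plain `diff` would show from the TIME column changing between snapshots.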
Comment 10 Jason Playne 2022-10-24 02:23:21 UTC
Adding some notes (to see if my usage patterns jibe with yours).

I have 3 virtual desktops. When I primarily use just one of them (#3), the issue does not manifest.

When I regularly switch between them, the problem manifests (but not on the switch itself).

On each desktop I have
#1 Java XWayland (jetbrains ide) + Opera Web Browser (Wayland)
#2 Vivaldi (Wayland) and Teams (X11, Electron)
#3 Discord (X11) and Firefox (X11)

I am wondering if this might be caused by Chromium on Wayland?
Comment 11 postix 2024-05-27 12:45:10 UTC
Marking as a duplicate because of:
> I identified the source of the message "The Wayland connection broke. Did the Wayland compositor die?"

*** This bug has been marked as a duplicate of bug 392376 ***