SUMMARY
I just updated my system, which included a plasma-shell update, amongst many others. I can log in, all my apps start and work correctly, but the taskbar is missing, the desktop is unresponsive (e.g. context menu doesn't open at all) and the plasma-shell process is spinning at 100% CPU.

SOFTWARE/OS VERSIONS
Linux: Arch Linux, kernel 6.17.3-arch2-1
KDE Plasma Version: 6.4.5, running on X.Org X Server 1.21.1.18
KDE Frameworks Version: 6.19.0
Qt Version: 6.10.0
HW: ThinkPad P14s Gen 5 (AMD) 21ME
GPU/drivers: AMD Radeon 780M Graphics (radeonsi, phoenix, LLVM 20.1.8, DRM 3.64, 6.17.3-arch2-1)

ADDITIONAL INFORMATION
The main thread of `/usr/bin/plasmashell --no-respawn` is spamming futex syscalls:

futex(0x7feadc006e38, FUTEX_WAIT_BITSET, 2, NULL, FUTEX_BITSET_MATCH_ANY) = -1 EAGAIN (Resource temporarily unavailable) <0.000009>

Stacktrace during the futex() call:

#0  0x00007feaf1f1876d in syscall () from /usr/lib/libc.so.6
#1  0x00007feae1712b7b in ?? () from /usr/lib/libgallium-25.2.4-arch1.2.so
#2  0x00007feae1720163 in ?? () from /usr/lib/libgallium-25.2.4-arch1.2.so
#3  0x00007feae172138c in ?? () from /usr/lib/libgallium-25.2.4-arch1.2.so
#4  0x00007feae1ba986b in ?? () from /usr/lib/libgallium-25.2.4-arch1.2.so
#5  0x00007feae1baa108 in ?? () from /usr/lib/libgallium-25.2.4-arch1.2.so
#6  0x00007feae19664b2 in ?? () from /usr/lib/libgallium-25.2.4-arch1.2.so
#7  0x00007feae12dbc32 in ?? () from /usr/lib/libgallium-25.2.4-arch1.2.so
#8  0x00007feae12dd772 in ?? () from /usr/lib/libgallium-25.2.4-arch1.2.so
#9  0x00007feae123e2d0 in ?? () from /usr/lib/libgallium-25.2.4-arch1.2.so
#10 0x00007feaea3d1a14 in ?? () from /usr/lib/libGLX_mesa.so.0
#11 0x00007feaea3d40a9 in ?? () from /usr/lib/libGLX_mesa.so.0
#12 0x00007feaeb8edb7e in ?? () from /usr/lib/qt6/plugins/xcbglintegrations/libqxcb-glx-integration.so
#13 0x00007feaf339cf1e in QOpenGLContext::destroy() () from /usr/lib/libQt6Gui.so.6
#14 0x00007feaf339cf9f in QOpenGLContext::~QOpenGLContext() () from /usr/lib/libQt6Gui.so.6
#15 0x000055afc77f1d5b in vendorIsNVidia () at /usr/src/debug/plasma-workspace/plasma-workspace-6.4.5/shell/panelview.cpp:154
#16 operator() (__closure=<optimized out>) at /usr/src/debug/plasma-workspace/plasma-workspace-6.4.5/shell/panelview.cpp:168
#17 PanelView::isUnsupportedEnvironment (this=0x55aff9253b50) at /usr/src/debug/plasma-workspace/plasma-workspace-6.4.5/shell/panelview.cpp:169
#18 0x000055afc77f4e3e in PanelView::defaultFloating (this=0x55aff9253b50) at /usr/src/debug/plasma-workspace/plasma-workspace-6.4.5/shell/panelview.cpp:175
#19 PanelView::restore (this=0x55aff9253b50) at /usr/src/debug/plasma-workspace/plasma-workspace-6.4.5/shell/panelview.cpp:905
#20 0x000055afc786d5c8 in ?? ()
#21 0x0000000000000000 in ?? ()

The line where the faulty QOpenGLContext destructor is called is:

    149          QOpenGLFunctions funcs(&context);
    150          const QString vendor = QString::fromLocal8Bit(reinterpret_cast<const char *>(funcs.glGetString(GL_VENDOR)));
    151          return vendor.contains(u"NVIDIA", Qt::CaseInsensitive);
    152      }
    153      return false;
--> 154  }
    155

I tried restarting plasma-shell, but it didn't help, it always hangs.
FYI: Unfortunately I'm unable to find debug symbols for libgallium; neither https://debuginfod.archlinux.org nor https://debuginfod.elfutils.org/ seems to have them.
Sounds bad, and also thanks for the excellent reporting and debugging done so far. However the vendorIsNVidia() code in panelview.cpp hasn't changed in over a year and a half. Here's what it's doing:

static bool vendorIsNVidia()
{
    QOffscreenSurface surface;
    surface.create();
    [now it hangs for you]

If something is breaking in here, I suspect an upstream graphics driver or Qt change has caused a regression.

Out of curiosity, does the same thing happen in a Plasma Wayland session? For what it's worth, I'm using the same GPU and an Arch-based distro (KDE Linux) that provides the same kernel and driver versions, and don't see this issue on Wayland.
Thanks for the quick reply! Yes, the bug doesn't trigger on Wayland, but I'm not sure that's meaningful for debugging: `vendorIsNVidia()` is not called at all when not running on Xorg, see https://github.com/KDE/plasma-workspace/blob/v6.4.5/shell/panelview.cpp#L168. Unfortunately I'm forced to use Xorg, because VMware Workstation doesn't support Wayland (and a few other apps I use also struggle with it).
(In reply to Nate Graham from comment #2)
> static bool vendorIsNVidia()
> {
>     QOffscreenSurface surface;
>     surface.create();
>     [now it hangs for you]

Btw, I think it actually hangs later, during the destructor call at function return (https://github.com/KDE/plasma-workspace/blob/v6.4.5/shell/panelview.cpp#L154), because the backtrace shows it hanging inside the QOpenGLContext destructor:

    153      return false;
--> 154  }
    155
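To illustrate why the debugger attributes the hang to the closing brace on line 154 rather than to a statement in the function body, here is a minimal sketch in plain C++. It does not use Qt: `Tracked` and `vendorCheckSketch` are hypothetical stand-ins (not names from plasma-workspace) for the RAII objects and the overall shape of `vendorIsNVidia()`, and the exact body of the real function is an assumption beyond the lines quoted in this report.

```cpp
#include <string>
#include <utility>
#include <vector>

// Records the order of events so it can be inspected afterwards.
static std::vector<std::string> events;

// Hypothetical stand-in for an RAII object such as QOffscreenSurface
// or QOpenGLContext: it logs its own construction and destruction.
struct Tracked {
    std::string name;
    explicit Tracked(std::string n) : name(std::move(n))
    {
        events.push_back("create " + name);
    }
    ~Tracked() { events.push_back("destroy " + name); }
};

// Assumed shape of the function under discussion: two locals, a return
// inside a branch, and a fallback return at the end.
static bool vendorCheckSketch(bool contextUsable)
{
    Tracked surface("surface");
    Tracked context("context");
    if (contextUsable) {
        events.push_back("query vendor");
        return false; // the return value is computed before any destructor runs
    }
    return false;
} // locals are destroyed here, in reverse order: context first, then surface
```

Calling `vendorCheckSketch(true)` records the events "create surface", "create context", "query vendor", "destroy context", "destroy surface". Whichever `return` is taken, the destructors run as the function's scope exits, which is consistent with the backtrace placing `QOpenGLContext::~QOpenGLContext()` at panelview.cpp:154.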
Can you attach a full backtrace? In particular, one that includes the other threads.
Setting status, pending a reply
(In reply to David Redondo from comment #5)
> Can you attach a full backtrace? IN particular with the other threads.

Sure, although see below. Also, is there anything you need besides the other threads? I can't find symbols for libgallium (see above), so I'm not sure how to provide anything more, unless you are interested in assembly and register values at some specific points.

Current status: I switched the Breeze theme while debugging (didn't help), then switched to Wayland (helped, but the freezing code is X11-only), then switched back to Xorg, and this time it didn't hang. I'm using this opportunity to catch up with work while my desktop works again; after that I'll try to reproduce the hang and provide a full stacktrace. Maybe it doesn't always trigger. I'll try logging out and in multiple times and check whether it starts to reproduce again.
I tried multiple times and (un)fortunately I'm not able to reproduce the hang anymore. I'd like to be able to provide more debug information when it happens again. How should I do that? In particular, do you have any ideas how I could get the symbols for libgallium on Arch? They're missing from all the symbol servers I know.
Given that the issue was in the graphics drivers, and that you're on Arch which rapidly updates packages (including the graphics drivers), and that you can't reproduce it anymore, I think it's quite likely this was a driver bug that later got fixed. Thanks for your diligence here!