Bug 479094 - kwin wayland randomly freezes completely when opening task switcher / alt tab
Status: RESOLVED FIXED
Alias: None
Product: kwin
Classification: Plasma
Component: wayland-generic (other bugs)
Version First Reported In: 5.91.0
Platform: Arch Linux Linux
Importance: NOR major
Target Milestone: ---
Assignee: KWin default assignee
URL:
Keywords: qt6, wayland-only
Depends on:
Blocks:
 
Reported: 2023-12-27 19:46 UTC by Leia
Modified: 2024-02-21 20:27 UTC (History)

See Also:
Latest Commit:
Version Fixed/Implemented In:
Sentry Crash Report:


Attachments
Kwin log from start until it crashed (182.42 KB, text/plain)
2023-12-27 20:55 UTC, Leia
journalctl of plasma-kwin_wayland (407.71 KB, text/x-log)
2023-12-28 12:10 UTC, Andrej Halveland
System logs (953.24 KB, text/plain)
2024-01-17 13:53 UTC, Steve Cossette

Description Leia 2023-12-27 19:46:08 UTC
STEPS TO REPRODUCE
1. Open the task switcher with alt + tab
2. KWin sometimes freezes completely

OBSERVED RESULT
Kwin freezes with graphical corruption (like random parts of each open window flickering on the screen) and eventually goes to a black screen

EXPECTED RESULT
Kwin should not freeze

SOFTWARE/OS VERSIONS
Operating System: Arch Linux 
KDE Plasma Version: 5.91.90
KDE Frameworks Version: 5.248.0
Qt Version: 6.6.1
Kernel Version: 6.6.8-arch1-1 (64-bit)
Graphics Platform: Wayland
Processors: 4 × Intel® Core™ i3-3240 CPU @ 3.40GHz
Graphics Processor: Mesa Intel® HD Graphics 2500 

ADDITIONAL INFORMATION
I can reproduce this frequently (about every 15 minutes in normal use) on beta 2 and on git master built with kdesrc-build.
I daily-drove beta 1 from its release until beta 2 and it never happened, so this must be a recent regression.

When it happens, it spams the system logs with this until KWin is killed or restarted:

kwin_scene_opengl: 0x1: GL_INVALID_OPERATION in glDrawArrays
kwin_scene_opengl: 0x1: GL_INVALID_OPERATION in glDrawArrays
kwin_scene_opengl: 0x1: GL_INVALID_OPERATION in glDrawArrays
kwin_scene_opengl: 0x1: GL_INVALID_OPERATION in glVertexAttribPointer(no array object bound)
kwin_scene_opengl: 0x1: GL_INVALID_OPERATION in glVertexAttribPointer(no array object bound)
Comment 1 Zamundaaa 2023-12-27 20:30:32 UTC
Please attach the full output of
> journalctl --user-unit plasma-kwin_wayland --boot 0
after reproducing the issue
Comment 2 Leia 2023-12-27 20:55:29 UTC
Created attachment 164490 [details]
Kwin log from start until it crashed
Comment 3 Leia 2023-12-27 20:58:54 UTC
(I said "until it crashed", but I meant until I killed it from a TTY with SIGKILL.)
Comment 4 Zamundaaa 2023-12-27 22:57:54 UTC
I would've expected a GPU reset or *something,* but there's no clear trigger in the log. KWin creates and binds one VAO after creating the OpenGL context, so I can't think of any reason for why this would happen.
Comment 5 Andrej Halveland 2023-12-28 12:10:54 UTC
Created attachment 164508 [details]
journalctl of plasma-kwin_wayland

I just experienced something pretty similar and with the same errors in logs.
I was in the new overview. I closed a desktop and then quickly went to the desktop with the four-finger swipe down, but before it could get there, the screen started flickering. When I tried going to another virtual desktop using the gestures, the screen went black.

I reproduced it twice in a row, but as soon as I pulled out my camera I couldn't reproduce it anymore... I will try my best.

This is on Plasma 6 Beta 2.

SOFTWARE/OS VERSIONS
Operating System: Arch Linux
KDE Plasma Version: 5.91.0
KDE Frameworks Version: 5.247.0
Qt Version: 6.7.0 (beta)
Kernel Version: 6.7.0-rc6
Graphics Platform: Wayland
Graphics Drivers: mesa 23.3.1
Processors: AMD Ryzen 9 4900HS
Memory:  24GB of RAM
Graphics Processor: Integrated: Vega 8
Manufacturer: ASUS (Zephyrus G15 GA502IV)
Comment 6 Leia 2023-12-28 16:54:27 UTC
Yeah, I think it's the same issue.
I rarely use the overview, so that's probably why I never reproduced it there.
Comment 7 hexchain 2024-01-04 04:02:08 UTC
The same issue happened to me twice: the first time while moving the mouse pointer to the taskbar, and the second when I invoked the Tab Box.

System information:
Software: Plasma 5.91.0, Qt 6.7 beta 1
GPU: AMD Radeon 680M (Rembrandt)
EGL info:
OpenGL core profile vendor: AMD
OpenGL core profile renderer: AMD Radeon Graphics (radeonsi, rembrandt, LLVM 16.0.6, DRM 3.54, 6.6.9-arch1-1)
OpenGL core profile version: 4.6 (Core Profile) Mesa 23.3.2-arch1.2
OpenGL core profile shading language version: 4.60
Comment 8 Johannes Penßel 2024-01-17 10:43:10 UTC
Even though I used to hit this bug several times per day, it has not occurred on my machine since I rebuilt kwin on Monday evening. I'm starting to suspect that some commit between ffb13a1c and 49ced6dc might have fixed this already. Can anyone confirm this? I'm building KDE off the branches targeting 6.0.

Operating System: Gentoo
KDE Plasma Version: 5.92.90
KDE Frameworks Version: 5.249.0
Qt Version: 6.6.1
Kernel Version: 6.7.0-gentoo (64-bit)
Graphics Platform: Wayland
Processors: 16 × 12th Gen Intel® Core™ i5-1240P
Graphics Processor: Mesa Intel® Graphics
Comment 9 Steve Cossette 2024-01-17 13:53:18 UTC
Created attachment 164979 [details]
System logs

System Logs (kwin_wayland filtered)
Comment 10 Steve Cossette 2024-01-17 13:54:31 UTC
I'm getting the same thing on RC1. Can't say if it was fixed in a future commit though as ours is built from the release binaries.
Comment 11 Leia 2024-01-17 14:14:28 UTC
I rebuilt KWin from master yesterday and also can't reproduce it anymore
Comment 12 Leia 2024-01-17 14:18:49 UTC
Maybe https://invent.kde.org/plasma/kwin/-/commit/4f8c941bff507bf1e1baaf9ab13e141a8e61a316 also fixed this? It fixed a full crash on X11 when alt-tabbing.
Comment 13 Leia 2024-01-17 18:43:07 UTC
nvm it just happened again :(
Comment 14 Johannes Penßel 2024-01-17 20:19:47 UTC
Can confirm, shouldn't have jinxed it :( sorry for the noise.
Comment 15 Nate Graham 2024-01-19 05:21:11 UTC
Similar to Bug 479250.

Folks who are affected, does it go away if you switch to a different Task Switcher style, like "Large Icons" or "Sidebar"?
Comment 16 Oded Arbel 2024-01-25 00:03:16 UTC
I also experience frequent (daily) kwin_wayland complete freezes - though I'm unsure if it is related to this bug:
- I don't believe it is related to task switcher or overview (I frankly almost never use either)
- when kwin freezes, the main seat is completely dead - I can't even switch to a VT. I can only recover by SSHing to my machine and issuing kill -9.

I'm running Neon testing - up to date as of today - but the issue has been happening since RC1 dropped. I managed to get a thread dump of the frozen kwin, and I can see that the main thread (that pegs the CPU) is running this:

---8<---
#0  0x00007fb2e818445e in _mm_cmpeq_epi16(long long __vector(2), long long __vector(2)) (__B=..., __A=...) at ./src/corelib/text/qstring.cpp:599
#1  operator() (__closure=<synthetic pointer>, offset=0) at ./src/corelib/text/qstring.cpp:626
#2  ucstrncmp_sse2<(<unnamed>::StringComparisonMode)0, char16_t> (l=<optimized out>, b=0x559bd2085280 u"/usr/share/plasma/desktoptheme/default/widgets/scrollbar.svgz", a=<optimized out>) at ./src/corelib/text/qstring.cpp:634
#3  ucstrncmp<(<unnamed>::StringComparisonMode)0> (l=<optimized out>, b=0x559bd2085280 u"/usr/share/plasma/desktoptheme/default/widgets/scrollbar.svgz", a=<optimized out>) at ./src/corelib/text/qstring.cpp:1294
#4  ucstreq<char16_t> (blen=<optimized out>, b=0x559bd2085280 u"/usr/share/plasma/desktoptheme/default/widgets/scrollbar.svgz", alen=<optimized out>, a=<optimized out>) at ./src/corelib/text/qstring.cpp:1364
#5  QtPrivate::equalStrings(QStringView, QStringView) (lhs=..., rhs=...) at ./src/corelib/text/qstring.cpp:1402
#6  0x00007fb2eaf91844 in  () at /lib/x86_64-linux-gnu/libKF6Svg.so.6
#7  0x00007fb2eaf8634a in  () at /lib/x86_64-linux-gnu/libKF6Svg.so.6
#8  0x00007fb2eaf86bd9 in KSvg::Svg::hasElement(QStringView) const () at /lib/x86_64-linux-gnu/libKF6Svg.so.6
#9  0x00007fb2eaf7c4fd in  () at /lib/x86_64-linux-gnu/libKF6Svg.so.6
#10 0x00007fb2eaf7dc75 in  () at /lib/x86_64-linux-gnu/libKF6Svg.so.6
#11 0x00007fb2c13eb325 in  () at /usr/lib/x86_64-linux-gnu/qt6/qml/org/kde/ksvg/libcorebindingsplugin.so
#12 0x00007fb2ea36b051 in QQuickItem::setSize(QSizeF const&) (this=this@entry=0x559bd167ec60, size=...) at ./src/quick/items/qquickitem.cpp:7576
#13 0x00007fb2e1d9a8ae in QQuickScrollBarPrivate::resizeContent() (this=<optimized out>) at ./src/quicktemplates/qquickscrollbar.cpp:225
#14 0x00007fb2e1d9538e in QQuickScrollBar::setSize(double) (this=0x559bd1596c10, size=0.95266567015445935) at ./src/quicktemplates/qquickscrollbar.cpp:420
#15 0x00007fb2e8228a8b in doActivate<false>(QObject*, int, void**) (sender=0x559bcfe32210, signal_index=6, argv=0x7ffd868d4750) at ./src/corelib/kernel/qobject.cpp:4033
#16 0x00007fb2ea348e17 in QQuickFlickableVisibleArea::heightRatioChanged(double) (this=<optimized out>, _t1=<optimized out>) at ./obj-x86_64-linux-gnu/src/quick/Quick_autogen/include/moc_qquickflickable_p_p.cpp:304
#17 0x00007fb2ea12cb58 in QQuickItemViewPrivate::layout() (this=0x559bd2079640) at ./src/quick/items/qquickitemview.cpp:1927
#18 0x00007fb2ea30726b in QQuickWindowPrivate::polishItems() (this=0x559bd23f4230) at ./src/quick/items/qquickwindow.cpp:346
#19 0x00007fb2ea20a09a in QSGGuiThreadRenderLoop::renderWindow(QQuickWindow*) (this=0x559bd1ccb670, window=0x559bd329b830) at ./src/quick/scenegraph/qsgrenderloop.cpp:586
#20 0x00007fb2e87461a8 in QWindow::event(QEvent*) (this=0x559bd329b830, ev=0x7ffd868d4db0) at ./src/gui/kernel/qwindow.cpp:2553
#21 0x00007fb2e95f1b1b in QApplicationPrivate::notify_helper(QObject*, QEvent*) (this=<optimized out>, receiver=0x559bd329b830, e=0x7ffd868d4db0) at ./src/widgets/kernel/qapplication.cpp:3296
#22 0x00007fb2e825fe58 in QCoreApplication::notifyInternal2(QObject*, QEvent*) (receiver=receiver@entry=0x559bd329b830, event=event@entry=0x7ffd868d4db0) at ./src/corelib/kernel/qcoreapplication.cpp:1121
#23 0x00007fb2e826037d in QCoreApplication::sendSpontaneousEvent(QObject*, QEvent*) (receiver=receiver@entry=0x559bd329b830, event=event@entry=0x7ffd868d4db0) at ./src/corelib/kernel/qcoreapplication.cpp:1553
---8<---

Do you think this is related, or should I report a new bug?
Comment 17 hexchain 2024-01-25 05:06:53 UTC
I believe I can still switch VT when this issue happens, and almost every time it happens when kwin tries to show some window thumbnails (hence task switcher/overview).

Do you see a lot of these lines in the log when it's frozen? If yes then it could be related.

kwin_scene_opengl: 0x1: GL_INVALID_OPERATION in glVertexAttribPointer(no array object bound)
kwin_scene_opengl: 0x1: GL_INVALID_OPERATION in glDrawArrays
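
One way to answer this from a saved log is to tally the GL errors by function and to look at which one came first. The following is a minimal Python sketch, assuming the log lines have the `kwin_scene_opengl: 0x...: GL_... in gl...` shape quoted in this report. The function names `tally_gl_errors` and `first_gl_error` are made up for illustration and are not part of KWin or any KDE tool.

```python
import re
from collections import Counter

# Matches the KWin log lines quoted in this report, for example:
#   kwin_scene_opengl: 0x1: GL_INVALID_OPERATION in glDrawArrays
#   kwin_scene_opengl: 0x1: GL_INVALID_OPERATION in glVertexAttribPointer(no array object bound)
# The exact line shape is an assumption based on those quotes.
GL_ERROR = re.compile(r"kwin_scene_opengl: 0x[0-9a-fA-F]+: (GL_[A-Z_]+) in (\w+)")


def tally_gl_errors(lines):
    """Count (error, GL function) pairs found in an iterable of log lines."""
    counts = Counter()
    for line in lines:
        match = GL_ERROR.search(line)
        if match:
            counts[(match.group(1), match.group(2))] += 1
    return counts


def first_gl_error(lines):
    """Return the first (error, GL function) pair in the log, or None."""
    for line in lines:
        match = GL_ERROR.search(line)
        if match:
            return match.group(1), match.group(2)
    return None
```

Both functions accept any iterable of strings, so an open text file containing the saved `journalctl --user-unit plasma-kwin_wayland --boot 0` output can be passed in directly. A log with a large count for `glDrawArrays` or `glVertexAttribPointer` would suggest the same failure as in this report.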
Comment 18 hexchain 2024-01-25 05:08:54 UTC
(In reply to Nate Graham from comment #15)
> Folks who are affected, does it go away if you switch to a different Task
> Switcher style, like "Large Icons" or "Sidebar"?

I switched to "Sidebar" and it hasn't happened so far. But I'm also following git master closely so I'll try to switch back and see.
Comment 19 Oded Arbel 2024-01-25 07:59:40 UTC
(In reply to hexchain from comment #17)
> Do you see a lot of these lines in the log when it's frozen? If yes then it
> could be related.
> 
> kwin_scene_opengl: 0x1: GL_INVALID_OPERATION in glVertexAttribPointer(no
> array object bound)
> kwin_scene_opengl: 0x1: GL_INVALID_OPERATION in glDrawArrays

I do not see messages about GL_INVALID_OPERATION in my kwin logs, so I'll conclude that this isn't the same issue as I'm having.

I will wait to get a few more freezes and compare stack traces, and report a new bug as needed. Thank you.
Comment 20 Nate Graham 2024-01-26 22:17:04 UTC
Leia, does it happen with a switcher that isn't "Thumbnail grid"?

It sounds like we may have multiple issues, but we need to be precise about them so we don't end up with mega bug reports containing descriptions of multiple distinct issues.
Comment 21 Bug Janitor Service 2024-02-10 03:45:51 UTC
Dear Bug Submitter,

This bug has been in NEEDSINFO status with no change for at least
15 days. Please provide the requested information as soon as
possible and set the bug status as REPORTED. Due to regular bug
tracker maintenance, if the bug is still in NEEDSINFO status with
no change in 30 days the bug will be closed as RESOLVED > WORKSFORME
due to lack of needed information.

For more information about our bug triaging procedures please read the
wiki located here:
https://community.kde.org/Guidelines_and_HOWTOs/Bug_triaging

If you have already provided the requested information, please
mark the bug as REPORTED so that the KDE team knows that the bug is
ready to be confirmed.

Thank you for helping us make KDE software even better for everyone!
Comment 22 Leia 2024-02-13 11:01:56 UTC
Sorry for not replying for a few days; I didn't have my computer during that time. I will rebuild master soon to test.
Comment 23 Johannes Penßel 2024-02-18 16:02:22 UTC
I think I found a method to reproduce this issue reliably:

1. press Meta+W to open the overview
2. click on the search bar below the virtual desktop selection

This triggers the exact same behavior as outlined in the original bug description. Since my last comment a month ago, I have not once been able to reproduce this with the task switcher. (I'm still using the new default switcher) 

systemd-journal snippet:

kwin_wayland_wrapper[1417]: kwin_scene_opengl: 0x2: GL_INVALID_OPERATION in glBindVertexArray(non-gen name)
kwin_wayland_wrapper[1417]: kwin_scene_opengl: 0x2: GL_INVALID_OPERATION in glVertexAttribPointer(no array object bound)
kwin_wayland_wrapper[1417]: kwin_scene_opengl: 0x2: GL_INVALID_OPERATION in glVertexAttribPointer(no array object bound)
kwin_wayland_wrapper[1417]: kwin_scene_opengl: 0x2: GL_INVALID_OPERATION in glDrawArrays
kwin_wayland_wrapper[1417]: kwin_scene_opengl: 0x2: GL_INVALID_OPERATION in glVertexAttribPointer(no array object bound)
kwin_wayland_wrapper[1417]: kwin_scene_opengl: 0x2: GL_INVALID_OPERATION in glVertexAttribPointer(no array object bound)
kwin_wayland_wrapper[1417]: kwin_scene_opengl: 0x2: GL_INVALID_OPERATION in glDrawArrays
kwin_wayland_wrapper[1417]: kwin_scene_opengl: 0x2: GL_INVALID_OPERATION in glVertexAttribPointer(no array object bound)
kwin_wayland_wrapper[1417]: kwin_scene_opengl: 0x2: GL_INVALID_OPERATION in glVertexAttribPointer(no array object bound)
kwin_wayland_wrapper[1417]: kwin_scene_opengl: 0x2: GL_INVALID_OPERATION in glDrawArrays
(...)

Linux 6.7.5-gentoo
Mesa 24.0.1 / Iris
Qt 6.6.2
KDE Frameworks master branch
Plasma 6.0 branch
Gear 24.02 branch
Comment 24 Vlad Zahorodnii 2024-02-19 14:25:33 UTC
I cannot reproduce the issue by following those steps
Comment 25 Johannes Penßel 2024-02-19 15:59:00 UTC
(In reply to Vlad Zahorodnii from comment #24)
> I cannot reproduce the issue by following those steps

I just tested a few things. For some weird reason, this seems to be a side effect of using KWIN_FORCE_SW_CURSOR=0 on Intel, and unsetting it resolves the issue completely. I am not affected by the issues described in #474725, so I thought I could get away with using this. When KWIN_FORCE_SW_CURSOR=0 is set, the issue is also present in combination with swrast/llvmpipe. I cannot reproduce this at all on my box with AMD (RDNA2) graphics, but I wonder whether KWIN_FORCE_SW_CURSOR=1 would have made a difference back when I used to encounter this issue on that machine as well. I'm going to try this with a plasma6-rc1 live image later.
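
For anyone unsure whether their running compositor actually has KWIN_FORCE_SW_CURSOR set, the start-up environment of the process can be read from `/proc/<pid>/environ` on Linux. The following is a minimal Python sketch under that assumption. The helper names `parse_environ` and `process_env` are hypothetical, and the lookup only works for processes owned by the current user.

```python
from pathlib import Path


def parse_environ(blob: bytes) -> dict:
    """Parse the NUL-separated KEY=VALUE format used by /proc/<pid>/environ."""
    env = {}
    for entry in blob.split(b"\0"):
        if b"=" in entry:
            key, _, value = entry.partition(b"=")
            env[key.decode(errors="replace")] = value.decode(errors="replace")
    return env


def process_env(process_name: str = "kwin_wayland") -> dict:
    """Return the start-up environment of the first process with that name.

    Linux only. Returns {} if no such process is found or it is not readable.
    """
    for comm in Path("/proc").glob("[0-9]*/comm"):
        try:
            if comm.read_text().strip() == process_name:
                return parse_environ((comm.parent / "environ").read_bytes())
        except OSError:
            continue  # the process exited or belongs to another user
    return {}
```

Calling `process_env().get("KWIN_FORCE_SW_CURSOR")` returns the value the compositor was started with, or `None` if the variable was not set. Note that `/proc/<pid>/environ` reflects the environment at process start, which is the relevant one here.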
Comment 26 Leia 2024-02-20 11:31:34 UTC
I can also confirm that without KWIN_FORCE_SW_CURSOR=0 on Intel, the freeze doesn't happen
Comment 27 Zamundaaa 2024-02-21 18:01:06 UTC
I just managed to trigger this by clicking the search field in the overview effect. It began to close, then rendering broke.

The weird thing is, it starts with
> GL_INVALID_OPERATION in glBindVertexArray(non-gen name)

There's only one place where kwin_wayland uses that function, and it's called immediately after creating the VAO...
Comment 28 hexchain 2024-02-21 18:16:33 UTC
Clicking the search field in Overview triggers this for me as well - but with KWin git master (~1 day old) only, not 6.0 RC2.
Comment 29 Bug Janitor Service 2024-02-21 20:00:24 UTC
A possibly relevant merge request was started @ https://invent.kde.org/plasma/kwin/-/merge_requests/5265
Comment 30 Zamundaaa 2024-02-21 20:07:23 UTC
Git commit 8c3332f6195dcf213c481c28ddbff680f20a20a9 by Xaver Hugl.
Committed on 21/02/2024 at 19:58.
Pushed by zamundaaa into branch 'master'.

opengl/eglcontext: tell Qt when the OpenGL context gets changed

Otherwise, Qt thinks the old context is still current and will do things like
destroying VAOs with KWin's context, which ends up destroying the VAO of the
context and breaks rendering.

M  +4    -0    src/opengl/eglcontext.cpp

https://invent.kde.org/plasma/kwin/-/commit/8c3332f6195dcf213c481c28ddbff680f20a20a9
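
The failure mode described in this commit message can be illustrated with a toy model. The Python sketch below is not KWin or Qt code: every class and method name is hypothetical, and it assumes that two contexts number their VAOs independently, so that both can own a VAO with the same name. It only shows why a toolkit with a stale idea of the current context can delete the compositor's VAO, and why telling the toolkit about the context change avoids that.

```python
class ToyContext:
    """Hypothetical stand-in for one OpenGL context with its own VAO names."""

    def __init__(self):
        self.vaos = set()  # VAO names that exist in this context
        self.bound = 0     # currently bound VAO name, 0 means none
        self.next_name = 1


class ToyDriver:
    """Tracks which context is really current and applies calls to it."""

    def __init__(self):
        self.current = None
        self.errors = []

    def make_current(self, ctx):
        self.current = ctx

    def gen_and_bind_vertex_array(self):
        ctx = self.current
        name = ctx.next_name
        ctx.next_name += 1
        ctx.vaos.add(name)
        ctx.bound = name
        return name

    def delete_vertex_array(self, name):
        ctx = self.current
        ctx.vaos.discard(name)  # unknown names are silently ignored
        if ctx.bound == name:
            ctx.bound = 0       # deleting the bound VAO unbinds it

    def draw(self):
        if self.current.bound == 0:
            self.errors.append("GL_INVALID_OPERATION (no array object bound)")
            return False
        return True


class ToyToolkit:
    """Stand-in for a toolkit that caches which context it thinks is current."""

    def __init__(self, driver):
        self.driver = driver
        self.believed_current = None

    def make_current(self, ctx):
        self.driver.make_current(ctx)
        self.believed_current = ctx

    def context_changed_externally(self):
        self.believed_current = None  # forget the stale belief

    def destroy_vao(self, ctx, name):
        # Skips the context switch when it believes ctx is already current.
        if self.believed_current is not ctx:
            self.make_current(ctx)
        self.driver.delete_vertex_array(name)


def simulate(notify_toolkit):
    driver = ToyDriver()
    toolkit = ToyToolkit(driver)
    compositor_ctx, toolkit_ctx = ToyContext(), ToyContext()

    # Compositor: create and bind one VAO right after context creation.
    driver.make_current(compositor_ctx)
    driver.gen_and_bind_vertex_array()

    # Toolkit: render with its own context; its VAO gets the same name.
    toolkit.make_current(toolkit_ctx)
    toolkit_vao = driver.gen_and_bind_vertex_array()

    # Compositor switches back without going through the toolkit.
    driver.make_current(compositor_ctx)
    if notify_toolkit:
        toolkit.context_changed_externally()

    # Toolkit cleans up, then the compositor renders the next frame.
    toolkit.destroy_vao(toolkit_ctx, toolkit_vao)
    driver.make_current(compositor_ctx)
    return driver.draw(), driver.errors
```

In this model, `simulate(notify_toolkit=False)` ends with the compositor's only VAO deleted and every later draw failing, which resembles the endless GL_INVALID_OPERATION messages in the logs above. With `simulate(notify_toolkit=True)` the toolkit switches to its own context before deleting, and the compositor's VAO survives.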
Comment 31 Zamundaaa 2024-02-21 20:27:37 UTC
Git commit 961a2d70417fe4cb9e053f1348e703fb611065f4 by Xaver Hugl.
Committed on 21/02/2024 at 20:16.
Pushed by zamundaaa into branch 'Plasma/6.0'.

opengl/eglcontext: tell Qt when the OpenGL context gets changed

Otherwise, Qt thinks the old context is still current and will do things like
destroying VAOs with KWin's context, which ends up destroying the VAO of the
context and breaks rendering.


(cherry picked from commit 8c3332f6195dcf213c481c28ddbff680f20a20a9)

M  +4    -0    src/opengl/eglcontext.cpp

https://invent.kde.org/plasma/kwin/-/commit/961a2d70417fe4cb9e053f1348e703fb611065f4