SUMMARY
My screen sometimes freezes when starting an application in Plasma Wayland on an AMD system. The screen stops updating and the mouse cursor also no longer moves. KWin isn't dead, as it seems it's still reacting to input events (I have enabled the logging rule "kwin*.debug=true" to confirm this). I've encountered this issue after launching dolphin, okular, kwrite, thunderbird, wireshark, and systemsettings. These applications are all wayland-native, but this is just an observation, it might not be related at all. It doesn't seem to be a GPU hang as I can still switch between TTY when it happens, and at the same time, a different user session on another TTY is still responsive.

Things I've tried but didn't help:
- Setting KWIN_COMPOSITE=O2ES for kwin_wayland
- Upgrading to mesa 22.2.0 (self-compiled without LTO because of Arch bug FS#76019)

STEPS TO REPRODUCE
1. Start Plasma Wayland
2. Launch some applications (?)
3. ???

SOFTWARE/OS VERSIONS
Operating System: Arch Linux
KDE Plasma Version: 5.25.90
KDE Frameworks Version: 5.98.0
Qt Version: 5.15.6
Kernel Version: 5.19.11-zen1-1-zen (64-bit)
Graphics Platform: Wayland
Processors: 16 × AMD Ryzen 7 PRO 6860Z with Radeon Graphics
Memory: 30.1 GiB of RAM
Graphics Processor: AMD YELLOW_CARP
mesa 22.1.7-1
qt5-base 5.15.6+kde+r177-1
qt5-wayland 5.15.6+kde+r50-1

ADDITIONAL INFORMATION
Tried to kill a frozen kwin_wayland with SIGABRT, here is the stack: http://ix.io/4bHj
Is this a regression in the 5.26 beta?
I don't think so, I've seen the same on 5.25.
Are there any warnings or error messages in the KWin log or dmesg when KWin freezes?
I just tried to reproduce the issue from a fresh reboot. There was no useful message from dmesg, but kwin said something around the time the freeze happened:
kwin_wayland_drm: an error occurred while swapping buffers "EGL_BAD_SURFACE"
I've searched through the journal, and found multiple appearances of this message. It doesn't seem to correlate with the freeze, though:
Sep 16 14:45:47 kwin_wayland[534349]: kwin_wayland_drm: an error occurred while swapping buffers "EGL_BAD_SURFACE"
Sep 16 14:52:52 kwin_wayland[534349]: kwin_wayland_drm: an error occurred while swapping buffers "EGL_BAD_SURFACE"
Sep 16 14:53:19 kwin_wayland[534349]: kwin_wayland_drm: an error occurred while swapping buffers "EGL_BAD_SURFACE"
Sep 16 14:53:28 kwin_wayland[534349]: kwin_wayland_drm: an error occurred while swapping buffers "EGL_BAD_SURFACE"
Sep 16 15:10:53 kwin_wayland[534349]: kwin_wayland_drm: an error occurred while swapping buffers "EGL_BAD_SURFACE"
There was no such message before Sep 16 (I upgraded to 5.25.90 on Sep 15).
> I've seen the same on 5.25.
Thinking again, I have to confess I might be wrong here. It has been two weeks, and I really don't remember whether it has happened before.
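For anyone who wants to compare the error times against the moment their own screen froze, the matching journal entries can be pulled out mechanically. This is only a rough Python sketch: it assumes the journal was exported as plain text in the short format quoted above, and `egl_error_times` and `SAMPLE` are names made up for this example.

```python
import re

# Matches the short journal format quoted above, with or without a hostname
# between the timestamp and the process name.
LINE_RE = re.compile(
    r'^(?P<stamp>\w{3}\s+\d{1,2} \d{2}:\d{2}:\d{2}) '
    r'.*kwin_wayland\[\d+\]: .*"EGL_BAD_SURFACE"'
)


def egl_error_times(lines):
    """Return the timestamp strings of all EGL_BAD_SURFACE journal lines."""
    stamps = []
    for line in lines:
        match = LINE_RE.match(line)
        if match:
            stamps.append(match.group("stamp"))
    return stamps


# The first two lines are copied from the journal excerpt above;
# the third is an invented non-matching line for contrast.
SAMPLE = [
    'Sep 16 14:45:47 kwin_wayland[534349]: kwin_wayland_drm: an error occurred while swapping buffers "EGL_BAD_SURFACE"',
    'Sep 16 14:52:52 kwin_wayland[534349]: kwin_wayland_drm: an error occurred while swapping buffers "EGL_BAD_SURFACE"',
    'Sep 16 14:53:00 kwin_wayland[534349]: some unrelated message',
]

if __name__ == "__main__":
    for stamp in egl_error_times(SAMPLE):
        print(stamp)
```

Noting the wall-clock time of a freeze and checking it against this list is one way to tell whether a given error coincided with a hang.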
I have a similar issue, but when the freeze happens I can't do anything except force shutdown (with the power button) and restart. It happens randomly but daily and only on wayland. I am also on an AMD setup, Arch Linux. I have also seen this message: kwin_wayland_drm: an error occurred while swapping buffers "EGL_BAD_SURFACE"
I do think this is a regression, because I used Plasma Wayland just fine in the past. But I don't know where in the stack this issue comes from, whether it's a KWin issue, a kernel issue or something else.
(In reply to Matej Mrenica from comment #5)
> I have a similar issue, but when the freeze happens I can't do anything
> except force shutdown (with the power button) and restart.
When it happens, does pressing Ctrl-Alt-F1~6 work? If you are able to SSH into the system when it freezes, does killing kwin_wayland bring you back to the login screen?
(In reply to hexchain from comment #7)
> (In reply to Matej Mrenica from comment #5)
> > I have a similar issue, but when the freeze happens I can't do anything
> > except force shutdown (with the power button) and restart.
>
> When it happens, does pressing Ctrl-Alt-F1~6 work? If you are able to SSH
> into the system when it freezes, does killing kwin_wayland bring you back to
> the login screen?
I will make sure to try Ctrl-Alt-F1~6 next time. Today I only tried Ctrl+Alt+Del, Ctrl+Alt+Backspace and Caps Lock. Since I assume the Caps Lock LED control is pretty low level, if that doesn't work there's no way any other keyboard input will.
I think I found a way to always reproduce this. Some steps may be optional. I am writing this after successfully reproducing this issue a couple of times in a row. Even though this issue usually happens randomly, it also always happens after doing the following:
1. Have 2 screens (laptop + external monitor is ok).
2. When I log in, both my screens are on regardless of my settings (this is a known issue in the Beta and has been fixed since then).
3. Switch to the external monitor (use Meta+P and select option 1).
4. Switch to the laptop screen (use Meta+P and select option 2).
5. Switch to the external monitor again (use Meta+P and select option 1).
6. Start "Dolphin" and "System settings" very quickly (almost at the same time). I have both pinned to my taskbar. I have no idea why Dolphin and System settings; any two apps might also work.
7. Your screen will freeze. If you have something playing in the background (a youtube video or something else), it will continue to play.
Optional: switch to another tty, log in, kill kwin_wayland, restart sddm and log back in (the first login will fail; you have to try twice).
It probably has nothing to do with "launching" apps but is more related to creating a window. This issue just happened when I was trying to restore the qBittorrent window from the tray area. The reason why it only freezes on certain types of windows still puzzles me. So far, in most cases, the cause is a Qt application, with the exception of Thunderbird.
Firstly, this issue is still present in Plasma 5.26.0 and since it wasn't here in Plasma 5.25.0 it should be considered a regression. Secondly, it could also be caused by Qt, frameworks or something else that was also updated in the same time period. Finally, when this issue occurs only the image is frozen but apps continue to operate normally, meaning that it is also possible to get out of this situation by blindly logging out (using keyboard for example).
Can you check dmesg for any messages of GPU resets once the screen froze once?
(In reply to Zamundaaa from comment #12)
> Can you check dmesg for any messages of GPU resets once the screen froze
> once?
There aren't any kernel messages when this happens. I've seen GPU reset hangs but this is different.
Created attachment 152777 [details]
dmesg
I am not sure what that would look like, so I am sending the entire dmesg. Looking through the log I didn't find anything interesting, but maybe you will.
Created attachment 152829 [details]
journal
This issue also seems to happen on only one screen at a time. I had an external monitor plugged in, and launching systemsettings caused the internal screen to freeze. I could still move my mouse cursor to the monitor and do something, but eventually trying to launch systemsettings again froze the monitor as well. I've attached the system and user journal during this entire process, annotated with my actions.
https://invent.kde.org/plasma/kwin/-/merge_requests/3063 should attempt to mitigate the problem. For the underlying issue, can you test whether disabling mesa_glthread like described in https://bugs.kde.org/show_bug.cgi?id=459558#c14 makes a difference?
(In reply to Zamundaaa from comment #16)
> https://invent.kde.org/plasma/kwin/-/merge_requests/3063 should attempt to
> mitigate the problem.
>
> For the underlying issue, can you test whether disabling mesa_glthread like
> described in https://bugs.kde.org/show_bug.cgi?id=459558#c14 makes a
> difference?
I will try it.
So, I've tried it and already it's a lot worse.
Worse in what way?
(In reply to Zamundaaa from comment #19)
> Worse in what way?
Without it, I was able to run Plasma for several hours before this issue showed up. After the suggested change, I ran into this issue within seconds of logging in, three logins in a row, before I changed it back. I can also try the KWin MR if it works in its current state.
*** Bug 459814 has been marked as a duplicate of this bug. ***
(In reply to Zamundaaa from comment #16)
> https://invent.kde.org/plasma/kwin/-/merge_requests/3063 should attempt to
> mitigate the problem.
>
> For the underlying issue, can you test whether disabling mesa_glthread like
> described in https://bugs.kde.org/show_bug.cgi?id=459558#c14 makes a
> difference?
Is the MR meant to be applied as a patch, or does the whole branch have to be used? Also, mesa_glthread is only enabled by default in 22.3, which is not yet released. I didn't tweak that parameter either.
I built the patch on top of the KWin 5.26 branch and haven't been able to reproduce the issue since.
I've applied the MR as a patch on top of 5.26.1. Now, whenever there is an EGL_BAD_SURFACE, KWin seems to reset itself, resulting in a minor noticeable stutter, but it no longer hangs like before.
What effect do you use to animate windows when opening them or closing them?
(In reply to Vlad Zahorodnii from comment #25)
> What effect do you use to animate windows when opening them or closing them?
The default one, "scale".
Does the screen freeze if you launch dolphin by pressing its shortcut (Meta+E) or clicking its icon in the task manager?
(In reply to Vlad Zahorodnii from comment #27)
> Does the screen freeze if you launch dolphin by pressing its shortcut
> (Meta+E) or clicking its icon in the task manager?
For me, it was when I started Dolphin and System settings quickly together almost at the same time by clicking on the task bar. Currently I use the patch from above, so I don't have the issue.
(In reply to Vlad Zahorodnii from comment #25)
> What effect do you use to animate windows when opening them or closing them?
It's "Scale".
(In reply to Vlad Zahorodnii from comment #27)
> Does the screen freeze if you launch dolphin by pressing its shortcut
> (Meta+E) or clicking its icon in the task manager?
For now, I can't seem to trigger an EGL_BAD_SURFACE with Dolphin, but almost every time with Gwenview (either by double click opening a picture in Dolphin, launching through KRunner, or running from the terminal). Changing the "window open/close animation" doesn't affect this.
What decoration theme do you use?
(In reply to Vlad Zahorodnii from comment #30)
> What decoration theme do you use?
I am using Klassy (https://github.com/paulmcauley/klassy).
(In reply to hexchain from comment #31)
> (In reply to Vlad Zahorodnii from comment #30)
> > What decoration theme do you use?
>
> I am using Klassy (https://github.com/paulmcauley/klassy).
Me too.
(In reply to Vlad Zahorodnii from comment #30)
> What decoration theme do you use?
I'm also encountering this bug and I'm also using Klassy as a window decoration theme, so this might be the reason why! I have moved back to the Breeze window decoration for now. I will report back whether I get freezes again.
See https://github.com/paulmcauley/klassy/issues/53#issuecomment-1279769977
*** Bug 460486 has been marked as a duplicate of this bug. ***
*** Bug 460938 has been marked as a duplicate of this bug. ***
(In reply to sdelang from comment #33)
> I have moved back to the Breeze window decoration for now. I will report
> back whether I get freezes again.
Has it happened so far?
(In reply to Zamundaaa from comment #37)
> (In reply to sdelang from comment #33)
> > I have moved back to the Breeze window decoration for now. I will report
> > back whether I get freezes again.
> Has it happened so far?
I forgot to report back :) Nope, perfectly hang free. I used to get several freezes a day with the Klassy window decoration.
(In reply to Zamundaaa from comment #37)
> (In reply to sdelang from comment #33)
> > I have moved back to the Breeze window decoration for now. I will report
> > back whether I get freezes again.
> Has it happened so far?
I'm not the person you asked, but I gave wayland + klassy another shot after the 5.26.2 upgrade and so far I have not had the issue.
I've just tried enabling Klassy on 5.26.2 (but I also switched Mesa versions in the meantime; still on an RX580 on Wayland). I am still getting hangs when using Klassy. FWIW I am under the impression that it is mostly launching Wayland-native apps that is a problem and that I haven't had a crash caused by launching an X11 app.
What happened was slightly weird:
1. My system was stable using Breeze as a window decoration today.
2. I switched to the Klassy decoration.
3. I could launch and close KeePassXC (Wayland-native) several times with no hang. Every time I launched it, one EGL_BAD_SURFACE error was logged.
4. I switched back to Breeze again.
5. Closing and reopening KeePassXC several times did not hang and did not cause any error in logs.
6. I switched back to Klassy again.
7. Closing and reopening KeePassXC *once* hung the main monitor I opened it on and logged an EGL_BAD_SURFACE error. Note that at that point the KeePassXC window is alive and well once I move it from the hung monitor.
8. Trying to get KWin to revive (changing the resolution on the main monitor in an attempt to recreate _something_) just froze the other monitor as well. Although it may be that the failed state broke KWin in some way when doing this, not that the bug reproduced independently on the 2nd monitor.
In a second rerun, starting directly with Klassy, a hang occurred very early as I started Firefox and Konversation...
And in a third run:
1. I started using Klassy,
2. I switched to Breeze,
3. I opened and closed KeePassXC several times, but got no hang or EGL_BAD_SURFACE error?!
I'm not quite sure what's going on, but there seems to be some factor I cannot single out for reliable reproduction on my system, if it isn't just semi-random.
From what I *think* I observed:
- hangs seem to have *always* been happening at the same time as an EGL_BAD_SURFACE error...
- ... but *some* EGL_BAD_SURFACE errors happen without a hang.
- these EGL_BAD_SURFACE errors seem to always(?) be triggered when opening affected apps (i.e. apps that can sometimes cause a hang on launch).
Dear Bug Submitter, This bug has been in NEEDSINFO status with no change for at least 15 days. Please provide the requested information as soon as possible and set the bug status as REPORTED. Due to regular bug tracker maintenance, if the bug is still in NEEDSINFO status with no change in 30 days the bug will be closed as RESOLVED > WORKSFORME due to lack of needed information. For more information about our bug triaging procedures please read the wiki located here: https://community.kde.org/Guidelines_and_HOWTOs/Bug_triaging If you have already provided the requested information, please mark the bug as REPORTED so that the KDE team knows that the bug is ready to be confirmed. Thank you for helping us make KDE software even better for everyone!
I don't know what the status is here, but I can confirm it is happening for me as well. I don't know how to get a backtrace, beyond saying that I get the same EGL_BAD_SURFACE error from kwin_wayland_drm, but I'm willing to help trace it if someone gives me steps to follow.
(In reply to 31113 from comment #39)
> (In reply to Zamundaaa from comment #37)
> > (In reply to sdelang from comment #33)
> > > I have moved back to the Breeze window decoration for now. I will report
> > > back whether I get freezes again.
> > Has it happened so far?
>
> I'm not the person you asked, but I gave wayland + klassy another shot after
> the 5.26.2 upgrade and so far I have not had the issue.
I was mistaken. The issue is still present in:
Operating System: Arch Linux
KDE Plasma Version: 5.26.4
KDE Frameworks Version: 5.101.0
Qt Version: 5.15.7
Kernel Version: 6.0.12-arch1-1 (64-bit)
Graphics Platform: Wayland
I can't reproduce it with 5.27 and klassy at commit 15bcf1f273c7070bed2acd49fae8cca26ad1d1bc.
If you're able to reproduce it, can you share your klassy configuration (~/.config/klassyrc) and klassy commit hash?
I will install the latest version of klassy and re-test this.
I am using the latest version in AUR, (is it up to date?) and I just had this issue.
(In reply to Matej Mrenica from comment #46)
> I am using the latest version in AUR, (is it up to date?) and I just had
> this issue.
Regardless, I am now building the latest version from source and will report if the issue happens again.
(In reply to Matej Mrenica from comment #46)
> I am using the latest version in AUR, (is it up to date?) and I just had
> this issue.
The proper way to get plasma beta releases on arch linux is to enable the kde-unstable repo (https://wiki.archlinux.org/title/Official_repositories#kde-unstable).
(In reply to 31113 from comment #48)
> (In reply to Matej Mrenica from comment #46)
> > I am using the latest version in AUR, (is it up to date?) and I just had
> > this issue.
>
> The proper way to get plasma beta releases on arch linux is to enable the
> kde-unstable repo
> (https://wiki.archlinux.org/title/Official_repositories#kde-unstable).
I meant the latest version of Klassy. Anyway, I just had a couple freezes with the latest version from git so until there is an update I am not using Klassy on Wayland.
Can confirm it still happens in KWin 5.26.0. Same EGL_BAD_SURFACE error, but this time C-A-F1~6 also stops working for about a minute, with the following warning in the journal appearing twice:
kwin_wayland_drm: No drm events for gpu "/dev/dri/card0" within last 30 seconds
Can you share your klassy config?
(In reply to Vlad Zahorodnii from comment #51)
> Can you share your klassy config?
[Common]
ActiveTitlebarOpacity=90
InactiveTitlebarOpacity=90

[Default Windeco Exception 0]
BorderSize=1
Enabled=true
ExceptionProgramNamePattern=kdenlive
ExceptionWindowPropertyPattern=.*Kdenlive
ExceptionWindowPropertyType=1
HideTitleBar=false
Mask=0
OpaqueTitleBar=true
PreventApplyOpacityToHeader=true
(In reply to Matej Mrenica from comment #52)
> (In reply to Vlad Zahorodnii from comment #51)
> > Can you share your klassy config?
>
> [Common]
> ActiveTitlebarOpacity=90
> InactiveTitlebarOpacity=90
>
> [Default Windeco Exception 0]
> BorderSize=1
> Enabled=true
> ExceptionProgramNamePattern=kdenlive
> ExceptionWindowPropertyPattern=.*Kdenlive
> ExceptionWindowPropertyType=1
> HideTitleBar=false
> Mask=0
> OpaqueTitleBar=true
> PreventApplyOpacityToHeader=true
I applied your config but I'm still not able to reproduce the issue.
My best theory is that klassy tries (or did try in the past) to change the shadow when kwin paints the decoration. With the current implementation, changing the shadow when kwin composes the output will produce EGL_BAD_SURFACE. On the other hand, the decoration should not mess with the shadow in Decoration::paint().
Not sure what we can do without being able to reproduce the bug, we don't have enough data to act on it.
(In reply to Vlad Zahorodnii from comment #53)
> (In reply to Matej Mrenica from comment #52)
> > (In reply to Vlad Zahorodnii from comment #51)
> > > Can you share your klassy config?
> >
> > [Common]
> > ActiveTitlebarOpacity=90
> > InactiveTitlebarOpacity=90
> >
> > [Default Windeco Exception 0]
> > BorderSize=1
> > Enabled=true
> > ExceptionProgramNamePattern=kdenlive
> > ExceptionWindowPropertyPattern=.*Kdenlive
> > ExceptionWindowPropertyType=1
> > HideTitleBar=false
> > Mask=0
> > OpaqueTitleBar=true
> > PreventApplyOpacityToHeader=true
>
> I applied your config but I'm still not able to reproduce the issue.
>
> My best theory is that klassy tries (or did try in the past) to change the
> shadow when kwin paints the decoration. With the current implementation,
> changing the shadow when kwin composes the output will produce
> EGL_BAD_SURFACE. On the other hand, the decoration should not mess with the
> shadow in Decoration::paint().
>
> Not sure what we can do without being able to reproduce the bug, we don't
> have enough data to act on it.
I have a similar configuration and it doesn't happen all the time for me either. It may or may not happen, but more often than not it does, resulting in the EGL_BAD_SURFACE error and a KWin crash. I am using the 5.27 Beta from openSUSE's KDE repos.
Here's what you can do to possibly increase the chances of it happening:
- Apply the config;
- Open the program in question;
- If it doesn't happen, restart KWin or restart the session.
The way I understand it, if the config is applied and the program is opened immediately, it may not crash with that error. But if the session is fresh or the program hasn't been opened for a short while, it may happen. I still don't know what causes it, only that setting the titlebar as opaque for that specific application when the general config is transparent will increase the chances. The only other case where it happens is with Inkscape. I didn't apply that opaque titlebar configuration and it still happens, as it does for the others.
I can confirm this on 5.27.0.
Just built plasma from source today and cannot reproduce the crash in the linked klassy issue; could someone else try the same?
This is the configuration I'm using. konsole, firefox and conky are excluded; any other program has a high chance of crashing kwin 5.27.1 (practically always on the first try), but not when built from master. If it is not completely fixed, there at least seems to be some mitigation.
```ini
[Common]
ActiveTitlebarOpacity=80
AlwaysShowIconHighlightUsing=AlwaysShowIconHighlightUsingBackground
ApplyOpacityToHeader=false
BackgroundColors=ColorsAccentWithTrafficLights
ButtonShape=ShapeSmallCircle
InactiveTitlebarOpacity=80
ShadowSize=ShadowSmall
TranslucentButtonBackgrounds=false

[Default Windeco Exception 0]
BorderSize=1
Enabled=true
ExceptionProgramNamePattern=kdenlive
ExceptionWindowPropertyPattern=.*Kdenlive
ExceptionWindowPropertyType=1
HideTitleBar=false
Mask=0
OpaqueTitleBar=true
PreventApplyOpacityToHeader=true

[Windeco]
BackgroundOpacity=85
BoldButtonIcons=BoldIconsBold
ButtonSpacingLeft=2
ButtonSpacingRight=2
ButtonStyle=sbeSierra
CornerRadius=5
DrawTitleBarSeparator=false
LockButtonSpacingLeftRight=true
LockTitleBarLeftRightMargins=false
OpaqueMaximizedTitlebars=false
OpaqueTitleBar=false
ThinWindowOutlineCustomColor=#3d4556
ThinWindowOutlineStyle=WindowOutlineCustomColor
TitleAlignment=AlignCenter
TitlebarBottomMargin=1.75
TitlebarLeftMargin=1
TitlebarRightMargin=2
TitlebarTopMargin=1.75

[Windeco Exception 0]
BorderSize=1
Enabled=true
ExceptionProgramNamePattern=
ExceptionWindowPropertyPattern=conky \\(archlinux\\)
ExceptionWindowPropertyType=1
HideTitleBar=true
Mask=0
OpaqueTitleBar=false
PreventApplyOpacityToHeader=false

[Windeco Exception 1]
BorderSize=1
Enabled=true
ExceptionProgramNamePattern=
ExceptionWindowPropertyPattern=^(?!.*(Konsole|conky \\(archlinux\\)|Firefox))
ExceptionWindowPropertyType=1
HideTitleBar=false
Mask=0
OpaqueTitleBar=true
PreventApplyOpacityToHeader=false
```
Can confirm on plasma 5.27.4
I did a fresh clean install of my OS to make sure there aren't any package conflicts, and then compiled Klassy. Here are my results: I'm on an Intel machine, so no AMD anywhere. Plasma Wayland, with Klassy as Widget Style and Window Decoration. Titlebar opacity is at 85%, with the option for toolbars to have the same opacity. This works for apps like Dolphin, Kate, Gwenview and others which use this type of widgets. Others, like System Monitor, Discover and other apps that use Kirigami, don't use Widget Styles like these, so the toolbar is not transparent; in those cases I add a special Window-Specific Override to make the titlebar for those apps completely opaque. At first, it works as intended. But when I close the app and open it again, it causes KWin to freeze with the EGL_BAD_SURFACE error. There's no way around it other than removing the override entirely. I don't have any special graphics card on this laptop, only the integrated UHD 620 from Intel. I didn't install any special drivers either. It's only what came with the kernel and the modules installed by openSUSE Tumbleweed by default.
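Several comments above suggest the freeze is more likely when a window-specific override forces an opaque titlebar while the general titlebar setting is translucent. To see which overrides in a klassyrc fit that pattern, something like the Python sketch below could be used. It relies only on key names visible in the configs posted in this thread; the helper name `opaque_overrides`, the abbreviated `KLASSYRC` sample and the fallback of `false` for a missing key are assumptions made for illustration, not Klassy's documented behaviour.

```python
import configparser

# Abbreviated from the klassyrc posted earlier in this thread.
KLASSYRC = r"""
[Windeco]
OpaqueTitleBar=false

[Default Windeco Exception 0]
Enabled=true
ExceptionWindowPropertyPattern=.*Kdenlive
OpaqueTitleBar=true

[Windeco Exception 0]
Enabled=true
ExceptionWindowPropertyPattern=conky \\(archlinux\\)
OpaqueTitleBar=false

[Windeco Exception 1]
Enabled=true
ExceptionWindowPropertyPattern=^(?!.*(Konsole|conky \\(archlinux\\)|Firefox))
OpaqueTitleBar=true
"""


def opaque_overrides(text):
    """List enabled exceptions that force an opaque titlebar while the
    general [Windeco] setting is not opaque."""
    parser = configparser.ConfigParser(interpolation=None)
    parser.optionxform = str  # keep key names case-sensitive
    parser.read_string(text)
    # Assumption: a missing key is treated as "false" here.
    general_opaque = parser.getboolean("Windeco", "OpaqueTitleBar", fallback=False)
    hits = []
    for name in parser.sections():
        if "Exception" not in name:
            continue
        section = parser[name]
        if not section.getboolean("Enabled", fallback=False):
            continue
        if section.getboolean("OpaqueTitleBar", fallback=False) and not general_opaque:
            hits.append((name, section.get("ExceptionWindowPropertyPattern", "")))
    return hits


if __name__ == "__main__":
    for name, pattern in opaque_overrides(KLASSYRC):
        print(f"{name}: {pattern}")
```

Pointing the same function at the contents of ~/.config/klassyrc would list the overrides worth disabling first when trying to narrow the problem down.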
(In reply to Vlad Zahorodnii from comment #53)
> My best theory is that klassy tries (or did try in the past) to change the
> shadow when kwin paints the decoration. With the current implementation,
> changing the shadow when kwin composes the output will produce
> EGL_BAD_SURFACE. On the other hand, the decoration should not mess with the
> shadow in Decoration::paint().
>
> Not sure what we can do without being able to reproduce the bug, we don't
> have enough data to act on it.
Hi Vlad,
Klassy developer here. After much investigation, I can confirm that your theory is correct.
Klassy has the option of coloured window outlines that are drawn as part of the shadow; these colours depend on the system colour scheme. Therefore, Klassy triggers an update of the shadow should the system colour scheme change, by connecting to the KDecoration2::DecoratedClient::paletteChanged signal. This is how the shadow can be changed while the paint() function is executing (the shadow is not "messed with in Decoration::paint()" directly).
The EGL_BAD_SURFACE segfault only occurs with Klassy 4.0 from Plasma 5.26 beta onwards (it did not occur in Plasma 5.25), and especially seems to be triggered when launching applications which have their own custom colour scheme options.
As a temporary workaround, today I released Klassy 4.1, which seems to work around this in the majority of cases, but I'm not sure how robust a fix it is (https://github.com/paulmcauley/klassy/commit/972edd08184b3a416b166053a1b1a3d042d33d92). I set an m_painting flag at the start of the decoration's paint() function and clear it at the end of paint() to try and detect when paint() is running (is there a better way to detect from the decoration when kwin is composing?). I then abort any attempt to change the shadow should the m_painting flag be set. I had also experimented with implementing delays instead of aborting, but this did not work as I don't think my detection of the kwin composing is accurate enough.
Would it be possible to implement something more elegant/robust within kwin so that calling setShadow() and paint() at the same time will not cause an EGL_BAD_SURFACE, e.g. by delaying shadow rendering until after paint()? Or even so that the segfault does not occur, as in Plasma 5.25?
Paul
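For readers following along, the flag-based workaround described above boils down to a re-entrancy guard. The sketch below illustrates that idea in Python rather than C++ and is not the actual Klassy 4.1 code: the `Decoration` class, the `on_paint` hook and `palette_changed` are toy stand-ins invented for this example and do not use the KDecoration2 API.

```python
class Decoration:
    """Toy stand-in for a window decoration; not the KDecoration2 API."""

    def __init__(self):
        self.shadow = "default"
        self._painting = False    # plays the role of the m_painting flag
        self.skipped_updates = 0  # counts shadow changes that were aborted
        self.on_paint = None      # hypothetical hook: simulates a signal firing mid-paint

    def paint(self):
        self._painting = True
        try:
            # While painting, something (e.g. a palette change) may call back in.
            if self.on_paint is not None:
                self.on_paint()
        finally:
            self._painting = False  # cleared even if painting raises

    def set_shadow(self, shadow):
        # Abort any shadow change requested while paint() is running.
        if self._painting:
            self.skipped_updates += 1
            return False
        self.shadow = shadow
        return True


def palette_changed(decoration, colour):
    """Simulated slot: rebuild the shadow for a new outline colour."""
    return decoration.set_shadow(f"outline-{colour}")


if __name__ == "__main__":
    deco = Decoration()
    deco.on_paint = lambda: palette_changed(deco, "red")
    deco.paint()                   # the mid-paint update is dropped
    print(deco.shadow, deco.skipped_updates)
    palette_changed(deco, "blue")  # outside paint() the update goes through
    print(deco.shadow)
```

The trade-off is that an aborted update is simply lost unless something requests it again later, which is part of why a fix inside KWin would be more robust than a guard in the decoration.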
KWin doesn't use EGL surfaces anymore, so this error can no longer happen.