Bug 450301 - On KWin 5.24 X11 Nvidia, when reenabling compositing, windows begin flickering and flipping upside-down
Status: REPORTED
Alias: None
Product: kwin
Classification: Plasma
Component: compositing
Version: 5.24.0
Platform: Other Linux
Importance: NOR normal
Target Milestone: ---
Assignee: KWin default assignee
URL:
Keywords: regression
Duplicates: 469809 476473
Depends on:
Blocks:
 
Reported: 2022-02-15 13:14 UTC by nyanpasu64
Modified: 2023-11-06 20:23 UTC
CC List: 13 users

See Also:
Latest Commit:
Version Fixed In:


Attachments
Someone else's journalctl --user log at the time of the display corruption (14.86 KB, text/plain)
2022-02-15 13:14 UTC, nyanpasu64
Video demonstrating the flickering (1.76 MB, video/mp4)
2022-02-15 13:57 UTC, Awakening

Description nyanpasu64 2022-02-15 13:14:57 UTC
Created attachment 146770 [details]
Someone else's journalctl --user log at the time of the display corruption

SUMMARY
After upgrading to Plasma 5.24, I occasionally (when exiting mpv fullscreen?) get corrupted kwin_x11 compositing, with flickering and notably upside-down windows, which persist until restarting kwin_x11.

NVidia corruption is a longstanding issue, but two other people and I have noticed much more common and qualitatively different symptoms (vertically flipped windows), which started on KWin 5.24 (and perhaps 5.23.90? IDK.)

STEPS TO REPRODUCE
1. Start a Plasma X11 session on proprietary NVidia drivers, with compositing enabled. (I'm using 470.103.01 drivers on an older GPU; other people have encountered this bug on NVidia driver 510.47.03, on both systemd and classic KDE initialization.)
2. Unsure. I think the bug started when the computer was unattended (but I may be wrong). Two people said the bug started when exiting fullscreen mpv playback (I *think* fullscreen mpv turns off compositing; one of the two people used `x11-bypass-compositor=fs-only`).

I was unable to replicate this issue by toggling compositing rapidly, or by switching users (to SDDM) and unlocking my account again. When exiting mpv fullscreen, I got a temporarily hung display for 0.25 to 1 second (except for moving cursor) with transparent/corrupted title bars, but did not get persistent graphical corruption.

OBSERVED RESULT
Windows begin to flicker, and sometimes render upside down. Disabling compositing fixes rendering, but reenabling compositing causes the issue to return. Restarting kwin_x11 fixes the issue and compositing works again.

Checking my journalctl, I don't see any kwin_x11 warnings related to OpenGL. In one person's screenshot I saw a "kwin_core: XCB error: 10 (BadAccess)" around the time of the corruption, followed by many QXcbConnection errors clustered at the same time.
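
For anyone who wants to check their own journal for the two message types mentioned above, here is a minimal sketch. The function name `filter_kwin_xcb_errors` is hypothetical, and the patterns are taken only from the messages quoted in this report, so other relevant messages may be missed.

```shell
#!/bin/sh
# Sketch: keep only journal lines that match the two message types quoted in
# the report. The function name is hypothetical; it reads log text on stdin.
filter_kwin_xcb_errors() {
    grep -E 'kwin_core: XCB error|QXcbConnection'
}

# Example use on a live session (not run here):
#   journalctl --user -b --no-pager | filter_kwin_xcb_errors
```

Note that `grep` exits with a non-zero status when nothing matches, which is consistent with the report that some affected systems log no such errors at all.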

EXPECTED RESULT
kwin_x11 doesn't render improperly.

SOFTWARE/OS VERSIONS
Operating System: Arch Linux
KDE Plasma Version: 5.24.0
KDE Frameworks Version: 5.91.0
Qt Version: 5.15.2
Kernel Version: 5.16.9-zen1-1-zen (64-bit)
Graphics Platform: X11
Processors: 12 × AMD Ryzen 5 5600X 6-Core Processor
Memory: 15.6 GiB of RAM
Graphics Processor: NVIDIA GeForce GT 730/PCIe/SSE2 (drivers 470.103.01)

ADDITIONAL INFORMATION
Probably not related to Bug 450052 (a Wayland bug), but it's worth a look.
Comment 1 Awakening 2022-02-15 13:53:02 UTC
I'm one of those two people. I've run into this twice so far and am not sure what exactly triggered it either. The flickering presumably started after toggling compositing back on (by leaving fullscreen in mpv on at least one occasion; I don't remember the other). Session uptime was in >12h territory.

I can briefly trigger the upside-down window in bottom-left corner on black background by having mpv open, alt-tabbing (so the left-side panel with thumbnails shows up), and cycling compositing. 
Doesn't send it into uncontrollable flickering though.
Which reminds me of bug 443341, might be relevant.

SOFTWARE/OS VERSIONS
Operating System: Arch Linux
KDE Plasma Version: 5.24.0
KDE Frameworks Version: 5.90.0
Qt Version: 5.15.2
Kernel Version: 5.16.7-arch1-1 (64-bit)
Graphics Platform: X11
Processors: 12 × AMD Ryzen 5 3600 6-Core Processor
Memory: 31.3 GiB of RAM
Graphics Processor: NVIDIA GeForce GTX 1060 3GB/PCIe/SSE2
GPU driver version: 510.47.03
Comment 2 Awakening 2022-02-15 13:57:48 UTC
Created attachment 146771 [details]
Video demonstrating the flickering
Comment 3 nyanpasu64 2022-02-17 12:17:25 UTC
Just hit this bug again. When waking my computer, I got an upside-down foobar2000 on a black screen instead of the lock screen. When my mouse cursor moved in and out of the password text field, it changed to a text cursor, and *sometimes* (not always) the lock screen appeared for around 1-2 frames in my smartphone video recording (uploaded to https://youtu.be/90tbG_c3-cQ). The time between two adjacent appearances of the lock screen was around 24, 22, or 17 frames (not exactly matching the caret flash rate). IIRC typing the password didn't make the lock screen appear.

I checked my journal afterwards, and at the time I woke the machine, I saw "kwin_x11[293413]: OpenGL vendor string:" and "BlurConfig::instance called after the first use - ignoring", etc. I restarted kwin, then tried sleeping and waking the machine again (without the bug occurring) and saw the same messages reappear in my journal, meaning they appear both with and without the bug occurring. I suspect KWin randomly corrupts when trying to restart compositing.

Running kwin_x11 under Valgrind and sleep-waking my system, I see a bunch of uninitialized memory reads/syscalls, such as "Invalid read of size 16". All have a stack trace for where they're allocated, but most lack a stack trace for where they're read (just two hex addresses with no symbol names, and no path to main()). There's no smoking gun, no obvious memory misuse, just a pile of errors from random libraries (including /usr/lib/libGLX_nvidia.so.470.103.01) which I don't understand. In any case I've posted the logs at https://gist.github.com/nyanpasu64/db0518c4f08569a39acb810d936fcd01.
Comment 4 nyanpasu64 2022-02-27 21:12:58 UTC
I tried to take a trace-cmd sample using https://github.com/mikesart/gpuvis/tree/master/sample and open it in gpuvis, but I only got CPU information. I can share my traces if anyone is interested.

Interestingly, killing plasmashell (not kwin_x11) stopped the flickering.
Comment 5 nyanpasu64 2022-03-16 13:11:28 UTC
Just woke from sleep again, and plasmashell hung shortly afterwards, with a shadow of a notification visible (not the notification itself). I could move the mouse and type into Discord, but not trigger Konsole or krunner to launch a task manager and kill plasmashell.

I switched to a TTY, where htop indicated that several Electron apps and plasmashell were burning a CPU core each, and specifically a non-main thread in plasmashell was burning a CPU core. gdb showed that thread would not return from syncSceneGraph.

Not sure if plasmashell and Electron burning CPU were caused by resuming from sleep, or switching to TTY. Nonetheless, switching to TTY during normal operation does not cause either Electron or plasmashell to burn a full CPU core.

I didn't look into what Electron was doing wrong. I should've looked. However, there was no good way to check whether Electron started burning CPU before or after switching to TTY, since I couldn't launch htop or SSH into my system before switching to TTY.
Comment 6 Just Anig 2022-07-23 16:40:36 UTC
This happens every other time I switch virtual terminals, and it's extremely aggravating since I have no idea how to make it go away.

PS: Have you tried kwinft?
Comment 7 Kott 2022-09-19 08:58:40 UTC
Got the same with my config. Mostly when mpv is playing.

Kernel: 5.19.8-1-default arch: x86_64 bits: 64
Desktop: KDE Plasma v: 5.25.5 Distro: openSUSE Tumbleweed 20220915

Device-1: NVIDIA TU117 [GeForce GTX 1650] driver: nvidia v: 515.65.01
Display: x11 server: X.Org v: 21.1.4 with: Xwayland v: 22.1.3 driver: X:
loaded: nvidia unloaded: fbdev,modesetting,nouveau,vesa failed: nv
gpu: nvidia,nvidia-nvswitch resolution: 2560x1080~60Hz
OpenGL: renderer: NVIDIA GeForce GTX 1650/PCIe/SSE2 v: 4.6.0 NVIDIA 515.65.01
Comment 8 Vladislav Rubtsov 2022-10-02 18:03:09 UTC
I have exactly the same issue as seen in the video attached by Awakening. This is certainly a compositor problem, as disabling the compositor (Shift+Alt+F12) fixes it.

This is extremely annoying as I haven't found a reliable way to fix this (switching compositor on/off doesn't always help). It happens after wake up or exiting a game or other fullscreen application that temporarily disables compositing.

Operating System: Manjaro Linux
KDE Plasma Version: 5.25.5
KDE Frameworks Version: 5.97.0
Qt Version: 5.15.5
Kernel Version: 5.18.19-3-MANJARO (64-bit)
Graphics Platform: X11
Processors: 12 × Intel® Core™ i7-10750H CPU @ 2.60GHz
Memory: 31.3 GiB of RAM
Graphics Processor: NVIDIA GeForce RTX 2070 Super/PCIe/SSE2 (driver 515.65.01)
Manufacturer: Motherboard by ZOTAC
Product Name: ZBOX-QCM7T3000/EN072080S/EN072070S/EN052060C
System Version: Rev.00
Comment 9 Kott 2022-12-26 10:32:45 UTC
I think it's worth mentioning that my Qt engine is Kvantum. I switched back to Breeze and have seen no upside-down flipping for a while.
Comment 10 Daniel Neugebauer 2023-01-03 15:36:52 UTC
When I run X-Plane I use a wrapper script to disable and reenable compositing via dbus for better rendering performance. Since installing X-Plane 12 in September 2022 I noticed I also have a high chance to experience this issue when the application exits and I repeat that several times via my wrapper script (about once every 5 to 25 times, very sporadic). The issue is still present with kwin and Plasma workspace 5.25.5.

The screen turns mostly black. The Konsole terminal window I use to run the script gets duplicated and flipped vertically. Only the non-flipped copy has window translucency applied (I can see the desktop background image). Contents may or may not be completely visible. The image on screen is stable as long as I don't move the mouse or interact with any windows. When moving the mouse across the screen, parts (widgets?) of the windows currently hovered by the mouse cursor appear briefly but flicker/disappear as the mouse gets moved further. The parent widget hierarchy seems to appear at random (partial window repaints?). Other desktops (I use a total of 8) did not seem to be affected in the corruption I encountered today (I don't remember if I checked that before).

Memtest86 has been run several times since I noticed these issues and MemtestG80 (for VRAM) has also been executed two times so far without any errors being detected. MCE log entries do not correlate to screen corruption.

I can confirm that disabling the compositor helps. Reenabling does not resolve the issue; a restart is required to fix it.

The commands used to toggle the compositor from my script are:

qdbus org.kde.KWin /Compositor suspend
qdbus org.kde.KWin /Compositor resume
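
A minimal sketch of what such a wrapper could look like, built only from the two qdbus calls above. This is not the commenter's actual script: the function names and the `QDBUS` override variable are hypothetical and exist only to keep the sketch self-contained.

```shell
#!/bin/sh
# Hypothetical wrapper: suspend compositing, run a command, then resume.
# QDBUS is an assumed override hook; it defaults to the qdbus binary.
QDBUS="${QDBUS:-qdbus}"

suspend_compositor() {
    "$QDBUS" org.kde.KWin /Compositor suspend
}

resume_compositor() {
    "$QDBUS" org.kde.KWin /Compositor resume
}

# Run "$@" with compositing suspended and return the command's exit status.
run_without_compositing() {
    suspend_compositor
    "$@"
    status=$?
    # Resume even if the wrapped command failed.
    resume_compositor
    return "$status"
}
```

It would be called as `run_without_compositing some-command args...`, where `some-command` is a placeholder for the application being launched. The resume step on exit is the point at which the corruption described in this comment shows up.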

There are no entries in system logs (metalog, I don't use systemd) related to kwin or otherwise correlating to the corruption.

Looking back at my package installation history since October (not sure if I already encountered it in September as well), the problem has been observed using:
- Gentoo Linux
- proprietary Nvidia drivers 515.65.01, 525.60.11 and 525.60.13
- X.org X11 server 21.1.4
- Mesa 22.1.7 and 22.2.3
- Qt 5.15.5 and 5.15.7
- kwin 5.25.5
- Plasma workspace 5.25.5

Hardware:
- NVIDIA GeForce GTX 1070 (ASUS GeForce GTX 1070 STRIX O8G Gaming, 8192 MB GDDR5)
- Intel i7-4790K @ 4.0GHz (4 cores x 2 HT)
- 32 GB RAM
Comment 11 ghoste 2023-03-12 22:47:47 UTC
I had this issue with Plasma 5.25 and maybe 5.24, but I think it stopped happening after upgrading to Plasma 5.26. I'm now running Plasma 5.27.2 and was unable to reproduce it after disabling and re-enabling the compositor 10 times in X11. Using NVIDIA driver 525.89.02.
Comment 12 Kai Krakow 2023-05-02 17:28:37 UTC
I can confirm this. A mostly reliable reproducer is starting a Steam Proton game whose launcher detects DirectX capabilities, e.g. the Elite Dangerous Launcher. Such launchers create a fullscreen window for the blink of an eye, which disables the compositor and automatically enables it again. (I'm forcing composition off through a kwin rule which looks for client windows named "(gamescope|steam_app_)"; otherwise games stutter unpredictably in my dual-monitor setup and both my monitors drift apart with vsync over time, and X11 always syncs all monitors in a single loop, thus syncing to the slowest monitor.)

This tight disable/enable loop seems to be enough to initially trigger the bug, but it usually only occurs when the game ends and closes its final window (though not exclusively; it may still trigger when the launcher starts). The game itself is unaffected. Using Shift+Alt+F12 to disable compositing cures the flipped and flickering desktop, but only until I re-enable compositing. Only restarting kwin_x11 fixes it. After it has happened once, it is very likely to be triggered more easily; a full reboot is needed to get rid of the behavior for a while.

Games that do not trigger tight enable/disable intervals for compositing seem to be less likely to trigger the bug.
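
A rough sketch of a loop that imitates such tight enable/disable intervals, using the qdbus calls quoted in comment 10. Whether this actually reproduces the bug is unknown (the original reporter could not trigger it by toggling compositing rapidly). The function name, its arguments, and the `QDBUS` override variable are hypothetical.

```shell
#!/bin/sh
# Hypothetical stress loop: suspend and resume compositing COUNT times,
# leaving the compositor off for OFF_TIME seconds on each iteration.
# QDBUS is an assumed override hook; it defaults to the qdbus binary.
QDBUS="${QDBUS:-qdbus}"

toggle_compositor() {
    count="$1"
    off_time="${2:-0}"
    i=0
    while [ "$i" -lt "$count" ]; do
        "$QDBUS" org.kde.KWin /Compositor suspend
        sleep "$off_time"
        "$QDBUS" org.kde.KWin /Compositor resume
        i=$((i + 1))
    done
}
```

For example, `toggle_compositor 25 1` would cycle compositing 25 times with a one-second gap. Varying the off time may be worth trying, since later comments suggest the trigger depends on more than the toggle itself.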

"Force full composition pipeline" is enabled to fix tearing in multi-monitor setups. It may be part of the problem; I never saw this on my single-monitor setup (but that was at least 2 years ago).
Comment 13 Nate Graham 2023-05-15 17:39:06 UTC
*** Bug 469809 has been marked as a duplicate of this bug. ***
Comment 14 Tyler Harrison 2023-05-15 18:33:38 UTC
(In reply to ghoste from comment #11)
> I had this issue with Plasma 5.25 and maybe 5.24, but I think it stopped
> happening after upgrading to Plasma 5.26. I'm now running Plasma 5.27.2 and
> was unable to reproduce it after disabling and re-enabling the compositor 10
> times in X11. Using NVIDIA driver 525.89.02.

Unfortunately I am using Plasma 5.27.2 and was able to reproduce this problem (https://bugs.kde.org/show_bug.cgi?id=469809).
Comment 15 ghoste 2023-05-15 19:09:38 UTC
(In reply to Tyler Harrison from comment #14)
> (In reply to ghoste from comment #11)
> > I had this issue with Plasma 5.25 and maybe 5.24, but I think it stopped
> > happening after upgrading to Plasma 5.26. I'm now running Plasma 5.27.2 and
> > was unable to reproduce it after disabling and re-enabling the compositor 10
> > times in X11. Using NVIDIA driver 525.89.02.
> 
> Unfortunately I am using Plasma 5.27.2 and was able to reproduce this
> problem (https://bugs.kde.org/show_bug.cgi?id=469809).

Guess I didn't test it enough. I tried a couple dozen times before posting that, by launching and closing a game with gamemoderun disabling/enabling the compositor, but I did it back to back without leaving the compositor off for more than a couple of minutes. I suppose there are more conditions that need to be met before this bug is triggered than purely disabling/enabling the compositor.
Comment 16 Graham Perrin 2023-10-28 16:16:51 UTC
Thanks, people. 

(In reply to nyanpasu64 from comment #0)

> Operating System: Arch Linux

Not limited to Linux. 

(In reply to nyanpasu64 from comment #3)

>  … instead of the lock screen. … *sometimes* (not always) the 
> lock screen appeared … IIRC typing the password 
> didn't make the lock screen appear. …

As far as I can tell: whilst kscreenlocker is otherwise effective – visible, for example, during split-seconds within <https://www.youtube.com/watch?v=90tbG_c3-cQ> – content that should be invisible is 'broken through' the lock by this bug.

If symptoms recur: 

1. enter your passphrase (as if the lock screen is not interrupted by the flickering)
2. enter your keyboard shortcut for KRunner
3. kwin_x11 --replace
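
For step 3, a small sketch of running the same command from a terminal inside the X11 session instead of KRunner. The function name and the `KWIN_BIN` override variable are hypothetical; only `kwin_x11 --replace` comes from the steps above.

```shell
#!/bin/sh
# Hypothetical helper: replace the running window manager with a fresh
# kwin_x11 instance. KWIN_BIN is an assumed override hook.
restart_kwin() {
    kwin_bin="${KWIN_BIN:-kwin_x11}"
    if ! command -v "$kwin_bin" >/dev/null 2>&1; then
        echo "restart_kwin: $kwin_bin not found" >&2
        return 127
    fi
    # Run in the background under nohup so the new instance keeps running
    # after this shell exits.
    nohup "$kwin_bin" --replace >/dev/null 2>&1 &
}
```

This only restarts the window manager; as earlier comments note, the corruption may return after the next compositing toggle.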

I'll alert a relevant address @kde.org, and someone at NVIDIA.
Comment 17 Zamundaaa 2023-11-02 17:13:21 UTC
*** Bug 476473 has been marked as a duplicate of this bug. ***