Bug 492319

Summary: On laptop with Radeon 780M GPU, 6.1.4 causes Triple Buffering to produce graphical corruption in XWayland apps and lag in all apps
Product: [Plasma] kwin Reporter: Nate Graham <nate>
Component: performance Assignee: KWin default assignee <kwin-bugs-null>
Status: REPORTED ---    
Severity: major CC: fabian+kdebugs, micraft.b, xaver.hugl
Priority: NOR Keywords: regression
Version: 6.1.4   
Target Milestone: ---   
Platform: Other   
OS: Linux   
See Also: https://bugs.kde.org/show_bug.cgi?id=493073
Latest Commit: Version Fixed In:
Sentry Crash Report:

Description Nate Graham 2024-08-28 14:04:56 UTC
SUMMARY
My wife has started to experience a series of annoying graphical glitches on her laptop (2023 HP Pavilion Plus 14, 7840U CPU, Radeon 780M GPU, 2.8k OLED screen @ 200% scale, Mesa 24.1.6, Kernel 6.10.6, amd-gpu-firmware-20240811) that were introduced with the Plasma 6.1.4 update and are resolved by disabling triple buffering.


SYMPTOMS
Symptom 1: Starting from the moment the session is logged in, there is transient graphical corruption manifesting in thin horizontal bands of static that briefly flash across the screen while XWayland apps are focused and visually updating. The issue is seen in:
- Krita (Native Fedora package)
- Steam (Native Fedora package)
- Ungoogled Chromium (Flatpak)

Symptom 2: All apps on the system begin to graphically lag, with the lag getting progressively worse the longer the system is used for. Specific examples:
- Typed keystrokes in any app (including native Wayland KDE apps) will not be shown on the screen for a few milliseconds.
- Drawn strokes in Krita will lag behind the pen.
- YouTube videos playing in either Ungoogled Chromium (XWayland, Flatpak) or Firefox (Native Fedora packaging, Wayland) will begin to skip frames and the audio will start to stutter. The effect is much worse in Ungoogled Chromium, though it manifests to an extent in Firefox as well.

Symptom 3: `radeontop` shows that VRAM is over-subscribed: from the moment the session is logged in, it reports over 100% VRAM usage, typically about 110%. As apps are opened and used, this rises to 120% or so.

While any of these issues are manifesting, the system feels relatively cool to the touch and the fans are not going crazy. CPU, main memory, and storage resources are not being stressed. Only the GPU appears to be stressed.


SOLUTION
We discovered that setting KWIN_DRM_DISABLE_TRIPLE_BUFFERING=1 fixes all of the above issues. Notably, for symptom 3, the VRAM usage on boot goes down to only about 90%, which still seems much too high and may be worth separately debugging. This might have been caused by the changes that fixed Bug 488843, which were released in Plasma 6.1.4.


OTHER SOLUTIONS ATTEMPTED
We previously tried rolling back the kernel, mesa, and amd-gpu-firmware to significantly older versions, but to no effect. Only disabling triple buffering helps.
Comment 1 Fabian Blaese 2024-09-05 10:17:16 UTC
I am observing similar problems with stutter.

To me, it looks like this problem is caused solely by the high VRAM consumption, and (possibly) the resulting usage of GTT memory: With multiple monitors connected to my 7840U laptop, I am observing the stutter regardless of KWin's triple buffering settings. Connecting multiple external monitors causes VRAM usage to pretty much always go beyond the 512 MiB of VRAM that the firmware of my laptop has allocated, even with only a few open applications.

I can work around this issue by enabling a gaming option in the UEFI of the laptop, so 4 GiB of VRAM are reserved. In this case, the stutter is not observable, even with triple buffering enabled.
Comment 2 Nate Graham 2024-09-05 11:02:19 UTC
If disabling triple buffering doesn't help for your issue, it sounds like a different one, though possibly related. Can you open a new bug report about it? Thanks!
Comment 3 Fabian Blaese 2024-09-05 11:12:36 UTC
I would have, but I do think that these issues are very closely related, if not the same.

I don't think that disabling triple buffering by itself fixed the stuttering for you. I have the feeling that the associated reduction in memory consumption fixed it. For me, disabling triple buffering is not sufficient to reduce VRAM consumption to less than 100%, at least with multiple external monitors connected.

If your device has an option to increase the dedicated VRAM allocation, it might be worth checking if that fixes the stuttering for you as well. If it does not, I will create a new bug report.
Comment 4 Nate Graham 2024-09-05 11:18:22 UTC
Disabling triple buffering did reduce VRAM usage, but only by a small amount. Most of the time it was still listed as over 100% on login. GTT was low regardless of whether it's on or off.
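
As a rough plausibility check on "only by a small amount", the following sketch estimates the size of one full-screen buffer. It rests on several assumptions that are not confirmed anywhere in this report: that the "2.8k" panel is 2880x1800, that buffers use a 4-bytes-per-pixel format, and that disabling triple buffering saves about one such buffer per output. The 512 MiB figure is the firmware allocation quoted in comment 1 for a different 7840U laptop, used here only for scale.

```python
def buffer_mib(width_px: int, height_px: int, bytes_per_pixel: int = 4) -> float:
    """Size of one uncompressed full-screen buffer in MiB."""
    return width_px * height_px * bytes_per_pixel / (1024 * 1024)


# Assumption: "2.8k" means 2880x1800; the report does not state the resolution.
WIDTH_PX, HEIGHT_PX = 2880, 1800
# Figure from comment 1 (a different laptop), used only for scale.
RESERVED_VRAM_MIB = 512

one_buffer = buffer_mib(WIDTH_PX, HEIGHT_PX)
share = 100.0 * one_buffer / RESERVED_VRAM_MIB
print(f"one buffer: {one_buffer:.1f} MiB, {share:.1f}% of {RESERVED_VRAM_MIB} MiB")
```

Under these assumptions a single buffer is a low single-digit percentage of the reserved memory, which would be consistent with the small drop seen here, but it says nothing about what KWin actually allocates.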

I talked to a KWin developer who said that the VRAM usage reported by `radeontop` is not really trustworthy since it has a weird way of measuring VRAM usage. Apparently it always shows up as close to 100% due to the way it counts shared memory allocations.
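
One possible way to cross-check the `radeontop` numbers is to read the memory counters that the amdgpu kernel driver exposes in sysfs. The sketch below is only an illustration, not something that was run for this report; it assumes the GPU shows up under /sys/class/drm/card<N>/device and that the `mem_info_vram_*` and `mem_info_gtt_*` files are present. The helper names are made up for this example.

```python
from pathlib import Path


def read_counter(device_dir: Path, name: str) -> int:
    """Read one amdgpu memory counter (a decimal byte count) from sysfs."""
    return int((device_dir / name).read_text().strip())


def memory_usage(device_dir: Path) -> dict:
    """Return used bytes, total bytes, and percent used for VRAM and GTT."""
    usage = {}
    for pool in ("vram", "gtt"):
        used = read_counter(device_dir, f"mem_info_{pool}_used")
        total = read_counter(device_dir, f"mem_info_{pool}_total")
        percent = 100.0 * used / total if total else 0.0
        usage[pool] = {"used": used, "total": total, "percent": percent}
    return usage


def main() -> None:
    # Assumption: amdgpu devices appear as /sys/class/drm/card<N>/device.
    for device_dir in sorted(Path("/sys/class/drm").glob("card[0-9]*/device")):
        if not (device_dir / "mem_info_vram_total").exists():
            continue  # not an amdgpu device, or the counters are not exposed
        for pool, stats in memory_usage(device_dir).items():
            used_mib = stats["used"] / 2**20
            total_mib = stats["total"] / 2**20
            print(f"{device_dir}: {pool.upper()} {used_mib:.0f}/{total_mib:.0f} MiB "
                  f"({stats['percent']:.1f}%)")


if __name__ == "__main__":
    main()
```

Running this once with triple buffering enabled and once with KWIN_DRM_DISABLE_TRIPLE_BUFFERING=1 would give two sets of numbers that do not depend on how `radeontop` counts shared allocations.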

I'll look in the device's UEFI to see if it has an option for this, but I'm not hopeful on that front. It's not a fancy gaming laptop, just an HP Pavilion Plus 14.
Comment 5 Nate Graham 2024-09-17 21:26:39 UTC
*** Bug 493073 has been marked as a duplicate of this bug. ***
Comment 6 Zamundaaa 2024-09-18 13:38:29 UTC
(In reply to Fabian Blaese from comment #1)
> I am observing similar problems with stutter.
> 
> To me, it looks like this problem is caused solely by the high VRAM
> consumption, and (possibly) the resulting usage of GTT memory: With multiple
> monitors connected to my 7840U Laptop, I am observing the stutter regardless
> of kwins triple buffering settings. Connecting multiple external monitors
> causes VRAM usage to pretty much always get beyond the 512 MiB of VRAM the
> firmware of my Laptop has allocated, even with only a few open applications.
> 
> I can work around this issue by enabling a gaming option in the UEFI of the
> laptop, so 4 GiB of VRAM are reserved. In this case, the stutter is not
> observable, even with triple buffering enabled.
The amount of memory you see as "VRAM" is not the total the GPU can use; it's a very small amount reserved for the GPU. Using more than that should never have a noticeable performance impact - that "gaming" option is mostly a workaround for buggy games that assume they always run on a dedicated GPU and don't launch unless they have x amount of "VRAM".
Most likely the "gaming" option changes more than the amount of reserved memory.