Created attachment 159935
photo of the distorted screen

SUMMARY
With the default configuration running on Wayland, the entire screen's image on my external display is distorted in a way that makes that screen totally unusable and has an appearance of diagonal stripes from the top right to the bottom left. Looking closely at the distorted image and moving the mouse cursor around on that screen, it seems that each subsequent row of pixels is effectively being shifted to the left by some number of pixels. That is to say, perhaps all of the screen's pixels are being rendered, but the rows are all so misaligned as to be incoherent. Screenshotting with Spectacle doesn't capture the distortion. A photo of the distorted screen is attached.

This system has NVIDIA Optimus hybrid graphics with Intel UHD Graphics 630 as the integrated GPU (DRI card0) and NVIDIA GeForce GTX 1070 Mobile as the discrete GPU (DRI card1). The internal display is wired via Embedded DisplayPort to the iGPU; there are Mini DisplayPort and HDMI ports that are both wired to the dGPU.

The internal display (which renders as expected) is an AU Optronics panel connected via eDP and has a native resolution of 3840×2160. The external display (affected by this issue) is an ASUS ROG PG348Q connected via mDP and has a native resolution of 3440×1440.

The issue manifests identically under all these conditions:
• the display is connected via mDP, HDMI, or an mDP-to-HDMI adapter
• the display refresh rate is set to any value between 50 and 100 Hz via KScreen
• the two screens are rearranged in any manner via KScreen
• the scaling factor for either screen is adjusted to any value via KScreen
• either screen is chosen as the primary display via KScreen
• the EDID for both screens is retrieved via Windows 11 and applied during initramfs startup via the drm.edid_firmware kernel parameter

The issue goes away if I set the external display's resolution to anything below 3440×1440 via KScreen. Strangely, KScreen only offers 1024×768, 800×600, and 640×480 (the same resolutions as are listed by /sys/class/drm/card1-DP-2/modes), whereas in actuality the display supports many other resolutions, as one would expect, and which xrandr sees just fine. Even more oddly, KScreen offers all the expected resolutions for my internal display, even though /sys/class/drm/card0-eDP-1/modes only lists 3840×2160.

Setting KWIN_DRM_DEVICES=/dev/dri/card1:/dev/dri/card0 resolves the issue completely. That is, I have to configure KWin to prioritize my dGPU (NVIDIA) first and my iGPU (Intel) second. I do this by putting the following line in /etc/environment:

    export KWIN_DRM_DEVICES=/dev/dri/card1:/dev/dri/card0

Doing this in ~/.config/plasma-workspace/env/kwin.sh instead also works to fix the Plasma session, but I prefer to put it in /etc/environment so that SDDM also picks it up. I've configured SDDM to run on Wayland using KWin as its compositor; in this configuration, SDDM suffers the same glitch in the absence of the above setting in /etc/environment.

KWin's default behavior seems to be equivalent to KWIN_DRM_DEVICES=/dev/dri/card0:/dev/dri/card1 (iGPU first, dGPU second).

Setting KWIN_DRM_DEVICES=/dev/dri/card0 causes only the internal display to be operational (correctly).

Setting KWIN_DRM_DEVICES=/dev/dri/card1 causes only the external display to be operational (correctly).

The issue is not reproducible using Hyprland or Mutter with their default configurations.

STEPS TO REPRODUCE
1. Launch a Plasma Wayland session with the external display connected.

OBSERVED RESULT
The entire screen on the external display is severely distorted. The screen on the internal display is rendered correctly.

EXPECTED RESULT
Both screens are rendered correctly.

SOFTWARE/OS VERSIONS
Operating System: CachyOS Linux
KDE Plasma Version: 5.27.6
KDE Frameworks Version: 5.107.0
Qt Version: 5.15.10
Kernel Version: 6.3.9-1-cachyos (64-bit)
Graphics Platform: Wayland
Processors: 12 × Intel® Core™ i7-8750H CPU @ 2.20GHz
Memory: 31.2 GiB of RAM
Graphics Processor: NVIDIA GeForce GTX 1070 with Max-Q Design/PCIe/SSE2
Manufacturer: GIGABYTE
Product Name: AERO 15XV8

ADDITIONAL INFORMATION

$ inxi -Gazy
Graphics:
  Device-1: Intel CoffeeLake-H GT2 [UHD Graphics 630] vendor: Gigabyte
    driver: i915 v: kernel arch: Gen-9.5 process: Intel 14nm built: 2016-20
    ports: active: eDP-1 empty: DP-1 bus-ID: 00:02.0 chip-ID: 8086:3e9b
    class-ID: 0300
  Device-2: NVIDIA GP104M [GeForce GTX 1070 Mobile] vendor: Gigabyte
    driver: nvidia v: 535.54.03 alternate: nouveau,nvidia_drm non-free: 530.xx+
    status: current (as of 2023-05) arch: Pascal code: GP10x process: TSMC 16nm
    built: 2016-21 pcie: gen: 1 speed: 2.5 GT/s lanes: 16 link-max: gen: 3
    speed: 8 GT/s ports: active: none off: DP-2 empty: HDMI-A-1
    bus-ID: 01:00.0 chip-ID: 10de:1ba1 class-ID: 0300
  Device-3: Sunplus Innovation HD WebCam driver: uvcvideo type: USB rev: 2.0
    speed: 480 Mb/s lanes: 1 mode: 2.0 bus-ID: 1-9:5 chip-ID: 1bcf:2c6b
    class-ID: 0e02
  Display: wayland server: X.org v: 1.21.1.8 with: Xwayland v: 23.1.2
    compositor: kwin_wayland driver: X: loaded: modesetting,nvidia
    alternate: fbdev,intel,nouveau,nv,vesa dri: iris gpu: i915,nvidia
    d-rect: 4459x2112 display-ID: 0
  Monitor-1: DP-2 pos: top-right res: 2752x1152 size: N/A modes: N/A
  Monitor-2: eDP-1 pos: bottom-l res: 1707x960 size: N/A modes: N/A
  API: OpenGL v: 4.6 Mesa 23.1.3 renderer: Mesa Intel UHD Graphics 630 (CFL GT2)
    direct-render: Yes

$ kscreen-doctor --outputs
Output: 1 eDP-1 enabled connected priority 1 Panel Modes: 0:3840x2160@60*! 1:1600x1200@60 2:1280x1024@60 3:1024x768@60 4:2560x1600@60 5:1920x1200@60 6:1280x800@60 7:3840x2160@60 8:3200x1800@60 9:2880x1620@60 10:2560x1440@60 11:1920x1080@60 12:1600x900@60 13:1368x768@60 14:1280x720@60 Geometry: 0,192 1707x960 Scale: 2.25 Rotation: 1 Overscan: 0 Vrr: incapable RgbRange: Full
Output: 2 DP-2 enabled connected priority 2 DisplayPort Modes: 0:3440x1440@60! 1:3440x1440@100* 2:3440x1440@95 3:3440x1440@90 4:3440x1440@85 5:3440x1440@80 6:3440x1440@50 7:1024x768@60 8:800x600@60 9:640x480@60 Geometry: 1707,0 2752x1152 Scale: 1.25 Rotation: 1 Overscan: 0 Vrr: incapable RgbRange: unknown
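
For anyone who wants to repeat the comparison between the kernel's per-connector mode lists (such as /sys/class/drm/card1-DP-2/modes) and what KScreen offers, the sysfs side can be collected with a short script. This is only a minimal sketch: the function name read_connector_modes and its drm_root parameter are made up for illustration, and it assumes each connector directory contains a plain-text `modes` file with one mode name per line.

```python
from pathlib import Path


def read_connector_modes(drm_root="/sys/class/drm"):
    """Return {connector name: [mode names]} for every connector under drm_root.

    Hypothetical helper: directories without a `modes` file (for example
    the bare card0 / card1 device entries) are skipped.
    """
    result = {}
    for connector in sorted(Path(drm_root).iterdir()):
        modes_file = connector / "modes"
        if not modes_file.is_file():
            continue
        try:
            text = modes_file.read_text()
        except OSError:
            # Unreadable entry: skip it rather than abort the whole listing.
            continue
        result[connector.name] = [ln.strip() for ln in text.splitlines() if ln.strip()]
    return result


if __name__ == "__main__":
    root = Path("/sys/class/drm")
    if root.is_dir():
        for name, modes in read_connector_modes(root).items():
            print(f"{name}: {', '.join(modes) if modes else '(no modes listed)'}")
```

The output can then be compared by eye with the Modes lists printed by kscreen-doctor --outputs.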
Created attachment 159936
drm_info -j
Created attachment 159937
kscreen-console bug
Created attachment 159938
KWin Support Information
Created attachment 159939
dmesg-drm-debug.log
Created attachment 159940
kwin-drm-debug.log
Created attachment 159941
kscreen.log
Created attachment 159942
KScreen output 2627fe60f26afd08b11e318517ade0ae
Created attachment 159943
KScreen output 2f4e4ef6ae9112f2683b157615664340
Created attachment 159944
edid-decode < /sys/class/drm/card1-DP-2/edid
To be clear: this issue appears to be specific to KWin on Wayland. It does not occur with other Wayland compositors (at least, not with Hyprland or Mutter in their default configurations), nor with KWin on X11.
A possibly relevant merge request was started @ https://invent.kde.org/plasma/kwin/-/merge_requests/4219
Git commit e698cafa2737904255d47d1e1710ee857cde9afd by Xaver Hugl.
Committed on 04/07/2023 at 15:33.
Pushed by zamundaaa into branch 'master'.

backends/drm: handle mismatching stride with CPU copying

M +12 -12 src/backends/drm/drm_egl_layer_surface.cpp

https://invent.kde.org/plasma/kwin/-/commit/e698cafa2737904255d47d1e1710ee857cde9afd

Git commit f1f7e2697d1ac4ebe9099006775717e5fd6f5777 by Xaver Hugl.
Committed on 05/07/2023 at 09:11.
Pushed by zamundaaa into branch 'Plasma/5.27'.

backends/drm: handle mismatching stride with CPU copying

M +7 -6 src/backends/drm/drm_buffer_gbm.cpp
M +7 -1 src/backends/drm/drm_buffer_gbm.h
M +12 -8 src/backends/drm/drm_egl_layer_surface.cpp

https://invent.kde.org/plasma/kwin/-/commit/f1f7e2697d1ac4ebe9099006775717e5fd6f5777
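
For readers unfamiliar with the term in the commit title: a buffer's stride is the number of bytes from the start of one pixel row to the start of the next, which can be larger than the visible row when rows are padded. The toy example below is not KWin's code; it is a minimal sketch with made-up helper names (make_padded_rows, reinterpret_rows, copy_rows) that shows how reading a buffer with the wrong stride makes each row drift relative to the previous one, and how a row-by-row CPU copy that honors both strides avoids that. Which direction the rows drift depends on which of the two strides is larger.

```python
def make_padded_rows(row_bytes, height, stride):
    """Build a test buffer: row y holds the value y + 1, padding bytes are 0."""
    if row_bytes > stride:
        raise ValueError("row payload does not fit in the stride")
    buf = bytearray(stride * height)
    for y in range(height):
        start = y * stride
        buf[start:start + row_bytes] = bytes([y + 1]) * row_bytes
    return bytes(buf)


def reinterpret_rows(buf, assumed_stride, row_bytes, height):
    """Slice rows the way a consumer would if it assumed `assumed_stride`."""
    return [
        buf[y * assumed_stride:y * assumed_stride + row_bytes]
        for y in range(height)
    ]


def copy_rows(src, src_stride, dst_stride, row_bytes, height):
    """Copy row by row, honoring the source and destination strides."""
    if row_bytes > src_stride or row_bytes > dst_stride:
        raise ValueError("row payload does not fit in a stride")
    if len(src) < src_stride * (height - 1) + row_bytes:
        raise ValueError("source buffer is too small")
    dst = bytearray(dst_stride * height)
    for y in range(height):
        s = y * src_stride
        d = y * dst_stride
        dst[d:d + row_bytes] = src[s:s + row_bytes]
    return bytes(dst)


if __name__ == "__main__":
    # 4 visible bytes per row, 3 rows; the producer pads each row to 6 bytes.
    src = make_padded_rows(row_bytes=4, height=3, stride=6)

    print("read with wrong stride (4 instead of 6):")
    for row in reinterpret_rows(src, 4, 4, 3):
        print(" ", list(row))

    fixed = copy_rows(src, src_stride=6, dst_stride=4, row_bytes=4, height=3)
    print("after stride-aware copy:")
    for row in reinterpret_rows(fixed, 4, 4, 3):
        print(" ", list(row))
```

In the wrong-stride reading, every row after the first starts partway into the previous row's data, which matches the kind of progressive row offset described in the original report.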
Unfortunately, the issue persists even after updating to Plasma 5.27.7. Unsetting KWIN_DRM_DEVICES brings back the same distortion as before. Are there any other diagnostics I can provide that might help pin this down? I also have the KDE/Plasma 5 sources set up for building. Happy to try out patches without waiting for a release.
Created attachment 160889
log what stride is used

I can't reproduce the problem myself; with 5.27.6 I see the problem with the 1680x1050 resolution, but with 5.27.7 it's gone. Could you please run KWin with the attached patch, and see what strides the buffers have?
That f1f7e2697d1ac4ebe9099006775717e5fd6f5777 commit totally wrecks VRR, especially when using DisplayPort in my case. Selectively reverting it fixes the issue with both HDMI (which was less affected) and DP.

My screen flickers whenever the refresh rate drops below a certain threshold (~78 fps for me with the G-Sync pendulum and MangoHud enforcing VRR). It also flickers seemingly at random during normal use, and occasionally produces a glitched bar at the bottom of the screen showing elements that should be at the top (e.g. my top panel). Just guessing here, but could it be that some timing or buffer handling is off, or that this is data already belonging to the next frame, placed at some odd offset?

Anyhow, I'd recommend reverting f1f7e2697d1ac upstream ASAP: for anyone affected, the GUI becomes practically unusable, and the flickering may cause seizures in people with epilepsy. Some may also think that their hardware is damaged.

Thank you!

Plasma 5.27.7, Gentoo; tried kernels 6.1.45-lts, 6.3.13, and 6.4.10.
Radeon 6800XT, Mesa 23.1.5, libdrm 2.4.115
AOC Q24G2 1440p@165Hz (144Hz on HDMI, which somehow made the flickers happen less often).
(In reply to Stefan Springer from comment #16)

Just a small addition. KWin's VRR setting (Auto/Disabled/Always) didn't matter much, although Disabled reduced the likelihood of the problem occurring the most. What did work around the problem entirely, without reverting the commit, was using the monitor's OSD to completely disable VRR support (it's called "G-Sync compatible" there).
(In reply to Stefan Springer from comment #16)
> My screen will flicker whenever the refresh rate goes below a certain
> threshold (~78fps for me w/ Gsync pendulum and MangoHud enforcing VRR). It
> will also do so seemingly random when operating the computer and
> occasionally produces a glitched bar on the bottom of the screen, showing
> elements that should be on the top (i.e like my top panel).

Did you actually revert the commit and test that, and what's your second GPU? Because if you do have mismatching stride, then your output would've been completely unusable before the commit. And if you don't, then the commit doesn't change anything.

(In reply to Ivan D Vasin from comment #14)
> Unfortunately, the issue persists even after updating to Plasma 5.27.7.
> Unsetting KWIN_DRM_DEVICES brings back the same distortion as before. Are
> there any other diagnostics I can provide that might help pin this down? I
> also have the KDE/Plasma 5 sources set up for building. Happy to try out
> patches without waiting for a release.

Ping on the patch. I can't really do anything without narrowing the problem down.
(In reply to Zamundaaa from comment #18)
> Did you actually revert the commit and test that, and what's your second
> GPU? Because if you do have mismatching stride, then your output would've
> been completely unusable before the commit. And if you don't, then the
> commit doesn't change anything.

Long story short: I can now say with certainty that my specific issue is not KDE related. Sorry for making such a fuss!

After reverting the patch, things seemed fine for two days and a reboot, but then the issue got triggered again out of seemingly nowhere. Following some more digging around, and even trying an old 5.19 LTS kernel, I decided to create a new, blank user account to see if it also exhibits the issue. Surprisingly, it wasn't present on a fresh account with default settings.

I reconstructed the settings of my main account step by step to see when the issue reoccurred. The trigger was touching mclk_od on the 6800XT (and probably also on the RX 470, thinking about the weird behavior on my other PC; for some reason [HBM?] Vega 56 is fine). Even a change of just 5 MHz up or down reliably triggers the problem. I manually played around with the pp_od_clk_voltage sysfs interface to confirm this, and ultimately disabled the part of my OC script that adjusts mclk; now everything works across multiple reboots, logging off and on again, the G-Sync pendulum at any arbitrary refresh rate, etc.

I was using a dual-monitor 144Hz config before. Then, after moving, I used a single monitor at 144Hz over HDMI because I grabbed the wrong cable. Only when I finally got around to getting a DP cable (3 in fact, since it looked like I got two defective ones at first), which allowed me to drive my monitor at 165Hz (still single-monitor use now), was the issue unmasked. That's what originally had me sifting through a month's worth of system updates and commits.

Currently, judging by the AMDGPU issue tracker, there is fundamental restructuring going on in the DC (Display Core) code, with open issues around mclk handling. It actually broke badly in kernel 6.4, such that desktop RDNA2 simply won't increase mclk at all: for some out of the box, for others with specific monitors/resolutions/refresh rates, and for others once touching mclk_od. There are also problems with calculating the required bandwidth for a given monitor config (resolution, refresh rate, multi-display, VRR), so this seems related to that.
Thanks, the fact that you're overclocking your GPU is highly relevant here. :)