Bug 461824 - Fullscreen Vulkan issues when game is running on dedicated GPU on multi GPU setup
Status: RESOLVED UPSTREAM
Alias: None
Product: kwin
Classification: Plasma
Component: wayland-generic
Version: 5.26.3
Platform: Arch Linux Linux
Importance: NOR normal
Target Milestone: ---
Assignee: KWin default assignee
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2022-11-14 16:18 UTC by Minik
Modified: 2023-07-14 09:05 UTC

See Also:
Latest Commit:
Version Fixed In:


Attachments
Diagnostic_logs (48.04 KB, application/gzip)
2022-11-15 15:42 UTC, Minik
drm info with vulkan fullscreen on external monitor (145.73 KB, application/octet-stream)
2022-12-07 23:04 UTC, raffarti

Description Minik 2022-11-14 16:18:02 UTC
SUMMARY
Several problems occur when two GPUs are used together and a game runs in fullscreen mode on an external screen (plasmashell always reports running on the integrated GPU (iGPU), while games / apps run on the dedicated GPU (dGPU) for maximum performance).
My laptop has 2 video output ports, one connected to the iGPU and the second to the dGPU.
If the external screen is connected to the iGPU, fullscreen Vulkan games running on the dGPU show just a black screen, with flashes of gameplay when changing the volume (when the Plasma volume bar shows up).
If the external screen is connected to the dGPU, these games work fine, but if the volume bar shows up (or something else from the shell appears on top of the window) the game freezes completely.

The problems occur only for Vulkan apps / games, only on an external monitor and only in fullscreen mode (using KWin window rules to disable fullscreen, manually set the size to fill the screen and keep the window on top mitigates the issue).

Tested only on Wayland, because something is bugged and the whole system runs at 1 FPS on X11 (an unrelated issue).

STEPS TO REPRODUCE
1. Connect external monitor to a laptop using iGPU + dGPU
2. Run Vulkan game in fullscreen mode on dGPU
3. Change system volume and let the volume bar disappear

OBSERVED RESULT
Game freezes or black screen

EXPECTED RESULT
Game working fine

SOFTWARE/OS VERSIONS
Linux: EndeavourOS 6.0.5-arch1-g14-1
KDE Plasma Version: 5.26.3
KDE Frameworks Version: 5.99.0
Qt Version: 5.15.7
Graphics Platform: Wayland

ADDITIONAL INFORMATION
My laptop is Asus G14 RK402RJ (iGPU: Ryzen 7 6800HS + dGPU: AMD RX 6700S)
Comment 1 Zamundaaa 2022-11-14 22:53:37 UTC
That sounds like the problem only happens with cross-gpu direct scanout. To verify, can you put
KWIN_DRM_NO_DIRECT_SCANOUT=1
into /etc/environment, reboot and check if the problem still happens?
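
A minimal sketch of applying and later reverting such a setting without leaving duplicate lines behind. The helper names `set_env_var` and `unset_env_var` are hypothetical, and the file path is passed explicitly so the functions can be tried on a scratch copy first; changing the real /etc/environment needs root.

```shell
#!/bin/sh
# Hypothetical helpers, not part of KWin or any distribution.

# unset_env_var FILE NAME
# Remove every "NAME=..." line from FILE (no-op if FILE does not exist).
unset_env_var() {
    file=$1 name=$2
    [ -f "$file" ] || return 0
    tmp=$(mktemp) || return 1
    # grep -v keeps all lines that are NOT an assignment of NAME.
    grep -v "^${name}=" "$file" > "$tmp"
    # Write back through cat so FILE keeps its owner and permissions.
    cat "$tmp" > "$file"
    rm -f "$tmp"
}

# set_env_var FILE NAME VALUE
# Ensure FILE ends up with exactly one "NAME=VALUE" line.
set_env_var() {
    unset_env_var "$1" "$2" || return 1
    printf '%s=%s\n' "$2" "$3" >> "$1"
}
```

From a root shell, `set_env_var /etc/environment KWIN_DRM_NO_DIRECT_SCANOUT 1` followed by a reboot would apply the setting, and `unset_env_var /etc/environment KWIN_DRM_NO_DIRECT_SCANOUT` would revert it.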
Comment 2 Minik 2022-11-15 07:08:12 UTC
The flag KWIN_DRM_NO_DIRECT_SCANOUT=1 worked for both cases: games no longer freeze on the dGPU connection and are displayed even on the iGPU connection.
Comment 3 Zamundaaa 2022-11-15 14:23:16 UTC
Thanks, that should narrow down the possible causes. With that environment variable removed again and
QT_LOGGING_RULES="kwin_wayland_*.debug=true"
set instead, please cause the problem again and afterwards upload the outputs of
> journalctl --user-unit plasma-kwin_wayland --boot 0
and
> sudo dmesg

If you can cause the issue on one screen while being able to still use the other screen, the output of drm_info (https://gitlab.freedesktop.org/emersion/drm_info) once when you just show the desktop and once when the fullscreen image is stuck could be useful as well.
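
The collection steps above could be wrapped roughly as follows. This is only a sketch: `capture` and `collect_kwin_logs` are hypothetical names, the output directory and file names are arbitrary, and it assumes the QT_LOGGING_RULES setting is already active and the problem has just been reproduced.

```shell
#!/bin/sh
# Hypothetical helper: run a command and save its stdout and stderr
# to OUTDIR/NAME.txt.
capture() {
    outdir=$1 name=$2
    shift 2
    mkdir -p "$outdir" || return 1
    "$@" > "$outdir/$name.txt" 2>&1
}

# collect_kwin_logs [OUTDIR] [LABEL]
# Gather the requested outputs into one directory. LABEL tells apart
# the drm_info capture taken on the desktop from the one taken while
# the fullscreen image is stuck.
collect_kwin_logs() {
    dir=${1:-kwin-logs}
    label=${2:-desktop}
    capture "$dir" journal journalctl --user-unit plasma-kwin_wayland --boot 0
    capture "$dir" dmesg sudo dmesg
    # drm_info is a separate tool and may not be installed.
    if command -v drm_info >/dev/null 2>&1; then
        capture "$dir" "drm_info-$label" drm_info
    fi
}
```

Running `collect_kwin_logs kwin-logs desktop` once on the plain desktop and `collect_kwin_logs kwin-logs stuck` from the other screen while the image is frozen would leave both drm_info captures side by side, ready to be compressed and attached.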
Comment 4 Minik 2022-11-15 15:42:12 UTC
Created attachment 153765
Diagnostic_logs

Here are my diagnostic logs (dmesg, journalctl, drm_info for desktop and fullscreen) compressed in .tar.gz
Comment 5 Bug Janitor Service 2022-11-30 05:16:25 UTC
Dear Bug Submitter,

This bug has been in NEEDSINFO status with no change for at least
15 days. Please provide the requested information as soon as
possible and set the bug status as REPORTED. Due to regular bug
tracker maintenance, if the bug is still in NEEDSINFO status with
no change in 30 days the bug will be closed as RESOLVED > WORKSFORME
due to lack of needed information.

For more information about our bug triaging procedures please read the
wiki located here:
https://community.kde.org/Guidelines_and_HOWTOs/Bug_triaging

If you have already provided the requested information, please
mark the bug as REPORTED so that the KDE team knows that the bug is
ready to be confirmed.

Thank you for helping us make KDE software even better for everyone!
Comment 6 raffarti 2022-12-07 22:35:07 UTC
Hello, I believe my system is affected by the same bug:
- happens when the iGPU is the main gpu and dGPU is secondary (display is attached to the secondary GPU)
- does not happen when the iGPU is disabled
- game freezes when direct rendering is used after being indirect (alt-tab, notifications, etc)
- the game renders correctly afterwards if rendering is kept indirect (e.g. by using the kwin FPS effect)

Tested with:
iGPU + dGPU (bugged)
iGPU + usb-to-hdmi (bugged)
dGPU + usb-to-hdmi (working)

As the latter is working, perhaps a Mesa bug?

While bugged, journalctl prints lots of
```
kwin_wayland_drm: Atomic commit failed! Invalid argument
kwin_wayland_drm: Presentation failed! Invalid argument
```
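
To put a number on "lots of", the journal can be piped through a small filter. A minimal sketch, where `count_drm_failures` is a hypothetical name and the two patterns are taken verbatim from the messages above:

```shell
#!/bin/sh
# Count lines on stdin that contain either of the two KWin DRM
# failure messages. grep -c prints the count even when it is 0.
count_drm_failures() {
    grep -c -e 'Atomic commit failed!' -e 'Presentation failed!'
}
```

For example: `journalctl --user-unit plasma-kwin_wayland --boot 0 | count_drm_failures`.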

System:
5800H+6600M
Comment 7 raffarti 2022-12-07 22:37:51 UTC
> Tested with:
> iGPU + dGPU (bugged)
> iGPU + usb-to-hdmi (bugged)
> dGPU + usb-to-hdmi (working)
> 
> as the latter is working, perhaps mesa bug?

Nvm, actually it's just that 3rd case is the other way around, the usb-to-hdmi adapter doesn't do the 3D acceleration XD
Comment 8 raffarti 2022-12-07 22:59:48 UTC
dmesg is flooded by
`[drm:dm_plane_helper_prepare_fb [amdgpu]] *ERROR* Failed to pin framebuffer with error -22`
nothing else to see there...
Comment 9 raffarti 2022-12-07 23:04:37 UTC
Created attachment 154409
drm info with vulkan fullscreen on external monitor
Comment 10 Zamundaaa 2022-12-08 13:34:30 UTC
(In reply to Minik from comment #4)
> Created attachment 153765 [details]
> Diagnostic_logs
> 
> Here are my diagnostic logs (dmesg, journalctl, drm_info for desktop and
> fullscreen) compressed in .tar.gz

I see no relevant warnings in dmesg or journalctl. From the drm_info contents though I can see that you do get direct scanout with fullscreen, and the game provides a XRGB8888 buffer, while KWin uses the ARGB2101010 format when it does compositing.
It's possible that switching between these formats triggers some bug in the kernel. To check if it's relevant you can put KWIN_DRM_PREFER_COLOR_DEPTH=24 into /etc/environment, reboot, and see if the problem can still be caused.
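
When comparing the two drm_info captures, buffer formats may also show up as numeric DRM fourcc codes. A small sketch for decoding them, assuming the code is given as a hex or decimal number; `fourcc_to_text` is a hypothetical helper name. In the kernel's drm_fourcc.h, XRGB8888 is the code "XR24" and ARGB2101010 is "AR30".

```shell
#!/bin/sh
# fourcc_to_text CODE
# Decode a 32-bit DRM fourcc value into its four ASCII characters.
# The first character is stored in the lowest byte.
fourcc_to_text() {
    code=$(( $1 ))
    out=
    for s in 0 8 16 24; do
        byte=$(( (code >> s) & 0xff ))
        # Turn the byte into an octal escape and let printf emit it.
        out="$out$(printf "\\$(printf '%03o' "$byte")")"
    done
    printf '%s\n' "$out"
}
```

For instance, `fourcc_to_text 0x34325258` gives the code for XRGB8888 and `fourcc_to_text 0x30335241` the one for ARGB2101010.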

No matter if that works or not, it might be useful to get more detailed logging for the broken setup from the kernel, which you can do by following the instructions at https://invent.kde.org/plasma/kwin/-/wikis/Debugging-DRM-issues. Keep in mind that it's very verbose, so the time you record dmesg should be as short as possible.

(In reply to raffarti from comment #8)
> dmesg is flooded by
> `[drm:dm_plane_helper_prepare_fb [amdgpu]] *ERROR* Failed to pin framebuffer
> with error -22`
> nothing else to see there...
Sounds exactly like https://gitlab.freedesktop.org/drm/amd/-/issues/2075. Please add a comment to that, to show you're affected as well, and to hopefully make someone at AMD attempt to fix the problem. Putting KWIN_DRM_NO_DIRECT_SCANOUT=1 into /etc/environment should be a usable workaround for you.
Comment 11 raffarti 2022-12-08 21:13:45 UTC
> (In reply to raffarti from comment #8)
> > dmesg is flooded by
> > `[drm:dm_plane_helper_prepare_fb [amdgpu]] *ERROR* Failed to pin framebuffer
> > with error -22`
> > nothing else to see there...
> Sounds exactly like https://gitlab.freedesktop.org/drm/amd/-/issues/2075.
> Please add a comment to that, to show you're affected as well, and to
> hopefully make someone at AMD attempt to fix the problem. Putting
> KWIN_DRM_NO_DIRECT_SCANOUT=1 into /etc/environment should be a usable
> workaround for you.

Done so, thanks.
Comment 12 raffarti 2023-07-13 14:15:39 UTC
What fixes this?
Comment 13 Zamundaaa 2023-07-14 08:57:10 UTC
"Resolved upstream" doesn't mean that it's fixed, it means that this is a problem outside of our software. In this case the bug is in the GPU driver, so there's nothing that can be done about it in KDE
Comment 14 raffarti 2023-07-14 09:05:09 UTC
(In reply to Zamundaaa from comment #13)
> "Resolved upstream" doesn't mean that it's fixed, it means that this is a
> problem outside of our software. In this case the bug is in the GPU driver,
> so there's nothing that can be done about it in KDE

Oh I see. I thought it meant the external problem was fixed. Thanks.