Bug 464669 - Performance issue with Vulkan applications on Wayland with Intel ANV
Status: RESOLVED UPSTREAM
Alias: None
Product: kwin
Classification: Plasma
Component: wayland-generic
Version: 5.26.5
Platform: Arch Linux Linux
Importance: NOR normal
Target Milestone: ---
Assignee: KWin default assignee
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2023-01-22 22:43 UTC by tobi291019
Modified: 2023-01-23 12:05 UTC

See Also:
Latest Commit:
Version Fixed In:


Attachments
Hacked together patch that fixes the issue on my end (3.76 KB, patch)
2023-01-22 22:43 UTC, tobi291019

Description tobi291019 2023-01-22 22:43:48 UTC
Created attachment 155519
Hacked together patch that fixes the issue on my end

SUMMARY
Starting with Mesa 22.3, there is a performance issue when running Vulkan applications on KWin on Intel systems with an integrated GPU.
Annoyingly, a separate bug (https://gitlab.freedesktop.org/mesa/mesa/-/issues/7019) caused a similar performance regression from 22.0 until it was fixed in 22.3, so only a handful of commits during 22.3 development actually show the expected performance.
I have bisected this on the Mesa side to https://gitlab.freedesktop.org/mesa/mesa/-/commit/db42ed1e04cc7c9b92fb22cc2eef7f62e73aabba.
This commit introduces a check to verify that the compositor is using the same GPU as the client, using zwp_linux_dmabuf_feedback_v1::main_device.
On my system KWin reports the major/minor numbers for /dev/dri/card0 (226, 0), whilst Mesa expects the ones for /dev/dri/renderD128 (226, 128).

Now the fun part is that the Wayland protocol documentation has conflicting information about which device the compositor should report: https://gitlab.freedesktop.org/wayland/wayland-protocols/-/blob/main/unstable/linux-dmabuf/linux-dmabuf-unstable-v1.xml#L484-487 claims this is unspecified, while https://gitlab.freedesktop.org/wayland/wayland-protocols/-/blob/main/unstable/linux-dmabuf/feedback.rst#for-compositors claims it should be set to the rendering device.
So either Mesa should not be comparing the dev_t the way it does or KWin is reporting the wrong device.
Both Sway (1.8) and Weston (11.0.0) do in fact set the main_device to /dev/dri/renderD128, unlike KWin, and performance is fine with them.

I have attached a quick, hacked-together attempt at fixing this issue in KWin by reporting the render device node instead. It does fix the performance issue and so far has had no ill effects.

If people think this should rather be fixed/changed on the Mesa side (as per the xml protocol documentation), I can also open up an issue over there instead.

STEPS TO REPRODUCE
1. Run a Vulkan application, e.g. vkmark.
2. Compare performance on KWin to another compositor, e.g. Weston.

OBSERVED RESULT
vkmark runs considerably faster on Weston than on KWin (vkmark score: 5378 on Weston vs. 1382 on KWin).

EXPECTED RESULT
KWin and Weston should have comparable performance.

SOFTWARE/OS VERSIONS
KDE Plasma Version: 5.26.5
KDE Frameworks Version: 5.102.0
Qt Version: 5.15.8
Mesa: 22.3.3

ADDITIONAL INFORMATION
CPU: Intel(R) Core(TM) i7-7500U CPU @ 2.70GHz
GPU: HD Graphics 620 (KBL GT2)
Comment 1 Zamundaaa 2023-01-23 12:05:56 UTC
The protocol text says
> In general the device is a DRM node. The DRM node type (primary vs.
> render) is unspecified. Clients must not rely on the compositor sending
> a particular node type. Clients cannot check two devices for equality
> by comparing the dev_t value
and the feedback.rst says
> Because two DRM nodes can refer to the same DRM device while having different dev_t values, clients should use drmDevicesEqual to compare two devices

So Mesa is doing it wrong and needs to be fixed.
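
For illustration, a rough sketch of a comparison that does not depend on the node type. The documented way is libdrm's drmDevicesEqual; the code below is not that, and not what Mesa or libdrm actually do. It is a self-contained stand-in that assumes Linux sysfs, where /sys/dev/char/<major>:<minor>/device is expected to resolve to the same parent device for both the primary and the render node of one GPU. The function names are hypothetical.

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <limits.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/sysmacros.h>
#include <sys/types.h>

/* Builds "/sys/dev/char/<major>:<minor>/device" for a character device.
 * Returns 0 on success, -1 if buf is too small. */
int sysfs_device_link(dev_t dev, char *buf, size_t size)
{
    assert(buf != NULL);
    int n = snprintf(buf, size, "/sys/dev/char/%u:%u/device",
                     (unsigned int)major(dev), (unsigned int)minor(dev));
    return (n > 0 && (size_t)n < size) ? 0 : -1;
}

/* Compares two dev_t values by the device they belong to rather than by
 * their raw numbers. Returns 1 if both resolve to the same sysfs device,
 * 0 if they resolve to different ones, -1 if either cannot be resolved. */
int same_underlying_device(dev_t a, dev_t b)
{
    char link_a[64], link_b[64];
    char real_a[PATH_MAX], real_b[PATH_MAX];

    if (sysfs_device_link(a, link_a, sizeof link_a) != 0 ||
        sysfs_device_link(b, link_b, sizeof link_b) != 0)
        return -1;

    /* realpath() follows the "device" symlink to the parent device. */
    if (realpath(link_a, real_a) == NULL || realpath(link_b, real_b) == NULL)
        return -1;

    return strcmp(real_a, real_b) == 0;
}
```

On the reporter's system, same_underlying_device(makedev(226, 0), makedev(226, 128)) would be expected to return 1 if both nodes belong to the same GPU. A real client should prefer drmDevicesEqual from libdrm, as feedback.rst says.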