SUMMARY
I noticed that Spectacle from kdesrc-build starts significantly slower than 23.08.3 from the Arch repos. Looking at the terminal output and system logs, the difference appears to be that Spectacle now turns on the discrete GPU of my laptop on startup, which takes a couple of seconds if it was inactive previously. If I make sure that the discrete GPU is already on, then Spectacle starts about as fast as it did in 23.08.3.

In addition to slowing down startup, this uses more power as well: not only is the discrete GPU woken up unnecessarily if all I wanted was to take a screenshot, but if I pick H.264 for the hardware encoding support, the discrete GPU also ends up getting used instead of the integrated one.

STEPS TO REPRODUCE
1. Have a laptop with switchable graphics (possibly with the above mentioned quirk?)
2. Start Spectacle

OBSERVED RESULT
It starts slower than with 23.08.3, unless I turn on the discrete GPU first by running glxgears or whatever on it. Screen recording with the H.264 codec also uses the discrete GPU instead of the integrated one.

EXPECTED RESULT
Spectacle starts about as fast as it did with 23.08.3, and uses the integrated GPU for encoding if H.264 is selected.

SOFTWARE/OS VERSIONS
Operating System: Arch Linux
KDE Plasma Version: 5.81.0
KDE Frameworks Version: 5.245.0
Qt Version: 6.6.0
Kernel Version: 6.6.1-arch1-1 (64-bit)
Graphics Platform: Wayland
Processors: 16 × AMD Ryzen 7 6800H with Radeon Graphics
Memory: 30.6 GiB of RAM
Graphics Processor: AMD Radeon 680M + AMD Radeon RX 6650M
Laptop Model: HP Omen 16-n0000 (n0067AX)

ADDITIONAL INFORMATION
My system is a bit odd in that in `lspci` the discrete GPU comes before the integrated one (the former is 03:00.0, while the latter is 09:00.0), and /dev/dri/card0 and /dev/dri/renderD128 also both point to the discrete GPU rather than the integrated one. Just about everything else appears to know which one the integrated GPU is and uses that by default, though (including vainfo).
By "above mentioned quirk" in the reproduce steps I meant the thing about the discrete GPU coming before the integrated one in the additional information section, oops.
Yeah, I've seen this myself. The problem is probably coming from KPipeWire, which is used to do the recording, determine which codecs are supported, and pick which encoder to use (it automatically uses hardware encoding if it's available for the selected format). Perhaps there's a faster way to determine which codecs are available than whatever KPipeWire is doing, or perhaps we need to find some way to delay the loading of KPipeWire's PipeWireRecord in Spectacle's VideoPlatformWayland so that it's only created when users actually want to record (see the sketch below). Or maybe the problem is neither of those, and PipeWireRecord is doing something on creation that it doesn't need to do?
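For illustration, deferred creation could look roughly like this. This is a minimal sketch, not Spectacle's actual code: the class shape, the include path, and the premise that PipeWireRecord's constructor is where the expensive encoder probing happens are all assumptions here.

```cpp
// Sketch: only construct PipeWireRecord when a recording actually starts,
// instead of when VideoPlatformWayland is created at startup.
// (Hypothetical class shape; include path and setters assumed.)
#include <PipeWireRecord> // include path assumed
#include <memory>

class VideoPlatformWayland
{
public:
    void startRecording(unsigned int nodeId)
    {
        // If constructing PipeWireRecord is what probes the encoders (and
        // wakes the discrete GPU), this defers that cost to first use.
        PipeWireRecord *rec = record();
        rec->setNodeId(nodeId);
        rec->setActive(true);
    }

private:
    PipeWireRecord *record()
    {
        if (!m_record) {
            m_record = std::make_unique<PipeWireRecord>();
        }
        return m_record.get();
    }

    std::unique_ptr<PipeWireRecord> m_record;
};
```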
For what it's worth, vainfo's --display command line option has 3 possible values: wayland, x11 and drm. The wayland (default) and x11 options both return the integrated GPU, but the drm option returns the discrete GPU (and also has that same delay where the discrete GPU needs to wake up first).
(In reply to Prajna Sariputra from comment #3)
> For what it's worth, vainfo's --display command line option has 3 possible
> values: wayland, x11 and drm. The wayland (default) and x11 options both
> return the integrated GPU, but the drm option returns the discrete GPU (and
> also has that same delay where the discrete GPU needs to wake up first).

Interesting, and it might have problematic implications for startup optimization or ease of use. FWIW, I only have an integrated AMD GPU on my laptop, but I still definitely noticed that Spectacle's startup has gotten slower, so there might be more going on. It would be unfortunate if there was no fast way to check for hardware encoding support on startup.
To clarify, that delay for vainfo only happens with --display drm; with the x11 and wayland display options it's pretty much instant.
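For reference, what `vainfo --display drm` effectively does is open a render node directly and initialize VA-API on it, which is presumably why it powers the device up. Below is a hand-written sketch of that path (not vainfo's actual source; the hard-coded renderD128 path is just what the dGPU happens to be on this machine):

```cpp
// Sketch of the --display drm path: open a DRM render node directly and
// initialize VA-API on it. vaInitialize() is presumably the call that wakes
// the device, which would explain the delay when renderD128 is a sleeping
// dGPU. The wayland/x11 paths instead obtain a device via the display server.
#include <fcntl.h>
#include <unistd.h>
#include <va/va.h>
#include <va/va_drm.h>
#include <cstdio>

int main()
{
    // On this laptop renderD128 is the discrete GPU (see the bug description).
    int fd = open("/dev/dri/renderD128", O_RDWR);
    if (fd < 0) {
        return 1;
    }
    VADisplay dpy = vaGetDisplayDRM(fd);
    int major = 0, minor = 0;
    if (vaInitialize(dpy, &major, &minor) == VA_STATUS_SUCCESS) {
        std::printf("VA-API %d.%d: %s\n", major, minor, vaQueryVendorString(dpy));
        vaTerminate(dpy);
    }
    close(fd);
    return 0;
}
```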
Something I forgot to mention here is that with the H.264 format selected the video output is also garbled as a result. I had just assumed that this is because using the discrete/non-primary GPU for encoding has some differences compared to using the integrated/primary GPU, and it's also not a big deal for Spectacle since it defaults to VP9 encoding (which no AMD or NVIDIA GPU supports in hardware anyway). However, I gave krdp a try recently and the output there is broken too, and unlike Spectacle it only supports H.264 due to the RDP protocol, so it's a bigger problem there since it means I can't use it at all.

So, am I correct in the assumption that the best way to make sure the video output works is to not pick the secondary GPU, or should I file a separate bug report for the garbled video output due to the multi-GPU setup? For what it's worth, it's definitely not impossible to use the discrete/secondary GPU for hardware VAAPI encoding; OBS can do it, for example.
Okay, I just found the bit of code that decides which device to use for hardware encoding (https://invent.kde.org/plasma/kpipewire/-/blob/master/src/vaapiutils.cpp?ref_type=heads#L34): KPipeWire gets the list of DRM devices and picks the first render device in the list that says it supports H.264 VAAPI, which in my case is /dev/dri/renderD128, the discrete/secondary GPU.

So, I tried adding a hack to make the code pick the second device it finds, which is /dev/dri/renderD129, the integrated GPU, and recompiled KPipeWire (do I also need to recompile Spectacle or anything else?). With that, Spectacle starts noticeably quicker, and recording also starts as soon as I click the button, even when VP9 is selected as the format, whereas before the recording only started about 3 seconds after clicking.

However, it turns out that even when using the integrated GPU for encoding the video output is still corrupted, just in a different way: with the discrete GPU I can make out nothing of the source at all, but with the integrated GPU I can see bits of it. I'll file a new bug report for the corruption with more details.
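For anyone following along, the enumeration that produces that device order looks roughly like the following. This is a stand-alone sketch using libdrm, similar in spirit to the linked vaapiutils.cpp code but not a copy of it; the actual KPipeWire code additionally checks each device for H.264 VAAPI support via libva, which is omitted here.

```cpp
// Sketch: list DRM render nodes in the order libdrm reports them.
// Build with: g++ sketch.cpp $(pkg-config --cflags --libs libdrm)
#include <xf86drm.h>
#include <cstdio>
#include <string>
#include <vector>

std::vector<std::string> renderNodes()
{
    std::vector<std::string> paths;
    drmDevicePtr devices[32] = {};
    // drmGetDevices2 fills the array with all DRM devices on the system.
    int count = drmGetDevices2(0, devices, 32);
    if (count < 0) {
        return paths;
    }
    for (int i = 0; i < count; ++i) {
        // Only devices exposing a render node (/dev/dri/renderD*) are usable
        // for headless VAAPI encoding.
        if (devices[i]->available_nodes & (1 << DRM_NODE_RENDER)) {
            paths.emplace_back(devices[i]->nodes[DRM_NODE_RENDER]);
        }
    }
    drmFreeDevices(devices, count);
    return paths;
}

int main()
{
    // On the reporter's laptop this prints renderD128 (discrete) before
    // renderD129 (integrated), which is why "pick the first device that
    // supports H.264" lands on the discrete GPU.
    for (const auto &path : renderNodes()) {
        std::printf("%s\n", path.c_str());
    }
}
```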
Very interesting. Thank you for your work.
Can confirm that the issue still exists in Plasma 6.0.3. I'm using a multi-GPU setup (AMD + AMD) on Arch.
A possibly relevant merge request was started @ https://invent.kde.org/graphics/spectacle/-/merge_requests/357
Git commit b1686ee96cd62f78dc4cf6cf92d6580376b51250 by Noah Davis.
Committed on 26/04/2024 at 14:18.
Pushed by ndavis into branch 'master'.

VideoPlatformWayland: Get PipeWireRecord asynchronously

Speeds up startup by 60-80ms

M  +4    -2    src/Platforms/VideoPlatform.h
M  +17   -7    src/Platforms/VideoPlatformWayland.cpp
M  +2    -0    src/Platforms/VideoPlatformWayland.h

https://invent.kde.org/graphics/spectacle/-/commit/b1686ee96cd62f78dc4cf6cf92d6580376b51250
Git commit 445fc422dae9d1e6c702098b74ace66a5e226faf by Noah Davis.
Committed on 29/04/2024 at 15:57.
Pushed by ndavis into branch 'release/24.05'.

VideoPlatformWayland: Get PipeWireRecord asynchronously

Speeds up startup by 60-80ms
(cherry picked from commit b1686ee96cd62f78dc4cf6cf92d6580376b51250)

M  +4    -2    src/Platforms/VideoPlatform.h
M  +17   -7    src/Platforms/VideoPlatformWayland.cpp
M  +2    -0    src/Platforms/VideoPlatformWayland.h

https://invent.kde.org/graphics/spectacle/-/commit/445fc422dae9d1e6c702098b74ace66a5e226faf
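To illustrate the idea in the commits above: construct the expensive object on a worker thread so startup doesn't block on encoder probing. This is only a rough sketch of one way to "get it asynchronously", not the actual patch; the RecordHolder class and the include path are made up here, and the real change lives in VideoPlatformWayland.cpp.

```cpp
// Sketch: build PipeWireRecord off the GUI thread, then hand it back.
#include <PipeWireRecord> // include path assumed
#include <QCoreApplication>
#include <QFutureWatcher>
#include <QtConcurrent>

class RecordHolder : public QObject
{
public:
    explicit RecordHolder(QObject *parent = nullptr)
        : QObject(parent)
    {
        auto watcher = new QFutureWatcher<PipeWireRecord *>(this);
        connect(watcher, &QFutureWatcherBase::finished, this, [this, watcher] {
            m_record = watcher->result(); // safe to use from the GUI thread now
            watcher->deleteLater();
        });
        watcher->setFuture(QtConcurrent::run([] {
            // If the constructor is where codec/encoder probing happens, this
            // keeps it off the GUI thread; then push the object back to the
            // main thread (moveToThread must be called from the owning thread).
            auto record = new PipeWireRecord;
            record->moveToThread(QCoreApplication::instance()->thread());
            return record;
        }));
    }

private:
    PipeWireRecord *m_record = nullptr;
};
```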
Just retested Spectacle with an unmodified KPipeWire, so it goes back to using my discrete GPU, and to me it looks like Spectacle now starts about as fast as it does when I modify KPipeWire to use my integrated GPU, so the startup time problem is definitely fixed! Spectacle does still end up turning on the discrete GPU on startup and when recording, but at that point the issue is in KPipeWire rather than Spectacle. So, should this bug be moved over to KPipeWire, or should a new bug be made specifically for the wrong GPU selection issue?
Let's move it to KPipeWire
*** Bug 499373 has been marked as a duplicate of this bug. ***
I'm now on KDE Plasma 6.3.3 (including KPipeWire and Spectacle), and I just noticed that both Spectacle and KRdp now correctly use my integrated GPU for encoding without bothering the discrete GPU. I can tell because running `lspci` or `nvtop` in a terminal soon after starting Spectacle now shows a delay (meaning those commands have to wait for the dGPU to wake up, i.e. it was still asleep), whereas before there was no delay (meaning Spectacle had already powered on the dGPU). Plus, `nvtop` shows the iGPU's encoder being used while Spectacle is recording the screen or KRdp has an active connection.

However, when looking into KPipeWire's code and commit history I didn't see any change that looks like it would be the fix for this bug, at least as far as my limited understanding goes. In particular, the device selection code (https://invent.kde.org/plasma/kpipewire/-/blob/master/src/vaapiutils.cpp?ref_type=heads#L34) didn't change and is still just picking the first device to show up. The order the GPUs appear in the rest of the OS doesn't seem to have changed either: `vainfo --display drm` still reports the dGPU (unlike plain `vainfo`, which reports the iGPU like before), and /dev/dri/renderD128 is still the dGPU. So, I'm not certain whether this is really fixed or whether it's just a side effect of something else in the system.

For what it's worth, the KDE Frameworks version is 6.12.0 and the Linux kernel version is 6.13.7, on Arch Linux.
Can confirm that the issue is resolved for me as well; my setup is nearly identical to comment #16, except it's on CachyOS.
There could have been an improvement in your drivers or libva. If that's the case, we could mark this as RESOLVED UPSTREAM.
@Valeri Is your discrete GPU from AMD as well? If so, then it does look like either the drivers or libva got an improvement, although if it is the drivers specifically, rather than libva or another hardware-agnostic component, then NVIDIA users like the one from bug 499373 might still be affected.

But then again, it's not like NVIDIA even has official VAAPI support (the user in said bug report had to install an unofficial wrapper to get VAAPI support on their dGPU), so by default only the iGPU would support VAAPI, meaning KPipeWire won't have a choice. And if there is no iGPU, then either there would be no VAAPI devices at all, or there would be only one if the wrapper is present, and again KPipeWire won't have a choice. Plus, software like FFmpeg and OBS Studio tends to support NVIDIA's proprietary NVENC API, so anyone with an iGPU + NVIDIA dGPU setup who wishes to use the dGPU for encoding may not need the unofficial VAAPI wrapper anyway.
(In reply to Prajna Sariputra from comment #19)
> @Valeri Is your discrete GPU from AMD as well? If so, then it does look like
> either the drivers or libva got an improvement, although if it is the
> drivers specifically, rather than libva or another hardware-agnostic
> component, then NVIDIA users like the one from bug 499373 might still be
> affected.

I am on an Nvidia dGPU and am, in fact, the person from that duplicate bug report :). I've reinstalled the nvidia-vaapi-driver wrapper as part of the test and can confirm that it works and KPipeWire leaves it alone, just as expected.
Oh, silly me. Well then, I think we can safely say that this has been resolved upstream and isn't driver-specific either.