SUMMARY
I noticed that Spectacle from kdesrc-build starts significantly slower than 23.08.3 from the Arch repos. Judging by the terminal output and system logs, the difference is that Spectacle now turns on my laptop's discrete GPU on startup, which takes a couple of seconds if the GPU was previously inactive. If I make sure the discrete GPU is already on, Spectacle starts about as fast as it did in 23.08.3.

Besides slowing down startup, this also uses more power: the discrete GPU gets woken up unnecessarily when all I want is a screenshot, and if I pick H.264 for the hardware encoding support, the discrete GPU ends up being used instead of the integrated one.

STEPS TO REPRODUCE
1. Have a laptop with switchable graphics (possibly with the above mentioned quirk?)
2. Start Spectacle

OBSERVED RESULT
It starts slower than with 23.08.3, unless I turn on the discrete GPU first by running glxgears or the like on it. Screen recording with the H.264 codec also uses the discrete GPU instead of the integrated one.

EXPECTED RESULT
Spectacle starts about as fast as it did with 23.08.3, and uses the integrated GPU for encoding if H.264 is selected.

SOFTWARE/OS VERSIONS
Operating System: Arch Linux
KDE Plasma Version: 5.81.0
KDE Frameworks Version: 5.245.0
Qt Version: 6.6.0
Kernel Version: 6.6.1-arch1-1 (64-bit)
Graphics Platform: Wayland
Processors: 16 × AMD Ryzen 7 6800H with Radeon Graphics
Memory: 30.6 GiB of RAM
Graphics Processor: AMD Radeon 680M + AMD Radeon RX 6650M
Laptop Model: HP Omen 16-n0000 (n0067AX)

ADDITIONAL INFORMATION
My system is a bit odd in that in `lspci` the discrete GPU comes before the integrated one (the former is 03:00.0, the latter 09:00.0), and /dev/dri/card0 and /dev/dri/renderD128 also both point to the discrete GPU rather than the integrated one.
Just about everything else appears to know which one the integrated GPU is and uses that by default though (including vainfo).
By "above mentioned quirk" in the reproduce steps I meant the thing about the discrete GPU coming before the integrated one in the additional information section, oops.
Yeah, I've seen this myself. The problem probably comes from KPipeWire, which is used both to do the recording and to determine which codecs are supported and which encoder to use (it automatically uses hardware encoding if it's available for the selected format). Perhaps there's a faster way to determine which codecs are available than whatever KPipeWire is doing, or perhaps we need to find some way to delay the loading of KPipeWire's PipeWireRecord in Spectacle's VideoPlatformWayland so that it's only used when users actually want to record. Or maybe the problem is neither, and PipeWireRecord is simply doing something on creation that it doesn't need to do?
For what it's worth, vainfo's --display command line option accepts three values: wayland, x11 and drm. The wayland (default) and x11 options both report the integrated GPU, but the drm option reports the discrete GPU (and also has that same delay where the discrete GPU needs to wake up first).
(In reply to Prajna Sariputra from comment #3)
> For what it's worth vainfo has 3 options for the --display command line
> option it has, which are wayland, x11 and drm. The wayland (default) and x11
> options both return the integrated GPU, but the drm option returns the
> discrete GPU (and also has that same delay where the discrete GPU needs to
> wake up first).

Interesting, and it might have problematic implications for startup optimization or ease of use. FWIW, I only have an integrated AMD GPU on my laptop, but I still definitely noticed that Spectacle's startup has gotten slower, so there might be more going on. It would be unfortunate if there were no fast way to check for hardware encoding support on startup.
To clarify, that delay for vainfo only happens with --display drm, with the x11 and wayland display options it's pretty much instant.
Something I forgot to mention here is that with the H.264 format selected, the video output is also garbled. I had assumed that's just because encoding on the discrete/non-primary GPU differs somewhat from encoding on the integrated/primary GPU, and that it's not a big deal for Spectacle since it defaults to VP9 encoding (which no AMD or NVIDIA GPU supports in hardware anyway). But I gave krdp a try recently and the output there is broken too, and unlike Spectacle it only supports H.264 due to the RDP protocol, so it's a bigger problem there: it means I can't use krdp at all.

So, am I correct in assuming that the best way to make sure the video output works is to not pick the secondary GPU, or should I file a separate bug report for the garbled video output with the multi-GPU setup? For what it's worth, it's definitely not impossible to use the discrete/secondary GPU for hardware VAAPI encoding; OBS can do it, for example.
Okay, I just found the bit of code that decides which device to use for hardware encoding (https://invent.kde.org/plasma/kpipewire/-/blob/master/src/vaapiutils.cpp?ref_type=heads#L34): KPipeWire gets the list of DRM devices and picks the first render device in the list that reports H.264 VAAPI support, which in my case is /dev/dri/renderD128, the discrete/secondary GPU.

So, I tried adding a hack to make the code pick the second device it finds, which is /dev/dri/renderD129, the integrated GPU, and recompiled KPipeWire (do I also need to recompile Spectacle or anything else?). With that, Spectacle starts noticeably quicker, and recording also starts as soon as I click the button, even when VP9 is selected as the format, whereas before the recording only started about 3 s after clicking.

However, it turns out that even when using the integrated GPU for encoding the video output is still corrupted, just in a different way: with the discrete GPU I can make out nothing of the source at all, while with the integrated GPU I can see bits of it. I'll file a new bug report for the corruption with more details.
Very interesting. Thank you for your work.
Can confirm that the issue still exists in Plasma 6.0.3. I'm using a multi-GPU (AMD + AMD) setup on Arch.
A possibly relevant merge request was started at https://invent.kde.org/graphics/spectacle/-/merge_requests/357
Git commit b1686ee96cd62f78dc4cf6cf92d6580376b51250 by Noah Davis.
Committed on 26/04/2024 at 14:18.
Pushed by ndavis into branch 'master'.

VideoPlatformWayland: Get PipeWireRecord asynchronously

Speeds up startup by 60-80ms

M  +4  -2   src/Platforms/VideoPlatform.h
M  +17 -7   src/Platforms/VideoPlatformWayland.cpp
M  +2  -0   src/Platforms/VideoPlatformWayland.h

https://invent.kde.org/graphics/spectacle/-/commit/b1686ee96cd62f78dc4cf6cf92d6580376b51250
Git commit 445fc422dae9d1e6c702098b74ace66a5e226faf by Noah Davis.
Committed on 29/04/2024 at 15:57.
Pushed by ndavis into branch 'release/24.05'.

VideoPlatformWayland: Get PipeWireRecord asynchronously

Speeds up startup by 60-80ms
(cherry picked from commit b1686ee96cd62f78dc4cf6cf92d6580376b51250)

M  +4  -2   src/Platforms/VideoPlatform.h
M  +17 -7   src/Platforms/VideoPlatformWayland.cpp
M  +2  -0   src/Platforms/VideoPlatformWayland.h

https://invent.kde.org/graphics/spectacle/-/commit/445fc422dae9d1e6c702098b74ace66a5e226faf
Just retested Spectacle with an unmodified KPipeWire so it goes back to using my discrete GPU, and to me it looks like Spectacle now starts about as fast as it does if I modify KPipeWire to use my integrated GPU, so the startup time problem is definitely fixed! Spectacle does still end up turning on the discrete GPU on startup and when recording, but at that point the issue is in KPipeWire rather than Spectacle. So, should this bug be moved over to KPipeWire or should a new bug be made specifically for the wrong GPU selection issue?
Let's move it to KPipeWire