Bug 493741

Summary: Random lag spike on external screen of Nvidia hybrid laptop
Product: [Plasma] kwin Reporter: Martel Theo <marteltheo>
Component: performance Assignee: KWin default assignee <kwin-bugs-null>
Status: RESOLVED UPSTREAM    
Severity: normal CC: nate, xaver.hugl
Priority: NOR    
Version First Reported In: 6.1.5   
Target Milestone: ---   
Platform: Arch Linux   
OS: Linux   
Latest Commit: Version Fixed/Implemented In:
Sentry Crash Report:
Attachments: Quick plot of the two frametime traces
Frametime data in KWin
Frametime data in Mutter

Description Martel Theo 2024-09-27 16:37:06 UTC
Created attachment 174146 [details]
Quick plot of the two frametime traces

SUMMARY

When using an external screen with my laptop, I see random lag spikes on that external screen, roughly once per minute. During these spikes, the frame time jumps from ~16 ms to 150-300 ms for a few frames before returning to normal.

While testing, I saw that this specific problem does not appear with a Windows install, nor with the latest Mutter (47.0). So the cause seems to be something Plasma-specific, rather than a problem with my hardware or with the Nvidia Linux drivers in general.


SOFTWARE/OS VERSIONS
KDE Plasma Version: 6.1.5
KDE Frameworks Version: 6.6.0
Qt Version: 6.7.2
Kernel Version: 6.10.10.arch1-1 (64-bit)
Processors: 16 × 11th Gen Intel® Core™ i7-11850H @ 2.50GHz
Memory: 14.8 GiB of RAM
Graphics Processor: Mesa Intel® UHD Graphics / NVIDIA RTX A3000 Laptop GPU/PCIe/SSE2
Nvidia driver version: nvidia-open 560.35.03

ADDITIONAL INFORMATION

I used MangoHud with vkcube (running on the Nvidia GPU), maximized on the external screen, to capture the attached frame-time traces.
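
For anyone who wants to check their own traces for the same pattern, here is a minimal sketch of how such spikes could be picked out. It assumes the frame times have already been loaded into a plain list of milliseconds (it does not parse any MangoHud log file), and the function name `find_spikes`, the 100 ms threshold, and the sample values are all made up for illustration.

```python
def find_spikes(frametimes_ms, threshold_ms=100.0):
    """Group consecutive slow frames into spikes.

    Returns a list of (start_index, frame_count, worst_frametime_ms).
    The 100 ms default threshold is an assumption chosen to sit between
    the ~16 ms baseline and the 150-300 ms spikes described above.
    """
    spikes = []
    run_start = None
    for i, ft in enumerate(frametimes_ms):
        if ft >= threshold_ms:
            if run_start is None:
                run_start = i  # first slow frame of a new spike
        elif run_start is not None:
            run = frametimes_ms[run_start:i]
            spikes.append((run_start, len(run), max(run)))
            run_start = None
    if run_start is not None:  # trace ended in the middle of a spike
        run = frametimes_ms[run_start:]
        spikes.append((run_start, len(run), max(run)))
    return spikes


# Made-up sample data, not taken from the attached traces.
sample = [16.6, 16.7, 16.5, 210.0, 180.0, 16.6, 16.7, 290.0, 16.6]
print(find_spikes(sample))
```

Grouping consecutive slow frames keeps a spike that lasts "a few frames" from being counted several times.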

I have tried multiple configurations in order to better determine the source of these lags:
 - X11/Wayland: I see the same type of lag spikes on both. Wayland still has more problems with Nvidia, so I continued to investigate only with X11
 - Hybrid graphics/Nvidia only (BIOS configuration): this specific problem disappears in the Nvidia-only configuration (but the frame time is more chaotic ...)
 - Output resolution 1080p/4K: no change
 - Laptop output DP/HDMI: no change
 - I tried multiple screen configurations with different external screens, with no impact on the lag spikes
 - KWin/Mutter: lag spikes disappear in Mutter

I don't have any ideas on where or how to investigate further, so I'm reporting the issue here.
Comment 1 Martel Theo 2024-09-27 16:38:17 UTC
Created attachment 174148 [details]
Frametime data in KWin
Comment 2 Martel Theo 2024-09-27 16:38:48 UTC
Created attachment 174149 [details]
Frametime data in Mutter
Comment 3 Zamundaaa 2024-09-30 17:15:27 UTC
KWin doesn't influence applications' synchronization with the screen's refresh rate on Xorg, so even if this doesn't happen in other environments, it's still a driver bug, which you can report at https://forums.developer.nvidia.com/c/gpu-graphics/linux
Note that if you have a widget or similar tool that monitors GPU usage, those generally query once every second, and there's a known driver bug that makes that querying cause severe performance issues when the GSP firmware is used.
Comment 4 Martel Theo 2024-10-05 16:34:47 UTC
If someone else stumbles upon this report, I have found the origin of the problem: I had installed a plasmoid (https://github.com/davidhi7/ddcci-plasmoid) to manage the brightness of my external display through DDC/CI. This plasmoid polls every minute for the presence of DDC/CI-compatible screens, which matches the frequency of the spikes I saw (my graphs should read "time (min)" instead of "time (s)"), and this polling seems to heavily affect the frame rate.
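
For readers chasing a similar issue, one way to test a suspected poller is to compare the spacing of the spikes with its polling period. The following is only a sketch under stated assumptions: the spike timestamps (in seconds) are made-up values, and the helper names `spike_intervals` and `matches_period` as well as the 5-second tolerance are hypothetical.

```python
from statistics import median


def spike_intervals(spike_times_s):
    """Return the gaps, in seconds, between consecutive spike timestamps."""
    return [b - a for a, b in zip(spike_times_s, spike_times_s[1:])]


def matches_period(spike_times_s, period_s, tolerance_s=5.0):
    """Check whether the median gap between spikes is close to period_s.

    tolerance_s is an arbitrary assumption; the median is used so that a
    single missed or extra spike does not dominate the result.
    """
    intervals = spike_intervals(spike_times_s)
    if not intervals:
        return False  # fewer than two spikes: no spacing to compare
    return abs(median(intervals) - period_s) <= tolerance_s


# Made-up timestamps, not taken from the attached traces.
times = [12.0, 71.5, 132.0, 192.5, 251.0]
print(matches_period(times, 60.0))  # a once-per-minute poller
print(matches_period(times, 1.0))   # a once-per-second monitor
```

A match only shows that the timing is consistent with the suspected poller; disabling it and re-capturing a trace is still needed to confirm the cause.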