Summary: | [Wayland] xdg-desktop-portal-kde using extremely high amount of GPU memory. | ||
---|---|---|---|
Product: | [Plasma] xdg-desktop-portal-kde | Reporter: | oroborius <oroborius> |
Component: | general | Assignee: | Plasma Bugs List <plasma-bugs-null> |
Status: | CONFIRMED --- | ||
Severity: | normal | CC: | aleixpol, dannkunt, kde, kubry, lg096066587039, mynameislich, renari, sitter, thecookie94 |
Priority: | NOR | ||
Version First Reported In: | 6.2.91 | ||
Target Milestone: | --- | ||
Platform: | Arch Linux | ||
OS: | Linux | ||
Latest Commit: | Version Fixed In: | ||
Sentry Crash Report: | |||
Attachments: | Display information from fastfetch, nvidia-smi and kinfo; Leak on non-nvidia gpu | ||
Comment on attachment 174434 [details] Display information from fastfetch, nvidia-smi and kinfo

I'm sorry, I accidentally pasted the wrong image in the bottom left; I forgot to hit copy in Spectacle. This one has the fastfetch info: https://i.imgur.com/2dd0dK9.png

I see you also have a very high OBS memory load. Maybe what we see here is that there's shared memory in use but it's not really a leak per se. Does the memory load go back down when you close OBS?

It does not. In fact, since posting this bug report, GPU memory usage from xdg-desktop-portal-kde alone has reached 12935MiB, even with OBS closed. I also attempted to restart the DE using `plasmashell --replace`, but no dice. It does clear after a logout, however. I'm going to continue to monitor it and see if I notice anything.

So as of now I've done a whole stream using OBS and a plethora of programs through Bottles and Proton; the idle VRAM reserved is 2960MiB.

It is definitely a slow but consistent process. Currently at 3730MiB over the course of nearly 8 days of uptime. My wager is that it's related to one of these pieces of software: OBS (tytan652), Bottles (specifically for Steam and/or Warudo for VTubing), or Proton in general. On the first day after the restart, usage went to around 2500 MiB while using OBS, and it has been slowly increasing since (about 100MiB every couple of hours). It was sitting at about 2500 MiB after the stream using this software, but over the following few days it has increased by about 1500 MiB, to roughly what it is now. Something is definitely leaking. Probably something related to these bits of software, maybe probing the package to keep the VRAM usage? Not sure if there is a way to clear it without fully logging out of the system, however.

Can replicate on Plasma 6.2.91 (tested after a user in the OBS community made me aware of this just now).

Steps to reproduce:
1: launch OBS
2: have the VRAM monitoring tool of your choice open so that you can watch the usage grow (in my case that would be amdgpu-top, as I'm running an AMD GPU)
3: select a window to capture
4: switch the capture (in OBS) to capture a different application
5: take another look at the VRAM monitor; usage will have increased
6: rinse and repeat until you run out of VRAM

My assumption here is that the previous capture doesn't get released properly when switching to a different application, leading to more and more VRAM being allocated by "zombie captures" whose memory is never released.

System specs:
ArchLinux
Plasma 6.2.91 (i.e. 6.3 beta)
KDE Frameworks 6.10
Qt 6.9
Kernel 6.12.10-arch1-1
Wayland
Graphics card would be

Accidentally pressed enter, this is a continuation of the last msg.
Graphics card would be a 5700xt w/ the Mesa driver stack, so this most def isn't NVIDIA-only or anything of that sort.

Still an issue on Plasma 6.3 and Plasma 6.3.90.

I don't think this is related to OBS, as I've had this happen streaming to Discord as well. Also, I don't think this is nvidia-smi reporting shared memory usage, because the Nvidia drivers don't support shared memory: https://forums.developer.nvidia.com/t/non-existent-shared-vram-on-nvidia-linux-drivers/260304

I've managed to reliably reproduce the VRAM leak and have a much clearer picture of what's happening.

The key issue is that VRAM usage spikes every time the ScreenChooserDialog is opened and then closed, even when cancelling the request. The leaked memory is never reclaimed.

I recorded a short video to show exactly what's happening. The nvtop graph on the left clearly shows the VRAM spikes corresponding to my actions. You'll notice a jump when the dialog opens, and then a second, much larger jump the moment it closes. This is repeatable and the memory usage just keeps climbing.
https://youtu.be/ho1MU_f69ew

To figure out why, I added some qCDebug prints to the ScreencastingStream constructor and destructor in the xdg-desktop-portal-kde master branch. The logs confirmed my suspicion: while many ScreencastingStream objects are created for the previews when the dialog opens, their destructors are never called when the dialog is closed. The C++ objects managing the streams are being leaked.

I'm now fairly certain the root cause is an architectural issue with object ownership. It seems the ScreencastingStream objects for the previews are being parented to the global Screencasting singleton within the portal, instead of the QML components in the dialog that request them. Because of this, when the ScreenChooserDialog is destroyed, the C++ stream objects are orphaned and never garbage collected.

This also explains the strange two-stage leak. The initial smaller leak on dialog open is from the stream object initialization. The larger leak on dialog close seems to be an asynchronous race condition: KWin finishes preparing the "heavy" video buffers and sends them to the portal after the dialog has already been closed. The portal receives these resources, but the QML components that were supposed to handle them are already gone. With no context to manage them, these buffers get stranded in VRAM.

So, the core problem is that the lifecycle of the preview stream resources is not tied to the lifecycle of the ScreenChooserDialog. The fix likely requires ensuring that the component requesting a preview stream (I believe it's TaskManager.ScreencastingRequest in plasma-workspace) is also responsible for explicitly destroying that stream when it is itself destroyed.

Created attachment 183053 [details]
Leak on non-nvidia gpu

Also, can reproduce on my amd laptop
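
For readers following the ownership hypothesis in the analysis above, here is a minimal sketch in plain C++ (no Qt) of the difference between streams owned by a long-lived object and streams owned by the dialog that requested them. It only illustrates the lifetime argument; it is not the portal's code. All names (`PreviewStream`, `GlobalRegistry`, `BorrowingDialog`, `OwningDialog`, `liveStreams`) are hypothetical, and `std::unique_ptr` stands in for whatever ownership mechanism the real components use.

```cpp
#include <cassert>
#include <memory>
#include <vector>

// Counter of stream objects whose destructor has not run yet.
inline int &liveStreams() {
    static int count = 0;
    return count;
}

// Hypothetical stand-in for a preview stream that holds GPU buffers.
struct PreviewStream {
    PreviewStream() { ++liveStreams(); }
    ~PreviewStream() { --liveStreams(); }
    PreviewStream(const PreviewStream &) = delete;
    PreviewStream &operator=(const PreviewStream &) = delete;
};

// Hypothetical long-lived owner, standing in for a session-wide singleton.
// Everything stored here lives until the registry itself is destroyed.
struct GlobalRegistry {
    std::vector<std::unique_ptr<PreviewStream>> streams;

    PreviewStream *create() {
        streams.push_back(std::make_unique<PreviewStream>());
        return streams.back().get();
    }
};

// Leaky pattern: the dialog only borrows streams that the registry owns,
// so destroying the dialog destroys none of them.
struct BorrowingDialog {
    std::vector<PreviewStream *> previews; // non-owning pointers

    BorrowingDialog(GlobalRegistry &registry, int count) {
        assert(count >= 0);
        for (int i = 0; i < count; ++i) {
            previews.push_back(registry.create());
        }
    }
};

// Scoped pattern: the dialog owns its streams, so they are destroyed
// together with the dialog.
struct OwningDialog {
    std::vector<std::unique_ptr<PreviewStream>> previews;

    explicit OwningDialog(int count) {
        assert(count >= 0);
        for (int i = 0; i < count; ++i) {
            previews.push_back(std::make_unique<PreviewStream>());
        }
    }
};
```

Opening and closing a `BorrowingDialog` leaves `liveStreams()` raised by the number of previews each time, while an `OwningDialog` returns the counter to its previous value when it goes out of scope. That is the behaviour the proposed fix aims for.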
I just encountered this bug on Ubuntu 25.04, from using Zoom in Firefox. This is a serious memory leak because destructors are not called.

As far as I remember the ScreenCastingStream class is only used for streams for sessions. The preview dialog previews should fully be handled by Libtaskmanager via qml. So I am a bit surprised by that finding. The streams for apps should have the lifetime of the session.

Well, you can't see a destructor getting called of something that hasn't been constructed :D David has this right. The leak is probably in kpipewire, I see it cleans resources in a QRunnable... or tries to at least.
Created attachment 174434 [details]
Display information from fastfetch, nvidia-smi and kinfo

SUMMARY
xdg-desktop-portal-kde uses an absurd amount of VRAM as reported by nvidia-smi. I had noticed this only today and my system has been running for a while. Might be related to freezes during running games under proton and software under wine? Not sure. nvidia-smi reports 9383MiB. Will attach some images for system information.

STEPS TO REPRODUCE
1. Run DE under Wayland in Arch Linux
2. Do whatever you do over time (may take a few days?)
3. Periodically check VRAM usage using nvidia-smi

OBSERVED RESULT
Extremely high VRAM usage (check attachment)

EXPECTED RESULT
A more appropriate amount of VRAM.

SOFTWARE/OS VERSIONS
Linux/KDE Plasma: Arch Linux (6.10.10), KDE Plasma
KDE Plasma Version: 6.1.5
KDE Frameworks Version: 6.6.0
Qt Version: 6.7.3

ADDITIONAL INFORMATION
Running under Wayland, system has been running for days.
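
Since the reproduction steps rely on periodic manual checks, a small helper can turn a handful of readings into a growth rate that is easier to compare between reports. The sketch below fits a least-squares line through (hours, MiB) readings that you note down yourself from nvidia-smi, nvtop or amdgpu-top. It does not call any of those tools. The `Sample` struct and `growthMiBPerHour` function are hypothetical names, and any values you feed it are your own observations.

```cpp
#include <cassert>
#include <vector>

// One manual reading: hours since login and VRAM used by the process in MiB.
struct Sample {
    double hours;
    double mib;
};

// Least-squares slope of VRAM over time, in MiB per hour.
// Precondition: at least two samples taken at distinct times.
double growthMiBPerHour(const std::vector<Sample> &samples) {
    assert(samples.size() >= 2);

    const double n = static_cast<double>(samples.size());
    double meanHours = 0.0;
    double meanMib = 0.0;
    for (const Sample &s : samples) {
        meanHours += s.hours;
        meanMib += s.mib;
    }
    meanHours /= n;
    meanMib /= n;

    double numerator = 0.0;   // covariance term
    double denominator = 0.0; // variance of the timestamps
    for (const Sample &s : samples) {
        numerator += (s.hours - meanHours) * (s.mib - meanMib);
        denominator += (s.hours - meanHours) * (s.hours - meanHours);
    }
    assert(denominator > 0.0); // all samples at the same time is not a trend

    return numerator / denominator;
}
```

A clearly positive slope on an idle session, with no capture running, is the pattern this report describes. A slope near zero would point to a one-off allocation instead of a leak.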