SUMMARY
Plasmashell rapidly consumes more and more memory when the wallpaper is set to a slideshow with a very short interval. The folder it cycles through at random contains 35 PNG images with an average size of 1.6 MiB. The shorter the interval, the faster plasmashell eats up memory. It does not seem to occur while the screen is locked.

STEPS TO REPRODUCE
1. Set the wallpaper type to slideshow, probably with multiple images to cycle through.
2. Set "Change every:" to a low value; use 1 second to see the effect very quickly.
3. Use the computer actively for anything.

OBSERVED RESULT
Plasmashell consumes more and more memory until it either restarts due to an out-of-memory crash or the service is manually restarted.

EXPECTED RESULT
Memory usage stays under control regardless of the slideshow interval.

SOFTWARE/OS VERSIONS
Linux/KDE Plasma: Arch Linux
KDE Plasma Version: 6.4.4
KDE Frameworks Version: 6.17.0
Qt Version: 6.9.2

ADDITIONAL INFORMATION
I became suspicious of the slideshow wallpaper feature after searching online, where I came across bug #403563 mentioning something about it. Increasing the interval to something like 10 minutes immediately improved things, but plasmashell still consumes memory, just a lot slower. Setting the wallpaper to a single image instead of a slideshow does not seem to cause this issue at all.

I initially had my slideshow interval set to 10-20 seconds, and the memory issue became problematic after around 1 hour of usage on my system with 8 GiB of RAM and 8 GiB of swap. I restarted the plasmashell service to run a test. It started out at ~300 MiB of memory usage. I then set the slideshow interval to 1 second, and after about 2-3 minutes plasmashell took up ~2 GiB of memory, at which point I stopped the test by setting the wallpaper to a single image. The memory immediately stopped increasing, but after another 10 minutes plasmashell still had not decreased in memory usage.
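For anyone who wants to watch the growth numerically rather than through System Monitor, something like the following should work; a minimal sketch assuming the standard procps ps and pidof tools (the 5-second interval is arbitrary):

    # sample plasmashell's resident set size (KiB) every 5 seconds
    while true; do
        ps -o rss= -p "$(pidof plasmashell)"
        sleep 5
    done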
Do you have an NVIDIA GPU?
(In reply to Nate Graham from comment #1)
> Do you have an NVIDIA GPU?

I do on the PC in question; it has a GTX 1650 SUPER, using the latest available (non-testing) proprietary driver package for Arch.

I just ran the 1-second test on my other PC with an AMD Radeon RX 9070 XT, using the same folder of images; it ate up at most ~300 MiB of additional RAM before stabilizing. Stopping the test and going back to the static wallpaper does not seem to free the extra memory on this machine either though; at least it hasn't gone back down to the initial ~350 MiB range within the 30 minutes I've been watching it.

I do have a third PC with a GTX 1080 Ti, running up-to-date Arch Linux with the same drivers and KDE Plasma. I haven't tested on it yet; I'll do a quick test the next time I use it, but I'm pretty sure by now that this is indeed some NVIDIA-related issue.

I have no idea whether this is an issue that can/needs to be fixed on the KDE side or the NVIDIA driver side; I'm pretty new to Linux in general and just doing my best to be helpful. If there's anything else I can do to narrow this issue down, please do suggest it.
Tested this again on the previously mentioned PC with a 1080 Ti. RAM usage remained stable for longer than on the 1650 SUPER, but eventually started increasing unreasonably after a few more minutes (with the 1-second change interval).
Cannot reproduce the leak on an RTX 3050 with nvidia-open on a git build of Plasma. It rises a bit and then remains stable. It's curious that the same driver produces different results on two different GPUs, though. Can you maybe try with a new user on the affected system?
Created attachment 185048 [details]
memory allocations

Ha! I managed to somewhat reproduce this, and I am reasonably certain it's some problem with the driver. What appears to happen is that after a while (unclear what triggers it) the GPU memory starts filling up (you can observe this using nvtop). After exactly 10 minutes of uptime the driver then allocates a bunch of system memory, and it will continue to do so exactly 5 minutes after the last allocation. This appears to happen as a result of QWaylandGLContext::swapBuffers, but I rather think that's simply the trigger for whatever happens internally in the driver. I am going to attach a heaptrack graph of the memory consumption; we can very clearly see the consumption spikes every 5 minutes.

<unresolved function> in libnvidia-eglcore.so.580.82.09
<unresolved function> in libnvidia-eglcore.so.580.82.09
<unresolved function> in libEGL_nvidia.so.0
<unresolved function> in libEGL_nvidia.so.0
<unresolved function> in libEGL_nvidia.so.0
<unresolved function> in libEGL_nvidia.so.0
wlEglSwapBuffersWithDamageHook in libnvidia-egl-wayland.so.1
<unresolved function> in libEGL_nvidia.so.0
<unresolved function> in libEGL_nvidia.so.0
QtWaylandClient::QWaylandGLContext::swapBuffers(QPlatformSurface*) in libQt6WaylandEglClientHwIntegration.so.6
QRhiGles2::endFrame(QRhiSwapChain*, QFlags<QRhi::EndFrameFlag>) in libQt6Gui.so.6
QRhi::endFrame(QRhiSwapChain*, QFlags<QRhi::EndFrameFlag>) in libQt6Gui.so.6
QSGGuiThreadRenderLoop::renderWindow(QQuickWindow*) in libQt6Quick.so.6
QWindow::event(QEvent*) in libQt6Gui.so.6
<unresolved function> in plasmashell
QApplicationPrivate::notify_helper(QObject*, QEvent*) in libQt6Widgets.so.6
QCoreApplication::notifyInternal2(QObject*, QEvent*) in libQt6Core.so.6
QGuiApplicationPrivate::processExposeEvent(QWindowSystemInterfacePrivate::ExposeEvent*) in libQt6Gui.so.6
QWindowSystemInterface::sendWindowSystemEvents(QFlags<QEventLoop::ProcessEventsFlag>) in libQt6Gui.so.6
QWindowSystemInterface::flushWindowSystemEvents(QFlags<QEventLoop::ProcessEventsFlag>) in libQt6Gui.so.6
QObject::event(QEvent*) in libQt6Core.so.6
QApplicationPrivate::notify_helper(QObject*, QEvent*) in libQt6Widgets.so.6
QCoreApplication::notifyInternal2(QObject*, QEvent*) in libQt6Core.so.6
QCoreApplication::sendEvent(QObject*, QEvent*) in libQt6Core.so.6
QCoreApplicationPrivate::sendPostedEvents(QObject*, int, QThreadData*) in libQt6Core.so.6
QCoreApplication::sendPostedEvents(QObject*, int) in libQt6Core.so.6
postEventSourceDispatch in libQt6Core.so.6
g_main_dispatch in libglib-2.0.so.0
g_main_context_iterate_unlocked::g_main_context_dispatch_unlocked in libglib-2.0.so.0
g_main_context_iterate_unlocked in libglib-2.0.so.0
g_main_context_iteration in libglib-2.0.so.0
QEventDispatcherGlib::processEvents(QFlags<QEventLoop::ProcessEventsFlag>) in libQt6Core.so.6
QEventLoop::processEvents(QFlags<QEventLoop::ProcessEventsFlag>) in libQt6Core.so.6
QEventLoop::exec(QFlags<QEventLoop::ProcessEventsFlag>) in libQt6Core.so.6
QCoreApplication::exec() in libQt6Core.so.6
<unresolved function> in plasmashell
__libc_start_call_main in libc.so.6
__libc_start_main_impl in libc.so.6
<unresolved function> in plasmashell
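For anyone wanting to reproduce this measurement, a rough sketch of the heaptrack invocation I'd use; this assumes heaptrack's attach mode (--pid), which needs gdb, and the output file suffix may be .gz instead of .zst on older versions:

    # attach to the running plasmashell and start recording allocations
    heaptrack --pid "$(pidof plasmashell)"

    # later, open the recorded trace in the GUI, e.g.:
    heaptrack_gui heaptrack.plasmashell.*.zst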
*** Bug 509310 has been marked as a duplicate of this bug. ***
Our theory right now is that something (possibly the driver itself) leaks GPU memory, and the driver then continuously moves the (unused) leaked pages from GPU memory to system memory to make space. This then shows up as a leak in system memory.
One more thing that is perhaps relevant here is that, for me, this happens when I put a mediaframe plasmoid on my desktop, but not on my lock screen, which is configured to show a slide show of pictures from a folder.
Indeed, I also noticed this doesn't happen when the applet is used in plasmawindowed, so there's probably an environmental trigger inside plasmashell. Somewhat unfortunate, because that makes finding the cause all the more difficult.
More observations: this only appears to happen with the basic scenegraph loop; the threaded loop does not leak (but has other problems). Also, for some reason I cannot get it to leak outside the originally started plasmashell: if I restart plasmashell, it no longer leaks, even with the same scenegraph loop.
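For anyone wanting to compare the two loops: Qt's QSG_RENDER_LOOP environment variable selects the loop, so something like this should do (a sketch; it replaces the running shell):

    # force the threaded scenegraph loop for comparison
    QSG_RENDER_LOOP=threaded plasmashell --replace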
That is unlike my experience: it starts leaking even after restarting plasmashell manually. I haven't had the time to do any more testing, but from the testing I did, memory use seemed to increase more steadily between the larger spikes. I was only looking at the values though, so I might be wrong; I'll see if I can figure out how to use that heaptrack tool to visualize it. One possible difference is the NVIDIA driver: I'm using the proprietary driver rather than nvidia-open, because the latter isn't supported, at least not on the 1080 Ti.
I did a quick test and I confirm that restarting plasmashell stopped the VRAM usage from increasing. However, I am pretty confident that on at least some of my previous test runs, restarting plasmashell brought the process's VRAM usage back to baseline but did not stop it from increasing again; instead it started climbing back immediately. So I would not take it for granted that a single plasmashell restart 'resolves' the issue.
I think we found what's causing this. NVIDIA uses the basic rendering loop for various reasons, and in basic render mode cleanup entirely hinges on QSGContext::endSync() getting called. That would happen here:

    if (lastDirtyWindow)
        data.rc->endSync();

...except we can already see that it maybe won't:

    // flips to false as soon as any tracked window still has an update pending
    bool lastDirtyWindow = true;
    for (auto it = m_windows.cbegin(), end = m_windows.cend(); it != end; ++it) {
        if (it->updatePending) {
            lastDirtyWindow = false;
            break;
        }
    }

m_windows is a hash of all open windows... including hidden windows. Hidden windows never get painted, so their updatePending state will never flip, and by extension endSync() never gets called.

The most trivial way to trigger this is to simply open Kickoff: once it has been opened, it sticks around as a hidden window and blocks memory cleanup. That also explains why it looked like a restarted plasmashell doesn't leak; the leak only starts once (e.g.) Kickoff has been used.
Oh, and fortunately that now gives us a way to reproduce this on AMD as well, by forcing the basic loop:

- QSG_RENDER_LOOP=basic plasmashell --replace
- add a media frame to the desktop
- configure it to use a directory with a bunch of images (e.g. /usr/share/wallpapers)
- set its behavior to refresh every second
- open Kickoff and close it again
- observe the GPU leak in nvtop (an alternative is sketched below)
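If nvtop isn't at hand, the NVIDIA side can also be watched with the driver's nvidia-smi; this obviously doesn't cover the AMD case:

    # print used GPU memory once per second
    nvidia-smi --query-gpu=memory.used --format=csv -l 1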
I was about to write that, after that earlier plasmashell restart and after leaving my slideshows running while I was doing other stuff on the machine, VRAM usage suddenly surged and started increasing steadily again. This behavior seems consistent with what you just described :-)
With some more investigation we are reasonably certain this was fixed in Qt 6.10, which should be rolling out to distros soonish. Specifically, this commit undoes the concerning code and should fix the ultimate cause of the leak in a fairly comprehensive fashion:

https://invent.kde.org/qt/qt/qtdeclarative/-/commit/8c54efe9fb007701eff6c5caad3c4ee54c714dc5

I'll switch the bug to NEEDSINFO until someone on NVIDIA can verify that Qt 6.10 indeed resolves the leak.
*** Bug 403563 has been marked as a duplicate of this bug. ***
I received an update to Qt 6.10 today, and after updating I tested the same way as before, with the extreme 1-second wallpaper change interval. I'm happy to report that VRAM and RAM usage are stable beyond 20 minutes on my system with a GTX 1080 Ti, whereas before there would be significant VRAM usage by then. Both VRAM usage as reported by nvtop and RAM usage as reported by System Monitor stabilized around 500 MiB, which is roughly the same as when I tested on my AMD GPU system.

My kids will be pleased to view pictures on our TV at a frequency that satisfies their short attention span, and I will be pleased to have a stable system. Although this turned out not to be a bug in KDE Plasma, thank you for taking it seriously and applying your skills to figure it out.
That's wonderful! Thanks so much for confirming the fix.
I finally got my Qt 6.10 update as well. I reran the same tests as before (see #509310) and can confirm that the memory leak seems to be gone. Thanks for helping sort this out!