SUMMARY
I'm not sure where I should report this (nvidia, kwin, plasmashell or somewhere else).

Every notification, and every opening or closing of a plasmoid, leaks sync_file descriptors in plasmashell:

❯ lsof -p $(pidof plasmashell)
...
396r a_inode 0,16 0 1062 sync_file
397r a_inode 0,16 0 1062 sync_file
399r a_inode 0,16 0 1062 sync_file
400r a_inode 0,16 0 1062 sync_file

Plasmashell eventually crashes with:

plasmashell[2053]: error marshalling arguments for get_icon: dup failed: Too many open files
plasmashell[2053]: Error marshalling request: Too many open files
plasmashell[2053]: The Wayland connection experienced a fatal error: Too many open files
plasma-plasmashell.service: Main process exited, code=exited, status=255/EXCEPTION

If I set __NV_DISABLE_EXPLICIT_SYNC=1 in /etc/environment, this doesn't happen.

STEPS TO REPRODUCE
1. Open/close Kickoff multiple times

OBSERVED RESULT
sync_file leaks in lsof -p $(pidof plasmashell)

EXPECTED RESULT
No leaks

SOFTWARE/OS VERSIONS
Windows:
macOS:
(available in the Info Center app, or by running `kinfo` in a terminal window)
Linux/KDE Plasma:
KDE Plasma Version: 6.2.80
KDE Frameworks Version: 6.9.0
Qt Version: 6.8.1

ADDITIONAL INFORMATION
nvidia driver: 565.77
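For anyone who wants to track the leak without lsof, the same count can be read from /proc. This is only a minimal sketch: it assumes a Linux /proc filesystem and assumes the leaked descriptors resolve to a link target containing "sync_file" (which is what the lsof NAME column in the report suggests). The function names count_fd_targets and count_sync_files are made up for illustration.

```python
import os
from collections import Counter


def count_fd_targets(fd_dir):
    """Count the descriptors listed in fd_dir (e.g. /proc/<pid>/fd) by link target."""
    counts = Counter()
    for name in os.listdir(fd_dir):
        try:
            target = os.readlink(os.path.join(fd_dir, name))
        except OSError:
            # The descriptor was closed between listdir() and readlink().
            continue
        counts[target] += 1
    return counts


def count_sync_files(fd_dir):
    """Assumption: leaked fences have 'sync_file' somewhere in the link target."""
    targets = count_fd_targets(fd_dir)
    return sum(n for target, n in targets.items() if "sync_file" in target)
```

Calling count_sync_files("/proc/<pid>/fd") with plasmashell's PID before and after opening Kickoff a few times should show the number growing if the leak is present.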
Please report this at https://forums.developer.nvidia.com/c/gpu-graphics/linux/
The bug is still present, and according to the NVIDIA forum it is KDE's fault, not NVIDIA's. So please fix it. I agree with what is said on the NVIDIA forum: plasmashell has had memory leaks for years and no one does anything to fix them. Please fix this at least for KDE 6.3. As the owner of an NVIDIA GPU, I insist that no bugs of this sort happen anywhere other than in plasmashell. On top of that, your software generates no logs that would be useful.
Plasmashell doesn't have anything to do with explicit sync. This is a driver bug. The fact that no one from NVIDIA has looked at it is annoying, but there is nothing we can do about it.
It should simply be a matter of using the driver correctly. People say Intel has freeze issues too, and plasmashell has had memory leaks for a long time. So an investigation would be welcome, instead of just dropping and closing this. It would be good to finally fix it. As the owner of an NVIDIA GPU, I can tell you that everything else works perfectly, even games running under Wine such as World of Tanks, and nothing crashes or freezes.
Created attachment 179017
A script to repeatedly send notifications, resulting in plasmashell crashing.

I experienced regular crashes due to my notification-heavy workflow before finding this bug. It appears that plasmashell leaks descriptors when using the NVIDIA driver with explicit sync. I don't know whether NVIDIA or Plasma is responsible, but I've attached a script that easily triggers this crash. Maybe it will help someone identify the root cause.

This is a partial output from the script:

[user@fedroa-pc:[~]> ./leak.sh
Explicit sync is enabled. Descriptors should leak.

Notification  PID    Limit  Open descriptors  Until limit
------------  -----  -----  ----------------  -----------
1             2563   1024   157               867
2             2563   1024   157               867
3             2563   1024   168               856
4             2563   1024   177               847
[snip]
216           2563   1024   1016              8
217           2563   1024   1020              4
218           2563   1024   1017              7
219           2563   1024   1024              0

plasmashell crashed after 219 notifications

Mar 01 12:45:13 fedroa-pc plasmashell[2563]: qt.qpa.wayland: eglSwapBuffers failed with 0x3000, surface: 0x55db036bdf90
Mar 01 12:45:13 fedroa-pc plasmashell[2563]: qt.qpa.wayland: eglSwapBuffers failed with 0x3000, surface: 0x55db036bdf90
Mar 01 12:45:13 fedroa-pc plasmashell[2563]: qt.qpa.wayland: eglSwapBuffers failed with 0x3000, surface: 0x55db03438840
Mar 01 12:45:13 fedroa-pc plasmashell[2563]: qt.qpa.wayland: eglSwapBuffers failed with 0x3000, surface: 0x55db03438840
Mar 01 12:45:14 fedroa-pc plasmashell[2563]: error marshalling arguments for import_timeline: dup failed: Too many open files
Mar 01 12:45:14 fedroa-pc plasmashell[2563]: Error marshalling request: Too many open files
Mar 01 12:45:14 fedroa-pc plasmashell[2563]: qt.qpa.wayland: eglSwapBuffers failed with 0x3000, surface: 0x55db03f04800
Mar 01 12:45:14 fedroa-pc plasmashell[2563]: qt.qpa.wayland: eglSwapBuffers failed with 0x3000, surface: 0x55db03f04800
Mar 01 12:45:14 fedroa-pc plasmashell[2563]: The Wayland connection experienced a fatal error: Too many open files
Mar 01 12:45:14 fedroa-pc systemd[1983]: Starting grub-boot-success.service - Mark boot as successful...
Mar 01 12:45:14 fedroa-pc systemd[1983]: Finished grub-boot-success.service - Mark boot as successful.
Mar 01 12:45:14 fedroa-pc systemd[1983]: plasma-plasmashell.service: Main process exited, code=exited, status=255/EXCEPTION
Mar 01 12:45:14 fedroa-pc systemd[1983]: plasma-plasmashell.service: Failed with result 'exit-code'.
Mar 01 12:45:14 fedroa-pc systemd[1983]: plasma-plasmashell.service: Consumed 23.288s CPU time, 331.5M memory peak.

As a workaround, increase plasmashell's open file limit by setting a large `LimitNOFILE` value:

> systemctl edit --user plasma-plasmashell.service

[Service]
# https://access.redhat.com/solutions/1257953
LimitNOFILE=50000

Save the file and log out.

----
Operating System: Fedora Linux 41
KDE Plasma Version: 6.3.2
KDE Frameworks Version: 6.11.0
Qt Version: 6.8.2
Kernel Version: 6.13.5-200.fc41.x86_64 (64-bit)
Graphics Platform: Wayland
Processors: 16 × 11th Gen Intel® Core™ i7-11850H @ 2.50GHz
Memory: 62.6 GiB of RAM
Graphics Processor: NVIDIA RTX A3000 Laptop GPU/PCIe/SSE2
NVIDIA Driver Version: 570.124.04
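To check whether the LimitNOFILE override actually took effect, the limit of the running process can be read from /proc/<pid>/limits. The sketch below is not the attached script. It only assumes the usual Linux layout of that file, with a "Max open files" row followed by the soft and hard values, and the helper names parse_max_open_files and headroom are hypothetical.

```python
def _to_int(value):
    """Convert a limits field to int; 'unlimited' becomes None."""
    return None if value == "unlimited" else int(value)


def parse_max_open_files(limits_text):
    """Return (soft, hard) from the contents of /proc/<pid>/limits."""
    label = "Max open files"
    for line in limits_text.splitlines():
        if line.startswith(label):
            soft, hard = line[len(label):].split()[:2]
            return _to_int(soft), _to_int(hard)
    raise ValueError("no 'Max open files' line found")


def headroom(open_count, soft_limit):
    """Descriptors left before the soft limit is reached (None if unlimited)."""
    return None if soft_limit is None else soft_limit - open_count
```

With the values from the first row of the script output, headroom(157, 1024) gives 867, which matches the "Until limit" column.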
Just in case, I should add that the above workaround is not necessarily safe. From the systemd docs:

https://www.freedesktop.org/software/systemd/man/latest/systemd.exec.html#Process%20Properties

See Table 1: "Do not use. Be careful when raising the soft limit above 1024, since select(2) cannot function with file descriptors above 1023 on Linux... Typically applications should increase their soft limit to the hard limit on their own, if they are OK with working with file descriptors above 1023, i.e. do not use select(2)."

I'll leave it to a KDE developer to tell us whether this actually affects Plasma (does Plasma use select()?), and to consider having Plasma raise the limit itself in the way the docs advise (is there any desire to put this workaround in Plasma to temporarily prevent crashes until NVIDIA fixes the driver?).
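For reference, "increase their soft limit to the hard limit on their own" looks roughly like the following. This is an illustration using Python's resource module, not Plasma code. Plasma is C++ and would call getrlimit(2)/setrlimit(2) directly, and whether that is safe there depends on the select(2) question above. The function name raise_nofile_soft_limit is made up.

```python
import resource


def raise_nofile_soft_limit():
    """Raise this process's RLIMIT_NOFILE soft limit to its hard limit.

    Returns the (soft, hard) pair in effect afterwards. The hard limit is
    never changed, so no extra privileges are needed.
    """
    soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
    if soft != hard:
        try:
            resource.setrlimit(resource.RLIMIT_NOFILE, (hard, hard))
        except (ValueError, OSError):
            # Some platforms refuse certain values; keep the old limit then.
            pass
    return resource.getrlimit(resource.RLIMIT_NOFILE)
```

This only postpones the crash if descriptors keep leaking. It does not stop the leak.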