SUMMARY
While the system is trying to recover from an OOM situation, there is a good chance (so long as the system is responsive enough) that the systemd unit watchdog triggers and terminates kwin_wayland, thereby nuking the session.

Apr 04 13:28:07 ajax systemd[1007]: plasma-kwin_wayland.service: Watchdog timeout (limit 15s)!
Apr 04 13:28:07 ajax systemd[1007]: plasma-kwin_wayland.service: Killing process 1133 (kwin_wayland_wr) with signal SIGHUP.
...
Apr 04 13:28:11 ajax systemd[1007]: plasma-kwin_wayland.service: Failed with result 'watchdog'.
Apr 04 13:28:11 ajax systemd[1007]: Stopped KDE Window Manager.

STEPS TO REPRODUCE
1. Run out of memory: have many VMs, a leaky app, or just too much stuff open.
2. The system becomes slightly unresponsive.

OBSERVED RESULT
You get thrown to SDDM because your session has been stopped by the watchdog.

EXPECTED RESULT
OOM handling should be allowed to take place and kill a suitable client. Note the additional information below, though.

SOFTWARE/OS VERSIONS
KDE Plasma Version: 6.0.80
KDE Frameworks Version: 6.1.0
Qt Version: 6.6.3
Kernel Version: 6.8.2-arch2-1 (64-bit)
Graphics Platform: Wayland
Processors: 12 × AMD Ryzen 5 3600X 6-Core Processor
Memory: 31.2 GiB of RAM
Graphics Processor: AMD Radeon RX 5700 XT

ADDITIONAL INFORMATION
In a way this is actually useful, because it likely terminates whatever was causing the OOM situation in the first place. Not sure if we can make this a feature somehow. Maybe instead of terminating, wildly shoot at clients on SIGHUP? But then I suppose we'd be implementing yet another user-space OOM handler.

Indeed, perhaps we can consider this a feature? If the OOM handler hasn't been able to fix the situation in 15 seconds, it probably won't for a lot longer. That said, we definitely need to tell the user what happened.
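For context, here is a minimal sketch of the general systemd watchdog keep-alive mechanism that produces the "Watchdog timeout (limit 15s)!" message above. It is not kwin's actual code, which is not shown in this report. It assumes the standard sd_notify protocol: the service manager exports NOTIFY_SOCKET and WATCHDOG_USEC, and the service is expected to send WATCHDOG=1 before the timeout expires (pinging at half the timeout is the usual recommendation). The function names are hypothetical.

```python
import os
import socket


def watchdog_interval_seconds(environ=os.environ):
    """Return the suggested keep-alive interval in seconds, or None if no watchdog is configured."""
    usec = environ.get("WATCHDOG_USEC")
    if not usec:
        return None
    # WATCHDOG_USEC is the full timeout in microseconds; ping at half of it.
    return int(usec) / 1_000_000 / 2


def sd_notify(message, environ=os.environ):
    """Send one datagram to the service manager; return False if no notify socket is set."""
    path = environ.get("NOTIFY_SOCKET")
    if not path:
        return False
    address = path.encode()
    if address.startswith(b"@"):
        # A leading '@' denotes a Linux abstract-namespace socket (NUL-prefixed).
        address = b"\0" + address[1:]
    with socket.socket(socket.AF_UNIX, socket.SOCK_DGRAM) as sock:
        sock.sendto(message.encode(), address)
    return True


if __name__ == "__main__":
    # With a 15 s limit as in the log above, this suggests a ping every 7.5 s.
    print(watchdog_interval_seconds({"WATCHDOG_USEC": "15000000"}))
    # A real service would call sd_notify("WATCHDOG=1") from its main loop.
```

If the process cannot get scheduled for long enough to send its keep-alive, which is plausible while the system is thrashing under memory pressure, the manager treats the unit as hung and kills it, as seen in the log.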
If the system is frozen for that long, then systemd killing kwin is the right thing to do. There's not much else we can do with regard to this bug, is there?
As I said, if we consider this a feature, we need to tell the user that we just restarted the compositor, in particular because we cannot assume all clients will survive the reset.
Ack, a notification is easy. Let's change the scope to that.