Bug 479538 - Some kind of always-accessible troubleshooting tool when parts of the system are unresponsive
Status: CONFIRMED
Alias: None
Product: kwin
Classification: Plasma
Component: general
Version: master
Platform: Fedora RPMs Linux
Importance: NOR wishlist
Target Milestone: ---
Assignee: KWin default assignee
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2024-01-08 12:42 UTC by Henning
Modified: 2024-02-28 22:11 UTC
CC List: 3 users

See Also:
Latest Commit:
Version Fixed In:
Sentry Crash Report:


Attachments
the requested debug info (7.44 KB, text/plain)
2024-02-12 13:42 UTC, Henning

Description Henning 2024-01-08 12:42:59 UTC
I can't count how many times Plasma just froze, because for example a VM took too much RAM.

On Windows there is Ctrl+Alt+Del, which simply always works. It also seems the shell is privileged and doesn't just freeze when a window freezes.

I had Dolphin, Ark, kio, virt-manager (qemu, kvm), Firefox, krunner, and many more hang and make my plasma shell unresponsive.

The task manager / System Activity is not privileged; it's just a regular app. It is not about "how to have fun when everything is working": it needs to always be able to launch (reserved memory space?) and kill processes that are corrupt, need too much RAM, etc.
Comment 1 Harald Sitter 2024-01-09 11:34:48 UTC
I don't know what you mean by privileged?

Also, shouldn't OOM conditions be handled by -- an OOM handler? earlyoom for example
Comment 2 Henning 2024-01-10 23:01:31 UTC
So I agree that OOM handlers, i.e. the distros, are responsible for part of it.

But the desktop should then be treated differently, so that it has higher privileges in the OOM handler.

Many distros, like Fedora, use systemd-oomd. How could Plasma be fixed so that the shell, the panel, etc. ALWAYS work until RAM is completely exhausted?

Currently, a program crashes and takes the entire shell with it, which seems totally wrong.

Also, separating a "task manager" which can kill running programs from the rest, so that it always works: wouldn't that be the task of the desktop too?
Comment 3 Harald Sitter 2024-01-11 12:24:24 UTC
I still do not know what you mean by privileged?

> Also, separating a "task manager" which can kill running programs from the rest, so that it always works: wouldn't that be the task of the desktop too?

Separating from what? If you are out of memory you are out of memory. The mechanism by which this situation is handled is OOM handling, not a task manager.
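
For reference, one kernel-level knob behind "higher privileges in the OOM handler": every process has an `oom_score_adj` value under `/proc/<pid>/` (range -1000 to 1000) that biases the kernel OOM killer, with lower values making the process less likely to be picked. The sketch below only reads the current values for the desktop processes named in this report. The function name `oom_info` and its `proc_root` parameter are made up for illustration, and userspace handlers such as systemd-oomd or earlyoom apply their own selection policies, which this does not show.

```shell
#!/bin/sh
# Print the kernel OOM-killer score and adjustment for one PID.
# proc_root is a hypothetical parameter; it exists only so the function
# can be pointed at a directory other than /proc.
oom_info() {
    pid=$1
    proc_root=${2:-/proc}
    score=$(cat "$proc_root/$pid/oom_score" 2>/dev/null) || return 1
    adj=$(cat "$proc_root/$pid/oom_score_adj" 2>/dev/null) || return 1
    printf '%s oom_score=%s oom_score_adj=%s\n' "$pid" "$score" "$adj"
}

# Inspect the desktop processes, if they are running.
for name in plasmashell kwin_wayland; do
    for pid in $(pgrep -x "$name" 2>/dev/null); do
        oom_info "$pid"
    done
done
```

Lowering `oom_score_adj` for a process normally requires elevated privileges, and it does nothing for the case where the protected process itself needs memory that is no longer available.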
Comment 4 Nate Graham 2024-01-18 00:45:22 UTC
I think what he means is that somehow, you should be able to show a GUI window capable of killing processes even when the rest of the system is unresponsive. The reference to Ctrl+Alt+Delete is a Windows-ism; on Windows, hitting this keyboard shortcut always shows a troubleshooting screen from which you can open the Task Manager even when everything else is frozen.

My wife has expressed a similar desire in response to app and system freezes in the past. I think it's a reasonable thing to have, in some capacity at least.

This is probably not something that would be done in plasmashell though; moving to KWin.
Comment 5 Harald Sitter 2024-01-18 08:28:22 UTC
I see. How would that work though?
Comment 6 Nate Graham 2024-01-19 00:31:17 UTC
Some kind of low-level always running helper I guess. It could respond to a Ctrl+Alt+Delete (or whatever) and spawn a new kwin_wayland instance and System monitor. Or something like that.
Comment 7 Harald Sitter 2024-01-19 12:47:19 UTC
(In reply to Nate Graham from comment #6)
> and spawn 

Right, that's the problem. We cannot spawn anything; the system is out of memory. We cannot do anything that would allocate any new memory. That likely includes shortcut handling, which would probably do allocations somewhere under the hood. That is on top of the fact that the kernel also must allow a theoretical task manager to even do any work, which I am not convinced it will, since the kernel is busy trying to fix the OOM situation.
Comment 8 Nate Graham 2024-01-19 17:08:36 UTC
Right, if we have to launch something new, it wouldn't help in the OOM case. But it would help in the case of a graphical freeze.
Comment 9 Harald Sitter 2024-01-19 17:10:08 UTC
(In reply to Nate Graham from comment #8)
> But it would help in the case of a graphical freeze.

Isn't that covered by Ctrl-Esc? Or, it should anyway ^^
Comment 10 Nate Graham 2024-01-19 18:07:22 UTC
I've had freezes where the window killer UI didn't appear. Probably because it was KWin that was frozen.
Comment 11 Harald Sitter 2024-01-19 18:25:28 UTC
When kwin (wayland) freezes we can't display anything I think.
Comment 12 Nate Graham 2024-01-19 19:07:04 UTC
We could kill it and spawn a new one.
Comment 13 Harald Sitter 2024-01-19 19:39:14 UTC
We don't get keyboard input if kwin is stuck, do we?
Comment 14 Harald Sitter 2024-01-19 19:48:27 UTC
On second thought, we might get them from libinput directly (assuming appropriate permissions on the input devices) so we'd have to write some sort of helper that detects a shortcut and then restarts kwin. I am not sure a shortcut to restart kwin is all that meaningful though?
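
A rough sketch of what such a helper's shortcut detection could look like, assuming key events arrive as text lines on stdin. The function name `detect_chord` is hypothetical, and the line format (key name followed by `pressed` or `released`) is an assumption about what `libinput debug-events --show-keycodes` prints; it may differ between libinput versions. What to do on a trigger is deliberately left out.

```shell
#!/bin/sh
# Read key event lines on stdin and print "TRIGGER" each time Delete is
# pressed while a Ctrl key and an Alt key are held.
# ASSUMPTION: each line contains the key name and then "pressed" or
# "released", e.g. "KEY_LEFTCTRL (29) pressed".
# Simplification: left and right modifiers share one flag, so releasing
# either Ctrl key clears the Ctrl state.
detect_chord() {
    awk '
        /KEY_(LEFT|RIGHT)CTRL .*pressed/  { ctrl = 1 }
        /KEY_(LEFT|RIGHT)CTRL .*released/ { ctrl = 0 }
        /KEY_(LEFT|RIGHT)ALT .*pressed/   { alt = 1 }
        /KEY_(LEFT|RIGHT)ALT .*released/  { alt = 0 }
        /KEY_DELETE .*pressed/ && ctrl && alt { print "TRIGGER"; fflush() }
    '
}

# Possible wiring (not run here; needs read access to the input devices,
# and restart_compositor is a hypothetical, undefined action):
# libinput debug-events --show-keycodes | detect_chord | while read -r _; do
#     restart_compositor
# done
```

Even in this form the helper has to be running before the freeze happens, since nothing new can be started reliably once the system is stuck.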
Comment 15 Henning 2024-01-29 03:11:00 UTC
Hey, yes @Nate, that is what I meant. A process that allows opening a window that graphically kills apps.

Maybe this would be an extra KWin with the interface, just launching this program? I don't know how Windows does it, but it always works, which is really needed.

Also, I don't understand why plasmashell would freeze if Firefox freezes. Why isn't the system more privileged?
Comment 16 Henning 2024-02-05 15:41:26 UTC
For example, I am currently building Firefox from source, and even before any compilation the damn Mercurial is using 23% of CPU and the Plasma panel is just not reacting.

This is what I mean by "plasmashell is randomly freezing on load"
Comment 17 Harald Sitter 2024-02-05 17:48:06 UTC
You should take that up with Fedora. It sounds like your IO scheduler is ill equipped to deal with your work load.
Comment 18 Harald Sitter 2024-02-05 18:20:09 UTC
Speaking of which, what are your schedulers?

`tail -n +1 /sys/block/*/queue/scheduler`
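
Each of those files lists the available I/O schedulers on one line, with the active one in square brackets, so `[none] mq-deadline kyber bfq` means no scheduler is in use for that device. A small sketch that prints only the active entry per device (the function name `active_scheduler` is made up for illustration):

```shell
#!/bin/sh
# Extract the active scheduler (the bracketed entry) from one line of
# /sys/block/<dev>/queue/scheduler, e.g. "none [mq-deadline] kyber bfq".
active_scheduler() {
    printf '%s\n' "$1" | sed -n 's/.*\[\([^]]*\)].*/\1/p'
}

# Print "<device>: <active scheduler>" for every block device that exposes one.
for f in /sys/block/*/queue/scheduler; do
    [ -r "$f" ] || continue
    dev=${f#/sys/block/}
    dev=${dev%%/*}
    printf '%s: %s\n' "$dev" "$(active_scheduler "$(cat "$f")")"
done
```

The per-device prefix matters here: a single line without a device name does not say which disk it belongs to.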
Comment 19 Harald Sitter 2024-02-05 19:11:10 UTC
Also, are you on wayland or x11?

And what's the output of `kinfo`?
Comment 20 Henning 2024-02-07 00:53:26 UTC
[none] mq-deadline kyber bfq 

Operating System: Fedora Linux 39
KDE Plasma Version: 5.27.10
KDE Frameworks Version: 5.113.0
Qt Version: 5.15.12
Kernel Version: 6.6.14-200.fc39.x86_64 (64-bit)
Graphics Platform: offscreen
Processors: 4 × AMD Ryzen 5 PRO 3500U w/ Radeon Vega Mobile Gfx
Memory: 5.7 GiB of RAM
Graphics Processor: AMD Radeon Vega 8 Graphics

systemd-oomd-defaults-254.8-2.fc39.noarch

thanks!
Comment 21 Harald Sitter 2024-02-07 12:03:15 UTC
Wayland or X11?

Please paste the entire output of the tail command.
Comment 22 Harald Sitter 2024-02-07 12:03:26 UTC
.
Comment 23 Harald Sitter 2024-02-07 12:05:58 UTC
The output of mount may also be useful
Comment 24 Henning 2024-02-08 21:21:54 UTC
Sorry, I should have used my own sysinfo tool; kinfo seems to have a bug. Wayland, always.

What tail command? I didn't use any.
Comment 25 Harald Sitter 2024-02-09 11:08:15 UTC
Please post the output of 

tail -n +1 /sys/block/*/queue/scheduler
tail -n +1 /sys/block/*/queue/rotational
ls /sys/block/*/mq
mount
lsblk
sudo lshw -class disk -class storage
grep -C 5 -r queue/scheduler /usr/lib/udev

When your system is idle what is the output of 

iostat

When you clone firefox what is the output of

iostat
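
To make it easier to paste everything in one go, the commands above could be wrapped so that each output is labelled with the command that produced it. This is only a convenience sketch; the function name `collect` and the file name `debug-info.txt` are arbitrary.

```shell
#!/bin/sh
# Run each command string, printing a "### <command>" header before its output.
# A failing or missing command is noted instead of aborting the whole run.
collect() {
    for cmd in "$@"; do
        printf '### %s\n' "$cmd"
        sh -c "$cmd" 2>&1 || printf '(failed or not installed)\n'
        printf '\n'
    done
}

# Usage (not run here):
# collect \
#     'tail -n +1 /sys/block/*/queue/scheduler' \
#     'tail -n +1 /sys/block/*/queue/rotational' \
#     'ls /sys/block/*/mq' \
#     'mount' \
#     'lsblk' \
#     'sudo lshw -class disk -class storage' \
#     'grep -C 5 -r queue/scheduler /usr/lib/udev' \
#     'iostat' \
#     > debug-info.txt
```

The `iostat` entry would need to be captured twice, once while idle and once during the clone, so the two runs are best written to separate files.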
Comment 26 Henning 2024-02-12 13:42:09 UTC
Created attachment 165779 [details]
the requested debug info
Comment 27 Harald Sitter 2024-02-12 14:17:39 UTC
When your system is idle what is the output of 

iostat

When you clone firefox what is the output of

iostat
Comment 28 Henning 2024-02-13 22:00:34 UTC
iostat is not installed on my system but I can layer it.
Comment 29 Bug Janitor Service 2024-02-28 03:46:32 UTC
Dear Bug Submitter,

This bug has been in NEEDSINFO status with no change for at least
15 days. Please provide the requested information as soon as
possible and set the bug status as REPORTED. Due to regular bug
tracker maintenance, if the bug is still in NEEDSINFO status with
no change in 30 days the bug will be closed as RESOLVED > WORKSFORME
due to lack of needed information.

For more information about our bug triaging procedures please read the
wiki located here:
https://community.kde.org/Guidelines_and_HOWTOs/Bug_triaging

If you have already provided the requested information, please
mark the bug as REPORTED so that the KDE team knows that the bug is
ready to be confirmed.

Thank you for helping us make KDE software even better for everyone!
Comment 30 Henning 2024-02-28 22:09:06 UTC
I did not have issues like this. This is a "wishlist" feature request though, so I don't think it is relevant.
Comment 31 Henning 2024-02-28 22:11:01 UTC
Sorry, I meant "I didn't have situations like this since reporting the FR", at least none that were easily reproducible. I will use iostat if I run into such a situation, but in general such a tool is really, really needed for random edge cases where bugs simply occur.