Bug 389028 - KDE randomly locks up
Summary: KDE randomly locks up
Status: RESOLVED NOT A BUG
Alias: None
Product: plasmashell
Classification: Plasma
Component: Containment (other bugs)
Version First Reported In: 5.8.8
Platform: Mint (Debian based) Linux
Importance: NOR crash
Target Milestone: 1.0
Assignee: Sebastian Kügler
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2018-01-15 22:01 UTC by bkorb
Modified: 2018-01-25 23:15 UTC
CC List: 2 users

See Also:
Latest Commit:
Version Fixed/Implemented In:
Sentry Crash Report:


Description bkorb 2018-01-15 22:01:08 UTC
This has gone on for a while through several revisions of KDE, so I am now reporting it. It happens at random times; the only consistency is that browsers are running JavaScript.

Symptoms:
* mouse works
* audio will continue until the buffered data are exhausted
* nothing else appears to respond
* reboot hot keys do not work, and neither does the power button's reboot signal. (Hold it long enough and power gets pulled, but I'm trying to trigger an "init 6" or "init 0".)
* ssh does work, and "init 4" followed by "init 5" will successfully restart the desktop. Unfortunately, SSH is considered a security hole and it is a laborious task to figure out how to re-enable it with each OS update.

In the end, I press "reset". Afterwards there is a leftover glitch: the task bar will not auto-hide, and changing its setting does no good. After several reboots, the task bar goes back to auto-hiding.

========

I would be more than happy to help diagnose the problem. I am willing to re-enable ssh and ssh in to try to determine what is causing KDE to hang up. Just tell me what you need me to do.  Thank you.
Comment 1 bkorb 2018-01-16 16:29:27 UTC
Oh, wait, another point: I actually can get this to happen regularly. Yesterday, it happened while I was actively using the system. However, if I leave the computer and come back half an hour later, it will, somewhat more likely than not, be locked up. I can prevent that, of course, by typing "init 4" before I leave and "init 5" when I return, but this should not be happening.
Comment 2 David Edmundson 2018-01-20 21:35:04 UTC
And you don't get this with openbox?
Anything in "dmesg" after it freezes?
Comment 3 bkorb 2018-01-25 00:20:44 UTC
The dmesg buffer is reset by a reboot. I currently cannot ssh in to run dmesg because sshd started refusing connections for reasons I cannot explain.

Anyway, /var/log/kern.log contains these 5 entries before the last lockup (8 minutes ago):

    Jan 24 16:07:35 bach kernel: [28972.823404] nouveau 0000:01:00.0: gr: TRAP ch 8 [007f6be000 chrome[2903]]
    Jan 24 16:07:35 bach kernel: [28972.823428] nouveau 0000:01:00.0: gr: GPC0/TPC0/MP trap: global 00000004 [MULTIPLE_WARP_ERRORS] warp 3f000d [OOR_REG]
    Jan 24 16:07:35 bach kernel: [28972.823437] nouveau 0000:01:00.0: gr: GPC0/TPC1/MP trap: global 00000004 [MULTIPLE_WARP_ERRORS] warp 3f000d [OOR_REG]
    Jan 24 16:07:39 bach kernel: [28977.119555] nouveau 0000:01:00.0: fifo: SCHED_ERROR 0a [CTXSW_TIMEOUT]
    Jan 24 16:07:39 bach kernel: [28977.119564] nouveau 0000:01:00.0: fifo: gr engine fault on channel 2, recovering...

This is new information to me: Chrome might be triggering a graphics engine issue. I'll do some Googling to see what might be amiss. Still, if KDE could find a way to capture some four-fingered salutes, it would be nice (Ctrl-Alt-Shift-PageUp/PageDown/SomethingOrAnotherToSayRestartKDE).

I'll post something after my Google session.
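
For anyone else chasing the same symptom, here is a rough Python sketch for pulling the nouveau fault lines out of a kernel log. The function name `find_nouveau_faults` and the list of marker strings are my own assumptions, taken only from the five lines quoted above; other nouveau messages may well look different.

```python
import re

# Matches kernel log lines emitted by the nouveau driver, for example:
# "Jan 24 16:07:39 bach kernel: [28977.119555] nouveau 0000:01:00.0: fifo: SCHED_ERROR 0a [CTXSW_TIMEOUT]"
NOUVEAU_LINE = re.compile(
    r"kernel: \[\s*(?P<uptime>\d+\.\d+)\] nouveau (?P<device>\S+): "
    r"(?P<unit>\w+): (?P<message>.*)"
)

# Substrings seen in the excerpt above. Illustrative only, not an exhaustive list.
FAULT_MARKERS = ("TRAP", "trap", "SCHED_ERROR", "engine fault")


def find_nouveau_faults(lines):
    """Return (uptime_seconds, unit, message) for each nouveau fault line."""
    faults = []
    for line in lines:
        match = NOUVEAU_LINE.search(line)
        if match and any(marker in match["message"] for marker in FAULT_MARKERS):
            faults.append(
                (float(match["uptime"]), match["unit"], match["message"].rstrip())
            )
    return faults
```

Passing an open file object for /var/log/kern.log as `lines` should work, since the function only iterates over lines. Reading that file may require root or membership in a log-reading group, depending on the distribution.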
Comment 4 bkorb 2018-01-25 00:37:35 UTC
This issue has been known for at least 2 years.
Here's a link to the freedesktop.org bug tracker:

https://bugs.freedesktop.org/show_bug.cgi?id=93629

The cause is still unknown, but I'm guessing that if I dump the "nouveau" c**p and use the Nvidia driver directly instead, I'll be happy again.
Comment 5 bkorb 2018-01-25 23:15:56 UTC
I can now confirm it is a nouveau driver issue. Since I switched to the Nvidia driver, the desktop has been stable.

However, despite the fact the lockup is triggered by a graphics engine fault (and you-all can't do much about that), there are some things KDE can do to make life easier on the victims.

Presumably, I cannot use Ctrl-Alt-Shift-F1/PageUp/PageDown because keyboard events are all delivered to a non-responsive thread. You could add a watchdog timer thread that would restart KDE after a sufficiently long time. That is, if keyboard events remain unprocessed for a few seconds, it's likely something pretty bad is going on.
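
To make the watchdog idea concrete, here is a minimal Python sketch of the pattern. It is not KDE or Plasma code: the `Watchdog` class, its `kick()` method, and the `on_timeout` callback are hypothetical names of mine, and the callback stands in for whatever "restart the desktop" would actually mean. The event loop would call `kick()` each time it processes input; if no kick arrives within the timeout, the callback fires once.

```python
import threading
import time


class Watchdog:
    """Call `on_timeout` once if `kick()` is not called within `timeout` seconds."""

    def __init__(self, timeout, on_timeout):
        self._timeout = timeout
        self._on_timeout = on_timeout  # placeholder for a real recovery action
        self._last_kick = time.monotonic()
        self._stop = threading.Event()
        self._thread = threading.Thread(target=self._run, daemon=True)

    def start(self):
        self._last_kick = time.monotonic()
        self._thread.start()

    def kick(self):
        """Record that the monitored loop is still making progress."""
        self._last_kick = time.monotonic()

    def stop(self):
        """Stop the watchdog; call this from a thread other than the watchdog's own."""
        self._stop.set()
        self._thread.join()

    def _run(self):
        # Poll at a fraction of the timeout so a stall is noticed promptly.
        while not self._stop.wait(self._timeout / 4):
            if time.monotonic() - self._last_kick > self._timeout:
                self._on_timeout()
                return
```

The watchdog has to live somewhere that keeps running when the desktop stalls. Since ssh still worked during my lockups, a separate thread or process seems plausible here, but I have not tested that against a wedged GPU channel.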