Bug 389545 - Inform about critical system messages, e.g. CPU cooling problem
Status: CONFIRMED
Alias: None
Product: plasmashell
Classification: Plasma
Component: general
Version: master
Platform: Other Linux
Importance: NOR wishlist
Target Milestone: 1.0
Assignee: David Edmundson
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2018-01-28 10:27 UTC by Gregor Mi
Modified: 2024-07-09 19:03 UTC

See Also:
Latest Commit:
Version Fixed In:
Sentry Crash Report:


Attachments

Description Gregor Mi 2018-01-28 10:27:32 UTC
Today, I ran into a CPU cooling problem on my laptop. I needed to find out a bunch of things to track the problem down. Maybe this report can help to implement something into Plasma to inform and warn users about similar problems. I use openSUSE Tumbleweed 20180109.

### Suddenly the system shuts down

I was compiling KDE with kdesrc-build, and after 15 minutes or so the system shut down. I restarted the system and ran kdesrc-build again. After a few minutes the system shut down again. I tried once more, and this time the system shut down after an even shorter time. This is not normal, so I started to investigate.

### Investigate shutdown cause with journalctl

I ran $ sudo journalctl -a
because I read that this is the place to start looking for problems. I saw that there is a huge number of messages, and scrolling to the end took such a long time that I aborted the command.

With $ sudo journalctl -S 2018-01-28
I could narrow the search down. Looking through the messages I found this:

    Jan 28 09:38:50 linux-vu7g kernel: thermal thermal_zone0: critical temperature reached (89 C), shutting down
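
The search can be narrowed further. `journalctl -e` jumps straight to the end of the pager, and `-k -b -1` restricts the output to kernel messages from the previous boot (this needs a persistent journal). The following is a minimal sketch of a filter that keeps only thermal shutdown lines. The function name `thermal_events` is made up for illustration, and the pattern is based only on the single log line quoted above, so other kernel versions may word the message differently.

```shell
# Keep only kernel thermal-shutdown lines from journal-style text on stdin.
# The pattern is an assumption derived from one observed log line.
thermal_events() {
    grep -E 'kernel: thermal .*critical temperature reached'
}

# Possible use (previous boot, kernel messages only):
#   sudo journalctl -k -b -1 --no-pager | thermal_events
```

Reading from stdin keeps the filter usable on a saved log file as well as on live `journalctl` output.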

### Use KSystemLog

Later I tried to find the error using KSystemLog, but that was not possible because it didn't show enough messages. I wasn't able to configure it to show more messages. I found the setting "Maximum lines displayed" and increased it from 1000 to 5000, but that didn't have any effect. I will file a separate bug report for this later.
I wonder which GUI tool is currently recommended for Plasma to read the system log history?

### Find out the meaning of the critical temperature message

I searched the web about the thermal error message from the log and found this:

"This is a really serious message. The computer only does this when there's a cooling problem. Under no circumstance the temperature should reach values this high. This immediate shutdown is an action triggered by the thermal sensor that operates independent of the operating system. It prevents the processor from getting damaged beyond repair. The bottomline is you can't prevent this protection measure and you should not ever want to do this if it had been possible. What you should do first now is checking what's wrong with cooling and solve the problem. I've experienced this problem a few years ago and it turned out to be the paste between the heatsink and the processor." (https://unix.stackexchange.com/questions/212628/critical-temperature-reached-dont-shut-down)

So, two questions arise:

1) Is this really a hardware problem? I thought the CPU is throttled automatically to avoid reaching a fatal temperature.
2) How can I monitor the CPU temperature to avoid a sudden shutdown?

### Monitor CPU temperature

I found some hints for tools to monitor the CPU temperature here: https://askubuntu.com/questions/15832/how-do-i-get-the-cpu-temperature

This command can be used without installing any new tool:

$ cat /sys/class/thermal/thermal_zone0/temp

which shows one number, e.g. 61000. The value is in millidegrees Celsius, so this means 61 °C.

I used this to watch it:

$ watch cat /sys/class/thermal/thermal_zone0/temp
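
A machine can expose several thermal zones, not just `thermal_zone0`. As a rough sketch, the helpers below convert the raw reading to degrees and list every zone together with its `type` file. The function names `millideg_to_c` and `list_zones` are hypothetical, and the conversion assumes non-negative readings.

```shell
# Convert a sysfs reading in millidegrees Celsius to degrees with one decimal.
# Integer arithmetic only, so the output does not depend on the locale.
# Assumes a non-negative value.
millideg_to_c() {
    printf '%d.%d\n' "$(( $1 / 1000 ))" "$(( ($1 % 1000) / 100 ))"
}

# Print "<zone> <type> <temperature>" for every thermal zone below a
# sysfs-like root directory (default: /sys/class/thermal).
list_zones() {
    root=${1:-/sys/class/thermal}
    for zone in "$root"/thermal_zone*; do
        [ -r "$zone/temp" ] || continue   # skip zones without a readable temp
        printf '%s %s %s\n' "${zone##*/}" "$(cat "$zone/type")" \
            "$(millideg_to_c "$(cat "$zone/temp")")"
    done
}
```

With these functions defined in a script named, say, `zones.sh` (a made-up name) that ends by calling `list_zones`, `watch -n 2 sh zones.sh` would refresh the list every two seconds.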

I also installed the package hardinfo which also installs the sensors package.
Hardinfo is supposed to also show the CPU temperature, but the corresponding fields were empty.

The command

$ sensors

however shows some useful information on the console, among others this one:

acpitz-virtual-0
Adapter: Virtual device
temp1:        +61.0°C  (crit = +89.0°C)
temp2:        +61.0°C  (crit = +89.0°C)

Note that the critical temperature is also shown. Other suggested GUI tools like psensor or xsensors were not available in my software repos, so I didn't try those.

For Plasma I found these in the KDE store:

- "Simple System Monitor" (https://store.kde.org/p/1173509/) which among other things shows the CPU temperature. Works out of the box.
- "Thermal Monitor" (https://store.kde.org/p/998915/). Has to be configured manually.

Is there also a built-in Plasma tool which is recommended for watching the CPU temperature?

### Way forward

Next, I will read this article to find out what is wrong with my laptop: https://itsfoss.com/reduce-overheating-laptops-linux/

As indicated at the beginning, it would be nice if Plasma could tell the user - e.g. after the system has started up - if there are any critical log messages that require the user's attention.
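
To illustrate the idea, here is a minimal sketch of such a check after login. It assumes a persistent journal, that the relevant messages were logged at priority `crit` or higher, and that `notify-send` is available. The function name `summarize_critical` and the notification text are invented for this example; a real Plasma implementation would more likely live in a daemon than in a shell script.

```shell
# Summarize high-priority journal lines read from stdin.
# Prints one summary line; prints nothing and returns 1 if stdin is empty.
summarize_critical() {
    lines=$(cat)
    [ -n "$lines" ] || return 1
    count=$(printf '%s\n' "$lines" | grep -c '')   # number of lines
    last=$(printf '%s\n' "$lines" | tail -n 1)     # most recent message
    printf '%s high-priority message(s) in the previous boot; last: %s\n' \
        "$count" "$last"
}

# Hypothetical login-time hook:
#   msg=$(journalctl -q -b -1 -p crit --no-pager -o cat | summarize_critical) \
#       && notify-send -u critical "System log" "$msg"
```

Showing only a count and the last message keeps the notification short; the user can then open the full log for details.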
Comment 1 Kai Uwe Broulik 2018-02-05 16:04:04 UTC
We could perhaps have a daemon similar to kwrited or freespacenotifier that shows a warning when this happens. No idea how to access this information, though.
Comment 2 Unknown 2022-05-28 16:37:58 UTC
It would be nice if KDE could inform me about high or very high temperatures of the CPU/GPU, and maybe of SSD drives too.
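
One possible source for this information is the kernel's thermal sysfs interface, where a zone can expose `trip_point_*_type` and `trip_point_*_temp` files next to `temp`. The sketch below warns when a zone gets close to its critical trip point. It assumes the zone actually exposes a trip point of type `critical` (not every driver does); the function names and the 10 °C default margin are arbitrary choices for illustration, and it covers thermal zones only, not GPU or SSD sensors.

```shell
# Print the critical trip temperature (millidegrees C) of a thermal zone
# directory; return 1 if the zone has no trip point of type "critical".
critical_trip() {
    for type_file in "$1"/trip_point_*_type; do
        [ -r "$type_file" ] || continue
        if [ "$(cat "$type_file")" = "critical" ]; then
            cat "${type_file%_type}_temp"
            return 0
        fi
    done
    return 1
}

# Succeed when the zone's current temperature is within a margin
# (millidegrees C, default 10000) of its critical trip point.
near_critical() {
    zone=$1 margin=${2:-10000}
    crit=$(critical_trip "$zone") || return 1
    temp=$(cat "$zone/temp")
    [ "$temp" -ge "$(( crit - margin ))" ]
}

# Hypothetical polling step:
#   for z in /sys/class/thermal/thermal_zone*; do
#       near_critical "$z" && notify-send -u critical "High temperature" \
#           "${z##*/} is close to its critical trip point"
#   done
```

Comparing against the zone's own trip point rather than a fixed number avoids hard-coding a limit such as the 89 °C seen in this report, which differs between machines.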