| Summary: | Sensor system monitor prints NVMe drive temperature of ~50°C in text but draws graph at absolute zero / 0K / -273°C | ||
|---|---|---|---|
| Product: | [Applications] plasma-systemmonitor | Reporter: | Dennis Schridde <heri+kde> |
| Component: | general | Assignee: | KSysGuard Developers <ksysguard-bugs> |
| Status: | RESOLVED UPSTREAM | ||
| Severity: | normal | CC: | ahiemstra, nate, plasma-bugs-null |
| Priority: | NOR | ||
| Version First Reported In: | 5.23.3 | ||
| Target Milestone: | --- | ||
| Platform: | Gentoo Packages | ||
| OS: | Linux | ||
| Latest Commit: | Version Fixed/Implemented In: | ||
| Sentry Crash Report: | |||
| Attachments: | screenshot of NVMe drive temperature sensor widget; screenshot of NVMe drive temperature sensor settings screen | ||
||
|
Description
Dennis Schridde
2021-11-10 21:51:48 UTC
P.S. I will need some instructions on what information you will need to debug this.

One interesting thing here is that there's two separate decimal separators in the screenshot, so there seems to be something weird going on, one possibility is that something fails to convert from string to number, uses a default value and then we end up displaying that. It would be helpful if you can share the output of the "sensors" command. Additionally, the output of "localectl" might help.

(In reply to Arjen Hiemstra from comment #2)
> One interesting thing here is that there's two separate decimal separators
> in the screenshot, so there seems to be something weird going on, one
> possibility is that something fails to convert from string to number, uses a
> default value and then we end up displaying that. It would be helpful if you
> can share the output of the "sensors" command. Additionally, the output of
> "localectl" might help.

❯ sensors
amdgpu-pci-0a00
Adapter: PCI adapter
vddgfx: N/A
vddnb: N/A
edge: +36.0°C

nvme-pci-0900
Adapter: PCI adapter
Composite: +44.9°C (low = -273.1°C, high = +83.8°C)
(crit = +83.8°C)
Sensor 1: +44.9°C (low = -273.1°C, high = +65261.8°C)
Sensor 2: +63.9°C (low = -273.1°C, high = +65261.8°C)

amdgpu-pci-0100
Adapter: PCI adapter
vddgfx: 706.00 mV
fan1: 1490 RPM (min = 0 RPM, max = 3500 RPM)
edge: +29.0°C (crit = +94.0°C, hyst = -273.1°C)
power1: 6.23 W (cap = 48.00 W)

k10temp-pci-00c3
Adapter: PCI adapter
Tctl: +36.9°C

❯ localectl
System Locale: LANG=en_GB.utf8
VC Keymap: n/a
X11 Layout: n/a

(In reply to Dennis Schridde from comment #3)
> nvme-pci-0900
> Adapter: PCI adapter
> Composite: +44.9°C (low = -273.1°C, high = +83.8°C)
> (crit = +83.8°C)
> Sensor 1: +44.9°C (low = -273.1°C, high = +65261.8°C)
> Sensor 2: +63.9°C (low = -273.1°C, high = +65261.8°C)

So it seems that the display is actually correct, just that the numbers reported by the sensor are completely bonkers.

Created attachment 143898
screenshot of NVMe drive temperature sensor settings screen
I disabled the "Automatic Y Data Range" checkbox in the "Data Ranges" section of the settings screen. This resolved the issue: the temperature graphs are now displayed in a useful way. I think this is also the only reliable and correct way to handle such a situation.
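To illustrate why the automatic range flattens the graph, here is a minimal sketch that assumes the chart maps values linearly between the reported minimum and maximum. The function name `relative_height` and the 0°C to 100°C fixed range are hypothetical choices for the example; the sensor values are the "Sensor 1" readings from the `sensors` output.

```python
def relative_height(value, y_min, y_max):
    """Return where value falls between y_min and y_max as a fraction of the axis."""
    if y_max <= y_min:
        raise ValueError("y_max must be greater than y_min")
    return (value - y_min) / (y_max - y_min)


# Sensor 1: current +44.9°C, reported low -273.1°C, reported high +65261.8°C
auto = relative_height(44.9, -273.1, 65261.8)

# Hypothetical manually chosen range of 0°C to 100°C
fixed = relative_height(44.9, 0.0, 100.0)

print(f"automatic range: {auto:.2%} of the graph height")
print(f"fixed range:     {fixed:.2%} of the graph height")
```

With the reported limits, the current value sits at under half a percent of the axis height, which on screen is indistinguishable from the minimum of -273.1°C. With a fixed range it lands near the middle of the graph.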
Who would I report the wrong min/max values of the NVMe temperature sensor to? Is that something lm-sensors needs to fix, or do I report it to the Linux kernel developers? If so, which one?
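One possible reading of these limits, offered as an assumption rather than a confirmed diagnosis: they look like 16-bit Kelvin values converted to Celsius, with 0 K and 65535 K (0xFFFF) standing in for thresholds that were never set. The arithmetic below checks that this agrees with the printed numbers to within the one-decimal precision shown by `sensors`; the helper name is hypothetical.

```python
ABSOLUTE_ZERO_C = -273.15  # 0 K expressed in degrees Celsius


def kelvin_to_celsius(kelvin):
    """Convert a temperature in Kelvin to degrees Celsius."""
    return kelvin + ABSOLUTE_ZERO_C


low = kelvin_to_celsius(0)        # smallest unsigned 16-bit value
high = kelvin_to_celsius(0xFFFF)  # largest unsigned 16-bit value (65535)

print(f"low  = {low:.2f} °C")
print(f"high = {high:.2f} °C")
```

If this reading is right, the limits are placeholder values passed through unchanged rather than a formatting error in the `sensors` tool, but that would need to be confirmed with whoever maintains the component that produces them.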
What KDE Plasma could do: display a warning in the settings if the automatically detected range seems unreasonable. If the actual value is visualised at the same pixel height as one of the min/max values from the sensor, and/or the range spans thousands of degrees Celsius, that is probably not what the user intended. If there is nothing you can do in this area, I suggest closing this report.
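The suggested warning could be sketched as a simple plausibility check. This only illustrates the idea: the function name and both thresholds (a 1000-degree span and a 1% edge margin) are assumptions chosen for the example, not anything taken from plasma-systemmonitor.

```python
def range_looks_unreasonable(value, y_min, y_max, max_span=1000.0, edge_margin=0.01):
    """Heuristic: flag a temperature range that would hide the current value.

    max_span and edge_margin are illustrative thresholds, not real settings.
    """
    span = y_max - y_min
    if span <= 0:
        return True  # empty or inverted range cannot be plotted sensibly
    if span > max_span:
        return True  # range spans more than max_span degrees
    # Fraction of the axis height at which the current value would be drawn
    position = (value - y_min) / span
    return position < edge_margin or position > 1.0 - edge_margin


# Sensor 1 limits from the sensors output: flagged
print(range_looks_unreasonable(44.9, -273.1, 65261.8))
# Composite limits from the sensors output: not flagged
print(range_looks_unreasonable(44.9, -273.1, 83.8))
```

A span threshold like this would only make sense for temperatures; sensors of other kinds may legitimately report much larger ranges.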
Most likely it's the driver that's reporting those values. And now the different separators make sense, since in one case they're thousands separators, not decimal separators. Handling that on the system monitor side would be tricky: while such a range doesn't necessarily make much sense for a temperature graph, other values should be able to have even larger ranges. Using a fixed range is a simpler solution here. So I'll mark this as resolved, since this is an upstream issue. |