SUMMARY
*** NOTE: If you are reporting a crash, please try to attach a backtrace with debug symbols. See https://community.kde.org/Guidelines_and_HOWTOs/Debugging/How_to_create_useful_crash_reports ***

STEPS TO REPRODUCE
1. Use AMD Threadripper 2950x CPU (likely also affects other models, but I am not sure which; likely most Ryzen and possibly AMD FX series processors as well)
2. Open plasma-systemmonitor
3. Try to add CPU temperature to something.

OBSERVED RESULT
Tctl/Tdie are not there, CPU Min, Max, and Average all show 0.

EXPECTED RESULT
CPU temps to be shown

SOFTWARE/OS VERSIONS
Linux/KDE Plasma: 6.0.2 (available in About System)
KDE Plasma Version: 6.0.2
KDE Frameworks Version: 6.0.0
Qt Version: 6.0.2

ADDITIONAL INFORMATION
Was told to make a new bug for my specific issue. Probably related to id=445917 and id=452763
Qt version is 6.6.2 not 6.0.2
*** This bug has been marked as a duplicate of bug 474766 ***
https://bugs.kde.org/show_bug.cgi?id=474766 Still have this issue.
Still not sure why k10temp is blacklisted. I need that for my CPU temps. Also not sure why this issue keeps getting closed as a dupe of another ( different ) issue, or as resolved. It's not, and I only posted in the other bug thread because Nate closed this issue as a dupe of that one. Now he has locked that thread, so back to this one I suppose. It is hopefully clear by now that this is a separate issue. The issue is k10temp device is blacklisted in the System Monitor. Without that I cannot see my CPU temps.
I'm having a similar issue on a Threadripper 1950X. All CPU temperatures (min, avg, max, as well as all per-core temperatures) are reported as 0. If I remember correctly, it started with the update to Plasma 6. It worked correctly on Plasma 5.

KDE Plasma Version: 6.1.2
KDE Framework Version: 6.3.0
Qt Version: 6.7.2

The sensors themselves are reporting the correct temperatures, as shown by the "sensors" command:

k10temp-pci-00c3
Adapter: PCI adapter
Tctl: +67.8°C
Tdie: +40.8°C
Tccd1: +68.0°C

k10temp-pci-00cb
Adapter: PCI adapter
Tctl: +67.8°C
Tdie: +40.8°C
Created attachment 171663 [details] sensors-detect
It's not a duplicate of bug 474766. The sensors you assume to be there do not exist on a Threadripper 1920X: the sensor mentioned in bug 474766, "Hardware Sensors/coretemp-isa-0000/Core 0", does not exist on my system. Where do you find the mysterious sensor that should show my CPU temperature when lm-sensors can't find it? For the last 5 years, Tctl and Tdie were the only option.

Below is the full list of all sensors in my system after running sensors-detect with yes to all. The sensors-detect output is attached.

~ # sensors
nct6779-isa-0290
Adapter: ISA adapter
Vcore: 424.00 mV (min = +0.00 V, max = +1.74 V)
in1: 1.08 V (min = +0.00 V, max = +0.00 V) ALARM
AVCC: 3.30 V (min = +0.00 V, max = +0.00 V) ALARM
+3.3V: 3.30 V (min = +0.00 V, max = +0.00 V) ALARM
in4: 1.84 V (min = +0.00 V, max = +0.00 V) ALARM
in5: 912.00 mV (min = +0.00 V, max = +0.00 V) ALARM
in6: 1.37 V (min = +0.00 V, max = +0.00 V) ALARM
3VSB: 3.44 V (min = +0.00 V, max = +0.00 V) ALARM
Vbat: 3.25 V (min = +0.00 V, max = +0.00 V) ALARM
in9: 0.00 V (min = +0.00 V, max = +0.00 V)
in10: 832.00 mV (min = +0.00 V, max = +0.00 V) ALARM
in11: 864.00 mV (min = +0.00 V, max = +0.00 V) ALARM
in12: 1.67 V (min = +0.00 V, max = +0.00 V) ALARM
in13: 920.00 mV (min = +0.00 V, max = +0.00 V) ALARM
in14: 872.00 mV (min = +0.00 V, max = +0.00 V) ALARM
fan1: 2909 RPM (min = 0 RPM)
fan2: 910 RPM (min = 0 RPM)
fan3: 0 RPM (min = 0 RPM)
fan4: 1070 RPM (min = 0 RPM)
fan5: 0 RPM (min = 0 RPM)
SYSTIN: +29.0°C (high = +0.0°C, hyst = +0.0°C) ALARM sensor = thermistor
CPUTIN: +31.0°C (high = +80.0°C, hyst = +75.0°C) sensor = thermistor
AUXTIN0: +7.0°C sensor = thermistor
AUXTIN1: +35.0°C sensor = thermistor
AUXTIN2: +33.0°C sensor = thermistor
AUXTIN3: +33.0°C sensor = thermistor
SMBUSMASTER 0: +56.5°C
PCH_CHIP_CPU_MAX_TEMP: +0.0°C
PCH_CHIP_TEMP: +0.0°C
PCH_CPU_TEMP: +0.0°C
PCH_MCH_TEMP: +0.0°C
PCH_DIM0_TEMP: +0.0°C
TSI0_TEMP: +56.8°C
intrusion0: ALARM
intrusion1: ALARM
beep_enable: disabled

k10temp-pci-00cb
Adapter: PCI adapter
Tctl: +56.8°C
Tdie: +29.8°C

nvme-pci-4100
Adapter: PCI adapter
Composite: +43.9°C (low = -273.1°C, high = +84.8°C) (crit = +84.8°C)
Sensor 1: +43.9°C (low = -273.1°C, high = +65261.8°C)
Sensor 2: +47.9°C (low = -273.1°C, high = +65261.8°C)

iwlwifi_1-virtual-0
Adapter: Virtual device
temp1: N/A

amdgpu-pci-0a00
Adapter: PCI adapter
vddgfx: 680.00 mV
fan1: 511 RPM (min = 0 RPM, max = 3600 RPM)
edge: +37.0°C (crit = +100.0°C, hyst = -273.1°C) (emerg = +105.0°C)
junction: +50.0°C (crit = +110.0°C, hyst = -273.1°C) (emerg = +115.0°C)
mem: +66.0°C (crit = +108.0°C, hyst = -273.1°C) (emerg = +113.0°C)
PPT: 51.00 W (cap = 244.00 W)

k10temp-pci-00c3
Adapter: PCI adapter
Tctl: +56.8°C
Tdie: +29.8°C
Tccd1: +56.5°C
I tried to hunt down this issue. While creating a working debug environment is a major pain in the ass, here is what I could get:

1. ksystemstats creates a KSysGuard SensorsFeatureSensor object for each CPU core.
2. During the update run over the single CPU cores, one update call is triggered for each core.
3. Here is my black box. I have no clue how to debug the call from ksystemstats to ksysguard, maybe someone can help?
4. After the call to KSysGuard::SensorsFeatureSensor::update(), a call to ::value() returns an invalid QVariant.

So either the construction of the feature sensor or KSysGuard itself is broken here.
I don't think it's KSysGuard in this case, because the KSysGuard program itself actually works for me. I can see the temps with KSysGuard no problem. If it was at fault I'd expect it to error there as well, but maybe they are not functioning the same?
Hi!

(In reply to Thomas Berger from comment #8)
> 3. Here is my black box. I have no clue how to debug the call from
> ksystemstats to ksysguard, maybe someone can help?

ksystemstats doesn't call ksysguard(d); everything it uses lives in the /plugins/ source directory. In your case, /plugins/cpu/linuxcpuplugin.cpp.
(In reply to Jiri Palecek from comment #10)
> Hi!
>
> ksystemstats doesn't call ksysguard(d), it is everything in the /plugins/
> source directory. In your case, /plugins/cpu/linuxcpuplugin.cpp.

`linuxcpu.cpp`, line 34:

> m_temperature = KSysGuard::makeSensorsFeatureSensor(QStringLiteral("temperature"), chipName, feature, this);
(In reply to Thomas Berger from comment #11)
> (In reply to Jiri Palecek from comment #10)
> > Hi!
> >
> > ksystemstats doesn't call ksysguard(d), it is everything in the /plugins/
> > source directory. In your case, /plugins/cpu/linuxcpuplugin.cpp.
>
> `linuxcpu.cpp`, line 34:
>
> > m_temperature = KSysGuard::makeSensorsFeatureSensor(QStringLiteral("temperature"), chipName, feature, this);

Yeah. And? I can't see your point.

Incidentally, does this line run when you try to debug ksystemstats (run with arguments "--replace --remain")? Can you verify that

> $ kstatsviewer --remain cpu/cpu0/temperature
> cpu/cpu0/temperature 54.75
> cpu/cpu0/temperature 55.375

works?
(In reply to Thomas Berger from comment #11)
> (In reply to Jiri Palecek from comment #10)
> > Hi!
> >
> > ksystemstats doesn't call ksysguard(d), it is everything in the /plugins/
> > source directory. In your case, /plugins/cpu/linuxcpuplugin.cpp.
>
> `linuxcpu.cpp`, line 34:
>
> > m_temperature = KSysGuard::makeSensorsFeatureSensor(QStringLiteral("temperature"), chipName, feature, this);

Maybe I see what you're getting at. This creates a SensorsFeatureSensor from libksysguard (that's a different codebase than ksysguard). You might want to debug this:
https://github.com/KDE/libksysguard/blob/079998ece198ee210fa16e5fd1f13f49473c94b6/systemstats/SensorsFeatureSensor.cpp#L145
Sorry for the confusion, of course I was talking about libksysguard. I don't know why, but somehow I dropped the "lib" prefix.

To be able to debug this, I need a debug build of both parts, but the build system makes this hard for me right now: I am unable to generate an appropriate *Targets.cmake file from the build directory of my libksysguard debug builds, as it is generated with the install paths only. And I could not find any good documentation for such debug environments.

I compared how the sensor object is created in both plugins (lmsensors and CPU), and at first glance I could not see a major difference.

My usual debug scenario also does not work here, because I cannot run this inside a VM: these are hardware sensors, which cannot be passed through into a VM (at least not to my knowledge).
(In reply to Thomas Berger from comment #14)
> Sorry for the confusion, of course i was talking about libksysguard. I don't
> know why, but somehow i dropped the "lib" prefix.
>
> To be able to debug this, i need a debug build of both parts, but the build
> system makes this hard for me right now, as i am unable to generate an
> appropriate *Targets.cmake file from the build directory of my libksysguard
> debug builds, as generated with the install paths only.

Oh. I never needed any of that. I can only suggest:

- What is your distribution? Maybe it already has installable debug symbol packages (e.g. Debian -> libksysguardsystemstats2-dbgsym).
- Do you really need to build ksystemstats with rpath? If not, you don't need to worry about any Targets.cmake files; just point to the debugging libraries with
  LD_LIBRARY_PATH=/path/to/libksysguard-src/.../bin ksystemstats ...
- Maybe you don't need to debug libksysguard at all. Can you be 100% sure that this point is true

> 4. After the call to KSysGuard::SensorsFeatureSensor::update() a call to ::value() returns an invalid QVariant.

specifically, that it calls update() on the correct sensor? You could just place a breakpoint in sensors_get_value from libsensors and check the return value. Or, if you have ltrace, you could run

  ltrace -e sensors_get_value@* ksystemstats ...

to see at least whether there are any errors.
I was able to create a debug prefix. I found a very strange behavior:

1. In `linuxcpu.cpp`, line 89, we call `m_temperature->update()`
2. m_temperature is a pointer to `KSysGuard::SensorProperty` created by an earlier call to KSysGuard::makeSensorsFeatureSensor

The assumption would be that this call ends up in the override `SensorsFeatureSensor::update()`, but instead it calls the base class implementation SensorProperty::update(), which is empty.

I installed all plugins in my prefix, and every other plugin using `makeSensorsFeatureSensor` gets the update call "served" by `SensorsFeatureSensor::update()`, and I can break in this function as well.

My initial assumption was that this is a linker issue, so I moved from gcc-14 to clang-18.1, but the effect stays the same. The lmsensors and the gpu plugin work fine; I can't find the difference in code that could cause this.

Here is the call from the cpu plugin:
```
* frame #0: 0x00007f8e0cfbf6fa libKSysGuardSystemStats.so.2`KSysGuard::SensorProperty::update(this=0x0000558f0e93dbe0) at SensorProperty.h:110:5
  frame #1: 0x00007f8e07af39f7 ksystemstats_plugin_cpu.so`LinuxCpuObject::update(this=0x0000558f0e8cc6a0, system=7800, user=47299, wait=460, idle=575242) at linuxcpu.cpp:90:20
```

And here is the call from the lmsensors plugin for another sensor:
```
* frame #0: 0x00007fbde133f9c0 libKSysGuardSystemStats.so.2`KSysGuard::SensorsFeatureSensor::update(this=0x0000559ed44ac480) at SensorsFeatureSensor.cpp:147:22
  frame #1: 0x00007fbddbe9ea9b ksystemstats_plugin_lmsensors.so`LmSensorsPlugin::update(this=0x0000559ed44aa570) at lmsensors.cpp:71:17
```
While hunting this down, I found another issue: https://bugs.kde.org/show_bug.cgi?id=490675

I could imagine that this is related. Overriding the same property value multiple times could trigger some other issue in the implementation of properties or something similar. I am not deep enough into Qt to understand the implications.
(In reply to Thomas Berger from comment #16)
> I was able to create a debug prefix. I found a very strange behavior:
>
> 1. In `linuxcpu.cpp, line 89, we call `m_temperature->update()`
> 2. m_temperature is a pointer to `KSysGuard::SensorProperty` created by an
> earlier call to KSysGuard::makeSensorsFeatureSensor

Yeah, but is it really? It could be overwritten here
https://github.com/KDE/ksystemstats/blob/b994c553f2e5d5d235f289c0112f1509b18e4e45/plugins/cpu/linuxcpu.cpp#L57
or here
https://github.com/KDE/ksystemstats/blob/b994c553f2e5d5d235f289c0112f1509b18e4e45/plugins/cpu/cpu.cpp#L77.
Although it totally shouldn't, and I couldn't find any recent change in the (scant) git history that could do anything with it. It could be some undefined behavior, but I couldn't find that either. Maybe it could be some linker snafu?

So to check it, if you can, please try this:

1) run ksystemstats under valgrind
2) run gdb (I see you are using lldb, but lldb is totally useless on Debian, so I'm using gdb)
3) enter these commands into gdb:

> target remote |vgdb
# to connect to the valgrind-ed program and debug it
> break LinuxCpuObject::update
> cont
# to set breakpoint and continue
> print m_temperature
# to print the address of the SensorProperty, eg "(KSysGuard::SensorsFeatureSensor *) 0x8a68290"
> monitor check_memory defined 0x8a68290
# to print where the sensor was allocated. use the same memory address as returned from the previous command
# this uses valgrind's bookkeeping info
# and last
> info vtbl m_temperature
# to check the dynamic type of m_temperature

and post the output from gdb.
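If attaching gdb is inconvenient, the dynamic-type check from the last step can also be approximated in code. The following is only a sketch in plain C++ with made-up stand-in types (`BaseProperty`, `ChipSensor` and `dynamicKind` are hypothetical names, not part of libksysguard); it shows the idea of asking, through a base pointer, which `update()` implementation a virtual call would reach.

```cpp
#include <cassert>
#include <string>

// Hypothetical stand-in for a base property whose update() is empty.
struct BaseProperty {
    virtual ~BaseProperty() = default;
    virtual void update() {}
};

// Hypothetical stand-in for a sensor that overrides update().
struct ChipSensor : BaseProperty {
    int reads = 0;
    void update() override { ++reads; } // pretend to read the chip
};

// Reports the dynamic type behind a base pointer, which decides
// which update() a virtual call will dispatch to.
std::string dynamicKind(const BaseProperty *p)
{
    if (!p) {
        return "null";
    }
    if (dynamic_cast<const ChipSensor *>(p)) {
        return "ChipSensor";
    }
    return "BaseProperty";
}
```

Logging such a value next to the `update()` call would give roughly the same information as `info vtbl`, at the cost of a rebuild.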
Yeah, that proved some of my assumptions from yesterday: the object is the one allocated in
```
void LinuxCpuObject::makeSensors()
{
    BaseCpuObject::makeSensors();
    m_frequency = new KSysGuard::SensorProperty(QStringLiteral("frequency"), this);
    if (!m_temperature) {
        m_temperature = new KSysGuard::SensorProperty(QStringLiteral("temperature"), this);
    }
}
```
And the vtable clearly shows that we are using the base class.

This led me down the correct path:
- We define a sensor via makeSensorsFeatureSensor for each CPU on the first k10temp chip found
- For the other chips found, we override the newly created sensors with null pointers

This happens because a property is added to our SensorObject (in this case the LinuxCpuObject) on sensor creation. `makeSensorsFeatureSensor` bails out if the sensor already exists on our SensorObject. And it does, because we just created it. If makeSensorsFeatureSensor bails out, a nullptr is returned.

After the call to addSensors, which leads down to the creation of our temperature sensors, `initialize` is called on all CPU objects, adding the "missing" sensors with default implementations.

While there are multiple ways to fix this, none of them seems like a good idea. We would have to map the temperatures to the cores on the appropriate die, or the user loses important information (imagine one die sitting near the upper limit because of a thermal/contact issue, while the first die is OK ...).

I would propose that we wait to see how the discussion in https://bugs.kde.org/show_bug.cgi?id=490675 plays out before taking action here.

Thx btw, I learned something new today!
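To make the sequence described above easier to follow, here is a minimal sketch in plain C++ without Qt. Every name in it (`Property`, `FeatureSensor`, `CpuObject`, `makeFeatureSensor`, `temperatureWorks`) is a hypothetical stand-in, and the ownership model is simplified; it only imitates the reported behavior, namely that the factory returns a null pointer once a property with that name exists, and that the fallback then installs a base-class property.

```cpp
#include <cassert>
#include <map>
#include <memory>
#include <string>

// Hypothetical stand-in for the base property: update() reads nothing.
struct Property {
    virtual ~Property() = default;
    virtual bool update() { return false; }
};

// Hypothetical stand-in for the chip-backed sensor.
struct FeatureSensor : Property {
    bool update() override { return true; } // would read the chip here
};

// Hypothetical stand-in for a sensor object owning properties by name.
struct CpuObject {
    std::map<std::string, std::unique_ptr<Property>> properties;
    Property *temperature = nullptr;
};

// Imitates the described bail-out: nullptr if the name is already taken.
Property *makeFeatureSensor(CpuObject &cpu, const std::string &name)
{
    if (cpu.properties.count(name)) {
        return nullptr;
    }
    auto sensor = std::make_unique<FeatureSensor>();
    Property *raw = sensor.get();
    cpu.properties[name] = std::move(sensor);
    return raw;
}

// Imitates the fallback: install a plain property if none is set.
void makeSensors(CpuObject &cpu)
{
    if (!cpu.temperature) {
        auto plain = std::make_unique<Property>();
        cpu.temperature = plain.get();
        cpu.properties["temperature"] = std::move(plain);
    }
}

// Returns true if update() reaches the chip-backed implementation
// after chipCount matching chips have been processed.
bool temperatureWorks(int chipCount, bool guarded)
{
    CpuObject cpu;
    for (int chip = 0; chip < chipCount; ++chip) {
        if (guarded && cpu.temperature) {
            continue; // keep the sensor from the first chip
        }
        cpu.temperature = makeFeatureSensor(cpu, "temperature");
    }
    makeSensors(cpu);
    return cpu.temperature->update();
}
```

With one chip the sketch ends up with the chip-backed sensor; with two chips and no guard, the second factory call nulls the pointer and the fallback installs the empty base implementation, which matches the all-zero readings on dual-die Threadrippers. The `guarded` flag skips later chips once a sensor is set, which keeps the first chip's sensor but still ignores the second die.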
(In reply to Thomas Berger from comment #19)
> This led me down the correct path:
> - We define a Sensor via makeSensorsFeatureSensor for each CPU on the first
> found k10temp chip
> - for the other found chips, we override the newly created sensors with null
> ptrs

Yeah! Good catch.

> While there are multiple ways to fix this, none of them seems like a good
> idea.

Well, for a start, guarding the call to makeSensorsFeatureSensor with an if (!m_temperature) seems warranted. Or else, we could get rid of m_temperature altogether (and make the code cleaner).

> We would have to map the temperatures to the cores on the appropriate
> DIE, or the user loses important information (imagine one DIE sitting near
> the upper limit because of a thermal/contact issue, but the first DIE is ok
> ....).

Yeah, that's exactly true. It also pertains to dual-CPU setups. Maybe it could suffice to map the sensors to the correct packages through differing NUMA nodes (but are they always different?) and then use the Tccd* sensors and die numbers from /sys/devices/system/cpu/*/topology. That would need experimentation with the actual hardware.

> I would propose that we wait how the discussion in
> https://bugs.kde.org/show_bug.cgi?id=490675 plays out, before taking actions
> here.

Yeah, but that's for somebody else to decide.

> Thx btw, now i learned something new today!

Good to hear that.
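As a starting point for that experimentation, the sketch below groups logical CPUs by package and die using the `physical_package_id` and `die_id` files under each `cpuN/topology` directory. It is an assumption-laden illustration, not ksystemstats code: `readInt` and `cpusByDie` are hypothetical helper names, `die_id` may be missing on older kernels (the sketch then reports -1), and whether die numbers line up with the k10temp Tccd* numbering is exactly the open question raised above.

```cpp
#include <algorithm>
#include <cassert>
#include <filesystem>
#include <fstream>
#include <map>
#include <string>
#include <utility>
#include <vector>

namespace fs = std::filesystem;

// Reads one integer from a sysfs-style file; fallback if unreadable.
int readInt(const fs::path &file, int fallback = -1)
{
    std::ifstream in(file);
    int value = fallback;
    if (!(in >> value)) {
        return fallback;
    }
    return value;
}

// Groups logical CPU numbers by (package id, die id).
// The root is a parameter so the function can be tried on a fake tree.
std::map<std::pair<int, int>, std::vector<int>>
cpusByDie(const fs::path &root = "/sys/devices/system/cpu")
{
    std::map<std::pair<int, int>, std::vector<int>> groups;
    if (!fs::is_directory(root)) {
        return groups;
    }
    for (const auto &entry : fs::directory_iterator(root)) {
        const std::string name = entry.path().filename().string();
        // Accept only "cpu<digits>", skipping entries such as "cpufreq".
        if (name.size() < 4 || name.compare(0, 3, "cpu") != 0) {
            continue;
        }
        if (name.find_first_not_of("0123456789", 3) != std::string::npos) {
            continue;
        }
        const fs::path topology = entry.path() / "topology";
        if (!fs::is_directory(topology)) {
            continue;
        }
        const int package = readInt(topology / "physical_package_id");
        const int die = readInt(topology / "die_id");
        groups[std::make_pair(package, die)].push_back(std::stoi(name.substr(3)));
    }
    for (auto &group : groups) {
        std::sort(group.second.begin(), group.second.end());
    }
    return groups;
}
```

Printing the result of `cpusByDie()` on an affected Threadripper next to the `sensors` output would show whether the die grouping can be matched to the two k10temp chips at all.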
I think the NUMA layout depends on how they have the CPU set up. I know on mine I have a few options, and if I set the memory to interleaved, it will only show as 1 NUMA node.
I'm having the same issue with thermal monitor panel widget and an Intel i7 7700k, is this the appropriate bug or is there another one I was unable to find? lm_sensors finds both coretemp-isa-0000 with 4 temperature readings as well as an nct6793 with a CPUTIN reading, but neither of these show up in thermal monitor panel widget. It does pick up my disks and GPU, and it used to find coretemp in older plasma versions (currently 5.27.11, not sure which is the last version that worked but it was less than a year ago) So my CPU temperature readout which was previously working fine now says "OFF" which is a bit strange given that my system is running just fine…