Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add core temperatures for Rockchip RK3588 SoC #1411

Draft
wants to merge 1 commit into
base: main
Choose a base branch
from

Conversation

segabor
Copy link

@segabor segabor commented Mar 11, 2024

My attempt to map SoC core temperatures to virtual cores. Rather a hack than a well reasoned change. There must be a proper way to determine chipset so temperature mapping can be done if condition is met (a refactor will be needed).
Also, data[0] is not set. We have two candidates here, center_thermal and soc_thermal values.

soc_temp_values_under_load

SoC Thermal Drivers

sensors utility bundled with lm-sensors package detected the following thermal drivers

npu_thermal-virtual-0
Adapter: Virtual device
temp1:        +35.2 C  

center_thermal-virtual-0
Adapter: Virtual device
temp1:        +35.2 C  

bigcore1_thermal-virtual-0
Adapter: Virtual device
temp1:        +36.1 C  

soc_thermal-virtual-0
Adapter: Virtual device
temp1:        +36.1 C  (crit = +115.0 C)

nvme-pci-0100
Adapter: PCI adapter
Composite:    +41.9 C  (low  =  -0.1 C, high = +71.8 C)
                       (crit = +89.8 C)

gpu_thermal-virtual-0
Adapter: Virtual device
temp1:        +35.2 C  

littlecore_thermal-virtual-0
Adapter: Virtual device
temp1:        +36.1 C  

bigcore0_thermal-virtual-0
Adapter: Virtual device
temp1:        +36.1 C  

Future Additions

  • NPU temperature
  • GPU temperature

References

@segabor
Copy link
Author

segabor commented Mar 11, 2024

This PR addresses #1404

@BenBE BenBE added the Linux 🐧 Linux related issues label Mar 11, 2024
@BenBE BenBE linked an issue Mar 11, 2024 that may be closed by this pull request
@BenBE
Copy link
Member

BenBE commented Mar 11, 2024

I think, instead of hacks on top of other hacks, the overall libsensors-to-cpucore mapping should be addressed properly.

Comment on lines 216 to 240
/* Map temperature values to Rockchip cores
*
* littlecore -> cores 1..4
* bigcore0 -> cores 5,6
* bigcore1 -> cores 7,8
*/
if (existingCPUs == 8 && String_eq(chip->prefix, "littlecore_thermal")) {
data[1] = temp;
data[2] = temp;
data[3] = temp;
data[4] = temp;
coreTempCount+=4;
continue;
}
if (existingCPUs == 8 && String_eq(chip->prefix, "bigcore0_thermal")) {
data[5] = temp;
data[6] = temp;
coreTempCount+=2;
continue;
}
if (existingCPUs == 8 && String_eq(chip->prefix, "bigcore1_thermal")) {
data[7] = temp;
data[8] = temp;
coreTempCount+=2;
continue;
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A better solution here, instead of hard-coding CPU counts would be to determine the cores associated with each of the groups and assigning the values from there.

Also: There's some indentation issue …

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll fix the indentation. You're right about the code, it's just a POC. I like @Explorer09 's idea about syncing with the team behind libsensors.

@Explorer09
Copy link
Contributor

I think, instead of hacks on top of other hacks, the overall libsensors-to-cpucore mapping should be addressed properly.

I believe the libsensors-to-cpu-core mapping shouldn't be handled in htop's side. Perhaps we can ask this question in libsensors upstream to see if there is a portable way to describe the mappings.

My impression is that libsensors.conf is supposed to do the job. Or, maybe, ACPI thermal zone (even though I personally hate ACPI).
Once we have a "protocol" to read the mapping from, we would no longer need these chipset-specific hacks.

@BenBE
Copy link
Member

BenBE commented Mar 13, 2024

@Explorer09 Mind to open a discussion with the libsensors people and link it here for reference? It's not just this bug, but quite a few more, that basically boil down to how to perform this mapping. TIA.

@Wer-Wolf
Copy link

Maybe it would be better to create some sort of naming convention for the temperature channels used for the cpu cores.

I am strongly against adding driver-specific code to libsensors, it was done once and proved to be an maintainance nightmare.

With a proper naming convention in place, sensors.conf can be used to do the necessary adjustments.

@BenBE
Copy link
Member

BenBE commented Mar 14, 2024

We'd also strongly prefer not to deal with any of the driver-specific stuff in htop. And honestly, there are two places things could go: Either into sysfs alongside each sensor providing the core/thread information we need (can figure the rest from procfs -> cpuinfo). Or we get this information from the library we use for fetching the temperature information.

Both have their drawbacks, but the alternative would be, that each user of lmsensors would have to write their own code to figure this out instead, which is effectively worse …

@BenBE
Copy link
Member

BenBE commented Mar 14, 2024

@segabor Please squash your commits in this PR. Cf. the styleguide for details.

@BenBE
Copy link
Member

BenBE commented Mar 23, 2024

The latest rebase looks strange …

@segabor
Copy link
Author

segabor commented Mar 23, 2024

Sorry for the mess @BenBE , hope it's fixed now

Copy link
Member

@BenBE BenBE left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The mess seems fixed now …

linux/LibSensors.c Outdated Show resolved Hide resolved
Copy link
Member

@BenBE BenBE left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Minor code style stuff and a small suggestion.

linux/LibSensors.c Outdated Show resolved Hide resolved
linux/LibSensors.c Outdated Show resolved Hide resolved
linux/LibSensors.c Outdated Show resolved Hide resolved
* bigcore0 -> cores 5,6
* bigcore1 -> cores 7,8
*/
if (existingCPUs == 8 && String_eq(chip->prefix, "littlecore_thermal")) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If the first condition is checking for 8 CPUs anyway, why not pull this as an outer layer. That way this automatically groups the three blocks belonging to the Rockchip cores as a side effect.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Linux 🐧 Linux related issues
Projects
None yet
4 participants