Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add cpufreq support on Darwin for Apple Silicon machines #1272

Open
wants to merge 1 commit into
base: main
Choose a base branch
from

Conversation

bakaid
Copy link
Contributor

@bakaid bakaid commented Jul 26, 2023

This PR adds CPU frequency support on Darwin for Apple Silicon machines.

After doing research on this, it seems to me that the only way to get cpufreq data is to gather some information from the IODeviceTree using IORegistry and then get performance statistics using libIOReport. Unfortunately, the former's structure is not guaranteed to be stable, and the latter has a private API, so again, it can break at any update.
That being said, based on my research on powermetrics and the available device trees of Apple Silicon SoCs, the solution presented here seems to be the current best option available.

I tested this on vanilla M1 and M2, but not on Max/Pro (different core count) or Ultra (multi-die) CPUs. I am reasonably certain it should work on those as well, as we use cluster-type (E or P), not cluster-id for classifying cores, but someone with those CPUs testing this would be very helpful.

Resolves #280

@BenBE BenBE added enhancement Extension or improvement to existing feature MacOS 🍏 MacOS / Darwin related issues labels Jul 26, 2023
@BenBE
Copy link
Member

BenBE commented Jul 26, 2023

This PR adds CPU frequency support on Darwin for Apple Silicon machines.

Nice.

After doing research on this, it seems to me that the only way to get cpufreq data is to gather some information from the IODeviceTree using IORegistry and then get performance statistics using libIOReport.

I hope the source contains some documentation on the details of this.

Unfortunately, the former's structure is not guaranteed to be stable, and the latter has a private API, so again, it can break at any update.

Calling me surprised by this, in the context if this being apple, would be an overstatement …

That being said, based on my research on powermetrics and the available device trees of Apple Silicon SoCs, the solution presented here seems to be the current best option available.

I tested this on vanilla M1 and M2, but not on Max/Pro (different core count) or Ultra (multi-die) CPUs. I am reasonably certain it should work on those as well, as we use cluster-type (E or P), not cluster-id for classifying cores, but someone with those CPUs testing this would be very helpful.

I keep a firm distance from any fruit-based hardware, thus I cannot test this myself either.

Copy link
Member

@BenBE BenBE left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please squash your changes into the commits of this PR to keep the patch set clean.

Comment on lines +693 to +716
# ----------------------------------------------------------------------
# Checks for Darwin features and flags.
# ----------------------------------------------------------------------

if test "$my_htop_cpu" = aarch64; then
AC_ARG_ENABLE([libioreport],
[AS_HELP_STRING([--enable-libioreport],
[enable libIOReport support for getting cpu frequency on Apple Silicon machines @<:@default=yes@:>@])],
[],
[enable_libioreport=yes])
case "$enable_libioreport" in
no)
;;
yes)
AC_CHECK_LIB([IOReport], [IOReportCreateSubscription])
;;
*)
AC_MSG_ERROR([bad value '$enable_libioreport' for --enable_libioreport])
;;
esac
fi

# ----------------------------------------------------------------------

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should probably follow the pattern used for other libraries and include some auto-detection of library availability.

Comment on lines +808 to +809
AM_CONDITIONAL([HTOP_AARCH64], [test "$my_htop_cpu" = aarch64])

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure if this is handled consistently, as we currently don't handle different CPU architectures any different while on the some underlying OS/platform. There's no difference between Linux on ARM vs. Linux on i386 or x64.

Thus I'm not really in on adding this additional define. Thoughts @ @fasterit @natoscott @cgzones?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we just use the predefined macro __aarch64__ to detect the CPU architecture, and combine with HTOP_DARWIN defined by the configure script to get what we need?

Comment on lines +5 to +7
#include <stdlib.h>
#include <string.h>
#include <stdint.h>
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Heder order, second blank after headers. Cf. styleguide.

Comment on lines +6 to +26
#if (defined(HAVE_LIBIOREPORT) && defined(HTOP_AARCH64))
#define CPUFREQ_SUPPORT

#include <IOKit/IOKitLib.h>
#include <CoreFoundation/CoreFoundation.h>

/* Private API definitions from libIOReport*/
enum {
kIOReportIterOk = 0,
};
typedef struct IOReportSubscriptionRef *IOReportSubscriptionRef;
typedef CFDictionaryRef IOReportSampleRef;
typedef CFDictionaryRef IOReportChannelRef;
typedef int (^io_report_iterate_callback_t)(IOReportSampleRef ch);
extern void IOReportIterate(CFDictionaryRef samples, io_report_iterate_callback_t callback);
extern CFMutableDictionaryRef IOReportCopyChannelsInGroup(CFStringRef, CFStringRef, void*, void*);
extern IOReportSubscriptionRef IOReportCreateSubscription(void *a, CFMutableDictionaryRef desiredChannels, CFMutableDictionaryRef *subbedChannels, uint64_t channel_id, CFTypeRef b);
extern CFDictionaryRef IOReportCreateSamples(IOReportSubscriptionRef iorsub, CFMutableDictionaryRef subbedChannels, CFTypeRef a);
extern uint32_t IOReportStateGetCount(IOReportChannelRef ch);
extern uint64_t IOReportStateGetResidency(IOReportChannelRef ch, uint32_t index);
extern CFDictionaryRef IOReportCreateSamplesDelta(CFDictionaryRef prev, CFDictionaryRef current, CFTypeRef a);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is probably all best kept to the implementation file instead of in the header.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some of it is needed in the CpuFreqData struct, most of it I can move to the implementation.


/* Definitions */
typedef struct {
uint32_t num_frequencies;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should include stdint.h in the header

Comment on lines +172 to +174
for (size_t i = 0U; i < CPUFREQ_NUM_CLUSTER_TYPES; i++) {
free(data->cpu_frequencies_per_cluster_type[i].frequencies);
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Isn't this missing a follow-up call of free(data->cpu_frequencies_per_cluster_type);???

@@ -76,6 +76,9 @@ void Machine_scan(Machine* super) {
DarwinMachine_allocateCPULoadInfo(&host->curr_load);
DarwinMachine_getVMStats(&host->vm_stats);
openzfs_sysctl_updateArcStats(&host->zfs);
#ifdef CPUFREQ_SUPPORT
CpuFreq_update(&host->cpu_freq);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Shouldn't this check for this->cpu_freq_ok if it's stored from init earlier?


#include "Machine.h"

#if (defined(HAVE_LIBIOREPORT) && defined(HTOP_AARCH64))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
#if (defined(HAVE_LIBIOREPORT) && defined(HTOP_AARCH64))
#if defined(HAVE_LIBIOREPORT) && defined(HTOP_AARCH64)

We're not writing Lisp …

Comment on lines +25 to +26
CpuFreqData cpu_freq;
bool cpu_freq_ok;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

By refactoring CpuFreq_init to return CpuFreqData* on success and NULL on failure, the bool can be omitted.

Comment on lines +283 to +287
if (dhost->cpu_freq_ok) {
mtr->values[CPU_METER_FREQUENCY] = dhost->cpu_freq.frequencies[cpu - 1];
} else {
mtr->values[CPU_METER_FREQUENCY] = NAN;
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
if (dhost->cpu_freq_ok) {
mtr->values[CPU_METER_FREQUENCY] = dhost->cpu_freq.frequencies[cpu - 1];
} else {
mtr->values[CPU_METER_FREQUENCY] = NAN;
}
mtr->values[CPU_METER_FREQUENCY] = dhost->cpu_freq_ok ? dhost->cpu_freq.frequencies[cpu - 1] : NAN;

@joske
Copy link

joske commented Dec 21, 2023

@bakaid how did you figure this out? Nothing is to be found about this API at all. Did you reverse engineer powermetrics?

I'm interested in this myself as I wrote the macos (and open/freeBSD) code for btop. I'd like to use parts of this code, but as htop is GPL and btop is Apache licensed, this is problematic. Would you be open to dual license the cpu frequency bit?

@bakaid
Copy link
Contributor Author

bakaid commented Dec 21, 2023

@bakaid how did you figure this out? Nothing is to be found about this API at all. Did you reverse engineer powermetrics?

I'm interested in this myself as I wrote the macos (and open/freeBSD) code for btop. I'd like to use parts of this code, but as htop is GPL and btop is Apache licensed, this is problematic. Would you be open to dual license the cpu frequency bit?

@joske

The only reference available for this is the behaviour of powermetrics indeed.

After this PR has been submitted, I finally had the opportunity to test this on an M2 Pro machine, and realized that unfortunately, this solution is incomplete.

With Pro, Max and Ultra Apple Silicon CPUs, there is more than one P-cluster, and in the case of Ultra, more than one die.
In these cases, the CPU Core Performance States, which seem to work fine with normal M1 and M2, start to become nonsensical, even in powermetrics, especially when CPU load is high.

It seems to me as if there is a mode, where individual cores have their individual scaled frequencies, and a mode, where an entire cluster is driven with the same frequency, the latter especially prominent under larger loads.

This "cluster-wide same frequency mode" is reported in another channel, CPU Complex Performance States. When the load is high, and the cores of a cluster always in this cluster-wide state, the reported residency in CPU Core Performance States for per-core frequencies can be 0 samples, causing even powermetrics to display NaN for CPU <N> active frequency after a divison with 0, and only display a sensible value for E/P<N>-Cluster HW active frequency for the cluster-wide residency.
Even worse, under intermediate loads, there is a mixture of per-core and per-cpu complex states in a single measured interval, making the frequency output of powermetrics practically useless.

The solution, it seems to me, is to merge these two datasets, and synthesize an average frequency for each core, based on the weighted average of its own individual CPU Core Performance States samples AND the CPU Complex Performance States samples of the cluster it belongs to.

This is what I plan to do, but haven't had the time (and reliable access to Pro/Max/Ultra machines) to do it yet.

As for licensing, once this is finalized, I am happy to dual-license the code.

@joske
Copy link

joske commented Dec 21, 2023

@bakaid thanks for all the info! I only have an M1 and an M2 myself. I assumed this code would also work on x86, but apparently the IORegistry nodes /cpus/cpux don't exist there.

Great that you are open to dual license your work! 🎉 🙏

@bakaid
Copy link
Contributor Author

bakaid commented Dec 21, 2023

@joske of course, happy to do it.

Indeed, this solution is Apple Silicon-exclusive, powermetrics works entirely differently for x86.

Now that I see that there is interest in this, I'll try to make progress with the complex performance states.

@joske
Copy link

joske commented Dec 21, 2023

Btw, btop can only display a single frequency, so I was only showing the highest one.

@joske
Copy link

joske commented Jan 16, 2024

@bakaid any progress?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement Extension or improvement to existing feature MacOS 🍏 MacOS / Darwin related issues
Projects
None yet
Development

Successfully merging this pull request may close these issues.

CPU Frequency shows as "N/A" on macOS 10.14 Catalina 2015 MacBook Pro 4 core, 8 thread.
4 participants