As you've observed, the system clock on Apple Silicon ticks once per 41.667 nanoseconds (a 125/3 ratio), compared to one tick per nanosecond on x86. For compatibility, Rosetta reports the old 1:1 value.
While investigating this mismatch to solve a different problem, I found this blog post, which describes the mismatch in detail.
The author has published a free utility, Mints, which lets you investigate the mismatch. I just tried it out on my M2 Max, both as a native app and under Rosetta, and got these outputs:
Running native on Apple Silicon:
Timebase numerator = 125
denominator = 3
factor = 41.666666666666664
Mach Absolute Time (raw) = 42216039415365
Mach Absolute Time (corr) = 1759001642306875
And a short time later on Rosetta:
Running as Intel code:
Timebase numerator = 1
denominator = 1
factor = 1.0
Mach Absolute Time (raw) = 1759003916431000
Mach Absolute Time (corr) = 1759003916431000
The TL;DR of this comparison is that both Mach Absolute Times start at 0, so they are always at a constant ratio to each other. To get the arm64 MAT from the x86/Rosetta MAT, you simply divide by 125/3 (i.e., multiply by 3/125), at least on M1 and M2.
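As a minimal sketch of that arithmetic (the function name is mine, and the ratio is hard-coded, so this is not future-proof; see the next paragraph):

#include <stdint.h>

// Convert a Rosetta-side mach_absolute_time() value (which counts nanoseconds)
// to the arm64 native tick count, hard-coding the 125/3 ratio seen on M1/M2.
static uint64_t rosettaToNativeMat(uint64_t rosettaMat) {
    return rosettaMat * 3 / 125;
}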
To future-proof your code in case Apple changes it again, you should determine the ratio programmatically. On arm64 you can retrieve it, as you've indicated, from the structure returned by mach_timebase_info().
Given that you can determine the ratio accurately on arm64, I'd recommend converting all your values to nanoseconds to match the x86 output. This is the simplest approach: get the mach_timebase_info() ratio once at startup, then multiply your mach_absolute_time() values by it.
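A minimal sketch of that approach (the caching variable and helper names are mine):

#include <mach/mach_time.h>
#include <stdint.h>

// Cache the timebase ratio once at startup.
static mach_timebase_info_data_t timebaseInfo;

static void initTimebase(void) {
    // numer/denom is 125/3 natively on M1/M2, and 1/1 under Rosetta or on x86
    mach_timebase_info(&timebaseInfo);
}

// Convert a raw mach_absolute_time() value to nanoseconds.
static uint64_t matToNanos(uint64_t mat) {
    return mat * timebaseInfo.numer / timebaseInfo.denom;
}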
Perhaps confirming this suggestion, the documentation for mach_absolute_time suggests a nanosecond approach:

Prefer to use the equivalent clock_gettime_nsec_np(CLOCK_UPTIME_RAW) in nanoseconds.
This is documented on the manpage:

CLOCK_UPTIME_RAW   clock that increments monotonically, in the same manner as CLOCK_MONOTONIC_RAW, but that does not increment while the system is asleep. The returned value is identical to the result of mach_absolute_time() after the appropriate mach_timebase conversion is applied.
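A usage sketch, which returns the same nanosecond value whether the process is native or running under Rosetta:

#include <stdint.h>
#include <stdio.h>
#include <time.h>

int main(void) {
    // Nanoseconds since boot, excluding time asleep; equivalent to
    // mach_absolute_time() after the timebase conversion.
    uint64_t uptimeNanos = clock_gettime_nsec_np(CLOCK_UPTIME_RAW);
    printf("%llu\n", (unsigned long long) uptimeNanos);
    return 0;
}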
One other thing to note: some data fields in macOS, notably the per-process user and kernel times proc_taskinfo->pti_total_user and proc_taskinfo->pti_total_system, use the "native" tick value. Under Rosetta, the 1:1 ratio reported by mach_timebase_info() doesn't help resolve this disparity.
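For reference, a sketch of reading those fields with proc_pidinfo (error handling kept minimal):

#include <libproc.h>
#include <stdio.h>
#include <sys/proc_info.h>
#include <unistd.h>

int main(void) {
    struct proc_taskinfo ti;
    int ret = proc_pidinfo(getpid(), PROC_PIDTASKINFO, 0, &ti, (int) sizeof(ti));
    if (ret == (int) sizeof(ti)) {
        // These times are expressed in "native" ticks, even when queried
        // from a process running under Rosetta.
        printf("user=%llu system=%llu\n",
               (unsigned long long) ti.pti_total_user,
               (unsigned long long) ti.pti_total_system);
    }
    return 0;
}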
But I've found another source for this ratio, one that appears robust to Rosetta: the IO Registry. In the device tree for each CPU there is a timebase-frequency value (along with many other clock-based frequencies that match) that works out to 1000000000 on x86 and 24000000 on arm64. Since the device tree is saved at boot time, fetching it, even under Rosetta, reveals the original values.
That ratio (1000/24) is exactly equal to 125/3, so if you choose not to convert to nanoseconds as above and you are on Rosetta, you should be able to take an arbitrary mach_absolute_time() and divide it by 1000000000/timebase-frequency to get your desired "native" absolute time.
If you're scripting, you could fetch the value from the command line. (The bytes are little endian.) On x86:
➜ ~ ioreg -c IOPlatformDevice | grep timebase
| | "timebase-frequency" = <00ca9a3b>
On arm64, even with Rosetta:
➜ ~ ioreg -c IOPlatformDevice | grep timebase
| | "timebase-frequency" = <00366e01>
Programmatically, I've implemented this in Java using JNA here.
Here is some (untested) C code that should fetch the values you need. Exception/failure handling and other languages are left as an exercise for the reader:
#include <CoreFoundation/CoreFoundation.h>
#include <IOKit/IOKitLib.h>
#include <string.h>

// Fetch the device tree timebase-frequency for cpu0 (etc.) from the IO Registry.
uint32_t fetchTimebaseFrequency(void) {
    uint32_t timebase = 0;
    kern_return_t status;
    CFDictionaryRef matching = NULL;
    CFDataRef timebaseRef = NULL;
    io_iterator_t iter = 0;
    io_registry_entry_t entry = 0;
    io_name_t name;

    matching = IOServiceMatching("IOPlatformDevice");
    // if (matching == NULL) { handle exception }
    // this call releases matching so we don't have to;
    // kIOMainPortDefault requires macOS 12+ (use kIOMasterPortDefault on older systems)
    status = IOServiceGetMatchingServices(kIOMainPortDefault, matching, &iter);
    // if (status != KERN_SUCCESS) { handle failure }
    while ((entry = IOIteratorNext(iter)) != 0) {
        status = IORegistryEntryGetName(entry, name);
        if (status != KERN_SUCCESS) {
            IOObjectRelease(entry);
            continue;
        }
        // don't match "cpu" but match "cpu0" etc.
        if (strlen(name) > 3 && strncmp(name, "cpu", 3) == 0) {
            break;
        }
        IOObjectRelease(entry);
    }
    // if (entry == 0) { handle "didn't find cpu" }
    timebaseRef = (CFDataRef) IORegistryEntryCreateCFProperty(
        entry, CFSTR("timebase-frequency"), kCFAllocatorDefault, 0);
    // validate: timebaseRef should be CFData of at least 4 bytes
    // if (timebaseRef == NULL || CFDataGetLength(timebaseRef) < 4) { handle failure }
    CFDataGetBytes(timebaseRef, CFRangeMake(0, 4), (UInt8 *) &timebase);
    // timebase should now be 1000000000 on x86, 24000000 on arm64
    CFRelease(timebaseRef);
    IOObjectRelease(iter);
    IOObjectRelease(entry);
    return timebase;
}
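Once the timebase has been fetched as above, a Rosetta process can apply the correction itself. A minimal sketch, using the fetchTimebaseFrequency() helper from the snippet above (the other names and the 128-bit intermediate are mine):

#include <mach/mach_time.h>
#include <stdint.h>

// Convert this (Rosetta) process's mach_absolute_time(), which counts
// nanoseconds, into the native arm64 tick count by scaling with
// timebase-frequency / 1e9 (24000000 / 1000000000 = 3/125).
// The 128-bit intermediate avoids overflow for large values.
static uint64_t nativeMachAbsoluteTime(void) {
    uint32_t timebase = fetchTimebaseFrequency();   // from the snippet above
    uint64_t rosettaMat = mach_absolute_time();
    return (uint64_t) (((__uint128_t) rosettaMat * timebase) / 1000000000u);
}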
Summary:
- The easiest thing to do, if you control both the Rosetta and native processes, is to always convert everything to nanoseconds: don't use mach_absolute_time(); always use clock_gettime_nsec_np(CLOCK_UPTIME_RAW), which is consistent whether or not Rosetta is in use.
- If you are on Rosetta, do not have control of the arm64 native processes, and must find the correction factor yourself, fetch the timebase-frequency from the IORegistry and apply the correction (multiply by timebase-frequency / 1 billion, i.e. 24 million / 1000 million = 24/1000 = 3/125).