
x86 CPUs have had invariant TSCs for a long time, i.e. the timestamp counter advances at a constant frequency, usually the CPU's base clock.

If Windows detects an invariant TSC, it bases QueryPerformanceCounter() on that TSC. Unfortunately, QueryPerformanceFrequency() always returns the same constant and doesn't represent the TSC's frequency. Visual C++'s runtime bases its high_resolution_clock on QueryPerformanceCounter() / QueryPerformanceFrequency().
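As a minimal sketch of how such a counter/frequency pair is normally turned into a time value: the counter is divided by the frequency. The helper names `counterToSeconds` and `qpcSeconds` are made up for this illustration, and the Windows part is only compiled on Windows.

```cpp
#include <cstdint>
#if defined(_WIN32)
    #include <Windows.h>
#endif

// Convert a raw counter value into seconds, given the counter's frequency
// in ticks per second. Whole seconds and the remainder are converted
// separately so that large counter values lose less precision.
// (Hypothetical helper, not part of any library.)
constexpr double counterToSeconds( std::int64_t ticks, std::int64_t ticksPerSecond )
{
    return static_cast<double>( ticks / ticksPerSecond )
         + static_cast<double>( ticks % ticksPerSecond ) / static_cast<double>( ticksPerSecond );
}

#if defined(_WIN32)
// Read the performance counter and scale it by the reported frequency.
inline double qpcSeconds()
{
    LARGE_INTEGER counter, frequency;
    QueryPerformanceCounter( &counter );
    QueryPerformanceFrequency( &frequency );
    return counterToSeconds( counter.QuadPart, frequency.QuadPart );
}
#endif
```

For example, with an assumed, purely illustrative frequency of 10,000,000 ticks per second, `counterToSeconds( 25000000, 10000000 )` yields 2.5 seconds. Whatever the reported frequency is, the result is only as stable as the hardware counter behind it, which is what the question is about.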

So is the frequency of the timestamp counter really a reliable source that doesn't vary at all? I'm aware that the crystal's clock doesn't exactly match the CPU's nominal base clock, but I'm curious whether the clock might vary slightly or even drift with temperature.

Bonita Montero
  • It has a temperature drift of ca. 1-2s per day. Invariant does not mean it is not drifting. If you compare the current time vs the RDTSC then these two will drift apart over a day. – Alois Kraus Mar 21 '22 at 14:42
  • Linux, for example, only uses RDTSC as an interpolation factor between ticks of the system clock, based on the timer interrupt. (Prob. its tick interval is ultimately derived from the same crystal, but the system clock can get corrected by NTP correction factors or whatever in the interrupt handler.) The coarse clock without interpolation is [`clock_gettime(CLOCK_REALTIME_COARSE)`](https://man7.org/linux/man-pages/man2/clock_gettime.2.html), just reading the global updated by the timer interrupt. IDK if that design is motivated by different long-term accuracy or just ease of correction. – Peter Cordes Mar 21 '22 at 17:02

1 Answer


So I wrote a little C++ program that measures whether the RDTSC frequency drifts:

#if defined(_WIN32)
    #define NOMINMAX
    #include <Windows.h>
#elif defined(__unix__)
    #include <time.h>
    #include <pthread.h>
#endif
#include <iostream>
#include <cstdint>
#include <vector>
#include <cmath>
#include <string>
#include <charconv>
#include <thread>
#include <chrono>   // 1s literal, try_acquire_for
#include <cstring>
#include <csignal>
#include <cstdlib>  // EXIT_FAILURE
#include <semaphore>
#include <limits>
#if defined(_MSC_VER)
    #include <intrin.h>
#elif defined(__GNUC__)
    #include <x86intrin.h>
#endif

using namespace std;

int main()
{
    // starts at 0; released once by the SIGINT handler to stop the loop
    static binary_semaphore semStop( 0 );
    signal( SIGINT, []( int ) { semStop.release( 1 ); } );
#if defined(_WIN32)
    if( !SetThreadAffinityMask( GetCurrentThread(), 1 ) )
        return EXIT_FAILURE;
#elif defined(__unix__)
    cpu_set_t cpuSet;
    CPU_ZERO(&cpuSet);
    CPU_SET(0, &cpuSet);
    if( pthread_setaffinity_np( pthread_self(), sizeof cpuSet, &cpuSet ) )
        return EXIT_FAILURE;
#endif
    auto getSecs = []() -> double
    {
#if defined(_WIN32)
        FILETIME ft;
        GetSystemTimeAsFileTime( &ft );
        return (int64_t)((uint64_t)ft.dwHighDateTime << 32 | ft.dwLowDateTime) / 1.0e7;
#elif defined(__unix__)
        timespec t;
        clock_gettime( CLOCK_REALTIME, &t );
        return (int64_t)t.tv_sec + (int64_t)t.tv_nsec / 1.0e9;
#endif
    };
    vector<uint64_t> tscLog;
    int64_t drift = 0;
    uint64_t sumDrift = 0;
    for( ; ; )
    {
        // yield first so that we start with a fresh timeslice; we give it
        // up again soon, so we are unlikely to be preempted while measuring
        this_thread::yield();
        auto waitNext = [&]() -> double
        {
            double lastTime = getSecs(), nextTime;
            while( (nextTime = getSecs()) == lastTime );
            return nextTime;
        };
        double begin = waitNext();
        uint64_t tsc = __rdtsc();
        if( semStop.try_acquire_for( 1s ) )
            break;
        double end = waitNext();
        tsc = __rdtsc() - tsc;
        double secs = end - begin;
        uint64_t tscTicksPerSec = (int64_t)((double)(int64_t)tsc / secs);
        size_t logSizeBefore = tscLog.size();
        if( logSizeBefore )
        {
            int64_t lastDrift = tscTicksPerSec - tscLog.back();
            drift += lastDrift;
            sumDrift += abs( lastDrift );
        }
        tscLog.emplace_back( tscTicksPerSec );
        cout << tscLog.size() << "s: " << tscTicksPerSec;
        if( logSizeBefore )
            cout << ", d: " << drift << ", ad: " << (int64_t)((int64_t)sumDrift / (double)(ptrdiff_t)logSizeBefore)  <<  endl;
        else
            cout << endl;
    }
    auto getAvg = [&]() -> double
    {
        double avg = 0.0;
        for( int64_t logEntry : tscLog )
            avg += (double)logEntry;
        return avg / (ptrdiff_t)tscLog.size();
    };
    auto getStdDev = [&]( double avg )
    {
        auto sqr = []( double d ) { return d * d; };
        double sum = 0.0;
        for( int64_t logEntry : tscLog )
            sum += sqr( (double)logEntry - avg );
        return sqrt( sum / (ptrdiff_t)tscLog.size() );
    };
    auto getDrift = [&]() -> int64_t
    {
        if( tscLog.size() <= 1 )
            return numeric_limits<int64_t>::min();
        int64_t drift = 0;
        for( ptrdiff_t i = 0; i < (ptrdiff_t)tscLog.size() - 1; ++i )
            drift += (int64_t)tscLog[i + 1] - (int64_t)tscLog[i];
        return drift;
    };
    auto fmtDouble = []( double d ) -> string
    {
        char str[32];
        to_chars_result tcr = to_chars( str, str + sizeof str, d, chars_format::fixed );
        if( tcr.ec != errc() )
            return string();
        return string( str, tcr.ptr );
    };
    cout << tscLog.size() << "s" <<  endl;
    double avg = getAvg();
    cout << "avg: " << fmtDouble( avg ) << endl;
    double dev = getStdDev( avg );
    cout << "dev: " << fmtDouble( dev ) << " / " << fmtDouble( 100.0 * dev / avg ) << "%" << endl;
}

The program runs until you press Ctrl+C. Every second it prints the number of timestamp-counter ticks per second, the summed-up drift ("d: ") and the average drift so far (the sum of the absolute drift differences divided by the number of samples, "ad: "). At the end it prints the average clock frequency and its standard deviation. The measurements aren't precise under Windows but are very precise under Linux. On my Linux PC, a Ryzen 7 1800X on an ASRock AB350 Pro4, the cumulative drift reported by the program is almost zero clock cycles after 40 minutes. It varies slowly and symmetrically around zero, from slightly negative to slightly positive (at most 4,000 clock cycles). There is certainly no clock drift of 1-2 s per day, as @Alois Kraus mentioned.
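To put the two figures on the same scale, here is a small unit-conversion sketch. The function names are made up for this illustration, and the 3.6 GHz used in the example values below is an assumed nominal TSC frequency for the Ryzen 7 1800X, not a value measured by the program.

```cpp
// Number of seconds in one day.
constexpr double kSecondsPerDay = 86400.0;

// Convert a clock drift given in seconds per day into parts per million.
constexpr double driftPpm( double secondsPerDay )
{
    return secondsPerDay / kSecondsPerDay * 1.0e6;
}

// Convert a relative drift in ppm into a deviation in TSC ticks per second,
// for a TSC running at tscHz ticks per second (assumed nominal value).
constexpr double ppmToTicksPerSec( double ppm, double tscHz )
{
    return ppm * 1.0e-6 * tscHz;
}

// Inverse direction: a sustained deviation in ticks per second expressed
// as accumulated seconds per day.
constexpr double ticksPerSecToSecondsPerDay( double ticksPerSec, double tscHz )
{
    return ticksPerSec / tscHz * kSecondsPerDay;
}
```

Under that assumption, a drift of 1 s per day is about 11.6 ppm, or roughly 41,700 ticks per second (`ppmToTicksPerSec( driftPpm( 1.0 ), 3.6e9 )`), while a sustained deviation of 4,000 ticks per second would add up to about 0.096 s per day (`ticksPerSecToSecondsPerDay( 4000.0, 3.6e9 )`).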

Bonita Montero
  • How do you compile this program? I am getting the following error on Ubuntu 20.04. Command: `g++ -static-libstdc++ -Ofast -Wall -march=skylake -mtune=skylake test1.cc -o test1`. Error: `test1.cc:15:10: fatal error: semaphore: No such file or directory` – Rishabh Singhal Apr 10 '22 at 06:42
  • @RishabhSinghal: You must have a C++20-enabled compiler and runtime and compile it with -std=c++20. – Bonita Montero Apr 11 '22 at 08:01