I'm doing some experiments with low-latency programming. I want to eliminate context switching and to be able to measure latency reliably without affecting performance too much.
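For context, the kind of isolation setup I'm experimenting with is roughly the sketch below; the SCHED_FIFO policy and the priority value are just placeholders I'm trying out (running it needs root or CAP_SYS_NICE), not something the numbers further down depend on.

#define _GNU_SOURCE
#include <sched.h>
#include <sys/mman.h>
#include <stdio.h>

/* Sketch: put the process under a real-time policy and lock its pages,
 * so that neither the regular scheduler nor page faults get in the way.
 * The priority 80 is an arbitrary placeholder. */
static void
setup_realtime(void)
{
    struct sched_param sp = { .sched_priority = 80 };
    if (sched_setscheduler(0, SCHED_FIFO, &sp))
        perror("sched_setscheduler");
    if (mlockall(MCL_CURRENT | MCL_FUTURE))
        perror("mlockall");
}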
To begin with, I wrote a program that requests the time in a loop 1M times and then prints statistics (code below), since I wanted to know how long the timer call itself takes. Surprisingly, the output is the following (in microseconds):
Mean: 0.59, Min: 0.49, Max: 25.77
Mean: 0.59, Min: 0.49, Max: 11.73
Mean: 0.59, Min: 0.42, Max: 14.11
Mean: 0.59, Min: 0.42, Max: 13.34
Mean: 0.59, Min: 0.49, Max: 11.45
Mean: 0.59, Min: 0.42, Max: 14.25
Mean: 0.59, Min: 0.49, Max: 11.80
Mean: 0.59, Min: 0.42, Max: 12.08
Mean: 0.59, Min: 0.49, Max: 21.02
Mean: 0.59, Min: 0.42, Max: 12.15
As you can see, although the average time is well under one microsecond, there are spikes of roughly 10-25 microseconds. That's despite the fact that the code runs on a dedicated core (its affinity is set to a specific core, while the affinity of the init process is set to a group of other cores) and that hyper-threading is disabled on the machine. I tried it with multiple kernel versions, including preemptible and RT ones, and the results are essentially the same.
Can you explain the huge difference between mean and max?
Is the problem in the timer calls themselves, or with process isolation?
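To help narrow that down on my side, I'm thinking of sampling the involuntary context-switch counter around each measurement loop, along the lines of the sketch below (RUSAGE_THREAD is Linux-specific); a non-zero delta across a loop would point at the scheduler rather than at clock_gettime itself.

#define _GNU_SOURCE
#include <sys/resource.h>
#include <stdio.h>

/* Sketch: return the number of involuntary context switches so far.
 * Calling it before and after a measurement loop and comparing the
 * two values shows whether the process was preempted in between. */
static long
nivcsw_now(void)
{
    struct rusage ru;
    if (getrusage(RUSAGE_THREAD, &ru))
    {
        perror("getrusage");
        return -1;
    }
    return ru.ru_nivcsw;
}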
I also tried this with calls to other clocks (by swapping the clock id in the helper; see the sketch right after this list):
- CLOCK_THREAD_CPUTIME_ID
- CLOCK_MONOTONIC
- CLOCK_PROCESS_CPUTIME_ID
The observed pattern was the same for all of them.
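Concretely, what I swap in is an equivalent of time_now() parameterized on the clock id, roughly like this:

#include <stdint.h>
#include <time.h>

/* Same helper as time_now() below, but taking the clock as an argument,
 * e.g. time_now_clk(CLOCK_MONOTONIC) or time_now_clk(CLOCK_THREAD_CPUTIME_ID). */
static uint64_t
time_now_clk(clockid_t clk)
{
    struct timespec ts;
    clock_gettime(clk, &ts);
    return (uint64_t)ts.tv_sec * 1000000000ULL + (uint64_t)ts.tv_nsec;
}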
#define _GNU_SOURCE /* needed for cpu_set_t / CPU_SET when compiling as C */
#include <time.h>
#include <sched.h>
#include <stdio.h>
#include <stdlib.h>
#include <stdint.h>

#define min(a, b) ((a) < (b) ? (a) : (b))
#define max(a, b) ((a) > (b) ? (a) : (b))

/* Read the current time and return it as nanoseconds since the epoch. */
uint64_t
time_now(void)
{
    struct timespec ts;
    clock_gettime(CLOCK_REALTIME, &ts);
    return (uint64_t)ts.tv_sec * 1000000000ULL + (uint64_t)ts.tv_nsec;
}

/* Pin the calling process to a single CPU. */
void
set_affinity(int cpu)
{
    cpu_set_t set;
    CPU_ZERO(&set);
    CPU_SET(cpu, &set);
    if (sched_setaffinity(0, sizeof(set), &set))
    {
        perror("sched_setaffinity");
    }
}

#define NUM 1000000
#define LOOPS 10

int
main(int argc, char **argv)
{
    set_affinity(3);
    for (int loop = 0; loop < LOOPS; ++loop)
    {
        uint64_t t_0 = time_now();
        uint64_t sum_val = 0;
        uint64_t max_val = 0;
        uint64_t min_val = UINT64_MAX;
        for (int k = 0; k < NUM; ++k)
        {
            /* Measure the interval between two consecutive timer calls. */
            uint64_t t_1 = time_now();
            uint64_t t_diff = t_1 - t_0;
            sum_val += t_diff;
            min_val = min(t_diff, min_val);
            max_val = max(t_diff, max_val);
            t_0 = t_1;
        }
        /* Print mean/min/max in microseconds. */
        printf("Mean: %.2f, Min: %.2f, Max: %.2f\n",
               ((double)sum_val) / NUM / 1000,
               ((double)min_val) / 1000,
               ((double)max_val) / 1000);
    }
    return 0;
}
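For reference, I build and run it as plain C with something like the following (the file name is arbitrary; on older glibc, clock_gettime may also need -lrt):

gcc -O2 -std=gnu11 -o timer_test timer_test.c
./timer_test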