I'm running some benchmarks and I'm wondering whether using a "tickless" (a.k.a. CONFIG_NO_HZ_FULL_ALL) Linux kernel would be useful or detrimental to benchmarking.
The benchmarks I am running will be repeated many times using a new process each time. I want to control as many sources of variation as possible.
I did some reading on the internet:
From these sources I have learned that:
In the default configuration (CONFIG_NO_HZ=y), only non-idle CPUs receive ticks. Therefore, under this mode my benchmarks would always receive ticks.
In "tickless" mode (CONFIG_NO_HZ_FULL_ALL), all CPUs but one (the boot processor) are in "adaptive-tick" mode. When a CPU is in adaptive-tick mode, ticks are only received if there is more than one runnable task in the run queue for that CPU. The idea is that if there is a sole process in the queue, a context switch cannot happen, so sending ticks is not necessary.
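To make the adaptive-tick setup concrete, here is a minimal sketch of how the set of adaptive-tick CPUs could be queried before a run. It assumes the kernel exposes the list at /sys/devices/system/cpu/nohz_full in the usual CPU-list format (e.g. "1-3,5"); the helper names parse_cpu_list and adaptive_tick_cpus are my own, not part of any library.

```python
from pathlib import Path

# Assumed sysfs location; only present on kernels built with CONFIG_NO_HZ_FULL.
NOHZ_FULL_PATH = "/sys/devices/system/cpu/nohz_full"


def parse_cpu_list(text):
    """Parse a kernel CPU list such as '1-3,5' into a set of CPU numbers."""
    cpus = set()
    for part in text.strip().split(","):
        part = part.strip()
        if not part:
            continue  # empty list or stray comma
        if "-" in part:
            first, last = part.split("-", 1)
            cpus.update(range(int(first), int(last) + 1))
        else:
            cpus.add(int(part))
    return cpus


def adaptive_tick_cpus(path=NOHZ_FULL_PATH):
    """Return the adaptive-tick CPUs, or an empty set if they cannot be read."""
    try:
        return parse_cpu_list(Path(path).read_text())
    except (OSError, ValueError):
        # File missing (no NO_HZ_FULL support) or content not a CPU list.
        return set()
```

An empty result would suggest that no CPU is in adaptive-tick mode, or that the kernel was not built with the option at all.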
On one hand, not having benchmarks receive ticks seems like a great idea, since we rule out the tick routine as a source of variation (we do not know how long the tick routines take). On the other hand, I think tickless mode could introduce variations in benchmark timings.
Consider my benchmarking scenario running on a tickless kernel. Suppose we repeat a benchmark twice.
Suppose the first run is lucky, and gets scheduled onto an adaptive-tick CPU which was previously idle. This benchmark will therefore not be interrupted by ticks.
When the benchmark is run a second time, perhaps it is not so lucky, and gets put on a CPU which already has some processes scheduled. This run will be interrupted by ticks at regular intervals in order to decide if one of the other processes should be switched in.
We know that ticks impose a performance hit (context switch plus the time taken to run the routine). Therefore the first benchmark run had an unfair advantage, and would appear to run faster.
Note also that a benchmark which initially has an adaptive-tick CPU to itself may find that mid-benchmark another process gets thrown on to the same CPU. In this case the benchmark is initially not receiving ticks, then later starts receiving them. This means benchmark performance can change over time.
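One way I could check whether a given run actually received ticks is to compare per-CPU timer interrupt counts before and after the benchmark. The sketch below assumes an x86 machine, where /proc/interrupts has a "LOC:" row (local timer interrupts) with one column per CPU. That counter also includes timer interrupts that are not scheduler ticks, so it is only a rough indicator. The function names are hypothetical.

```python
def parse_local_timer_counts(interrupts_text):
    """Return the per-CPU counts from the 'LOC:' row of /proc/interrupts text."""
    for line in interrupts_text.splitlines():
        fields = line.split()
        if fields and fields[0] == "LOC:":
            counts = []
            for field in fields[1:]:
                if not field.isdigit():
                    break  # reached the trailing description text
                counts.append(int(field))
            return counts
    return []  # no LOC row found (e.g. a non-x86 system)


def tick_delta(before_text, after_text):
    """Per-CPU increase in local timer interrupts between two snapshots."""
    before = parse_local_timer_counts(before_text)
    after = parse_local_timer_counts(after_text)
    return [a - b for a, b in zip(after, before)]
```

The two snapshots would be taken by reading /proc/interrupts immediately before and after a benchmark run. A near-zero delta on the CPU the benchmark ran on would suggest that the run was not ticked, while a large delta would suggest the opposite.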
So I think tickless mode (under my benchmarking scenario at least) introduces timing variations. Is my reasoning correct?
One solution would be to use an isolated adaptive-tick CPU for benchmarking (isolcpus + taskset). However, we have already ruled out isolated CPUs, since this introduces artificial slowdowns in our multi-threaded benchmarks.
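For reference, the pinning half of that approach could look roughly like the sketch below, which builds a taskset -c command line for a benchmark. The CPU numbers and the ./bench command are placeholders, and the CPUs are assumed to have been isolated separately via the isolcpus boot parameter.

```python
import subprocess


def taskset_command(cpus, benchmark_argv):
    """Build an argv list that runs the benchmark pinned to the given CPUs."""
    cpu_list = ",".join(str(cpu) for cpu in sorted(cpus))
    return ["taskset", "-c", cpu_list] + list(benchmark_argv)


def run_pinned(cpus, benchmark_argv):
    """Run the benchmark under taskset; raises if it exits with an error."""
    return subprocess.run(taskset_command(cpus, benchmark_argv), check=True)


# Example (hypothetical benchmark binary and CPU numbers):
#   run_pinned({2, 3}, ["./bench", "--iters", "10"])
```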
Thanks