I'm running some benchmarks and I'm wondering whether using a "tickless" (a.k.a. CONFIG_NO_HZ_FULL_ALL) Linux kernel would be useful or detrimental to benchmarking.

The benchmarks I am running will be repeated many times using a new process each time. I want to control as many sources of variation as possible.

I did some reading on the internet:

From these sources I have learned that:

  • In the default configuration (CONFIG_NO_HZ=y), only non-idle CPUs receive ticks. Therefore under this mode my benchmarks would always receive ticks.

  • In "tickless" mode (CONFIG_NO_HZ_FULL_ALL), all CPUs but one (the boot processor) are in "adaptive-tick" mode. A CPU in adaptive-tick mode receives ticks only if there is more than one job in its schedule queue. The idea is that with a sole process in the queue no context switch can happen, so sending ticks is unnecessary.
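
To see which CPUs are actually in adaptive-tick mode on a given machine, something like the following rough sketch might help. It assumes the kernel exposes the list as a CPU range string (for example "1-3,5") in /sys/devices/system/cpu/nohz_full. That path, and the helper names parse_cpu_list and adaptive_tick_cpus, are my own assumptions for illustration and do not come from the sources above.

```python
from pathlib import Path

# Assumed location of the kernel's list of adaptive-tick CPUs.
NOHZ_FULL_PATH = "/sys/devices/system/cpu/nohz_full"


def parse_cpu_list(text):
    """Parse a kernel CPU list such as "1-3,5" into a set of CPU numbers."""
    cpus = set()
    for part in text.strip().split(","):
        part = part.strip()
        if not part:
            continue  # empty list, or a stray comma
        if "-" in part:
            first, last = part.split("-", 1)
            cpus.update(range(int(first), int(last) + 1))
        else:
            cpus.add(int(part))
    return cpus


def adaptive_tick_cpus(path=NOHZ_FULL_PATH):
    """Return the adaptive-tick CPUs, or an empty set if the file is absent."""
    try:
        return parse_cpu_list(Path(path).read_text())
    except FileNotFoundError:
        return set()
```

Calling adaptive_tick_cpus() before a benchmark run would then show whether any adaptive-tick CPUs are configured at all. An empty set suggests that every CPU still receives regular ticks while busy.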

On one hand, not having benchmarks receive ticks seems like a great idea, since we rule out the tick routine as a source of variation (we do not know how long the tick routines take). On the other hand, I think tickless mode could introduce variations in benchmark timings.

Consider my benchmarking scenario running on a tickless kernel. Suppose we repeat a benchmark twice.

  • Suppose the first run is lucky, and gets scheduled onto an adaptive-tick CPU which was previously idle. This benchmark will therefore not be interrupted by ticks.

  • When the benchmark is run a second time, perhaps it is not so lucky, and gets put on a CPU which already has some processes scheduled. This run will be interrupted by ticks at regular intervals in order to decide if one of the other processes should be switched in.

We know that ticks impose a performance hit (context switch plus the time taken to run the routine). Therefore the first benchmark run had an unfair advantage, and would appear to run faster.

Note also that a benchmark which initially has an adaptive-tick CPU to itself may find that mid-benchmark another process gets thrown on to the same CPU. In this case the benchmark is initially not receiving ticks, then later starts receiving them. This means benchmark performance can change over time.

So I think tickless mode (under my benchmarking scenario at least) introduces timing variations. Is my reasoning correct?

One solution would be to use an isolated adaptive-tick CPU for benchmarking (isolcpus + taskset). However, we have already ruled out isolated CPUs, since they introduce artificial slowdowns in our multi-threaded benchmarks.

Thanks

Edd Barrett

1 Answer

For your "unlucky" scenario above, there has to be an active job scheduled on the same processor. This is not likely to be the case on an otherwise generally idle system, assuming that you have multiple processors. Even if this happens on one or two occasions, that means your benchmark might see the effect of one or two ticks - which hardly seems problematic.

On the other hand if it happens on many more occasions, this would be a general indication of high processor load - not an ideal scenario for running benchmarks anyway.

I would suggest, though, that "ticks" are not likely to be a significant source of variation in your benchmark timings. The scheduler is supposed to be O(1). I doubt you will see much difference in variation between tickless and non-tickless mode.
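
If you would rather measure this than take my word for it, one option is to count timer interrupts per CPU around a benchmark run. The following is a minimal sketch, assuming an x86 machine where /proc/interrupts has a "LOC:" row for local timer interrupts. The function names are made up for illustration. That counter includes every local timer interrupt, not only scheduler ticks, so treat the result as an upper bound.

```python
def local_timer_counts(interrupts_text):
    """Extract per-CPU local timer interrupt counts from /proc/interrupts text.

    Assumes the first line is the "CPU0 CPU1 ..." header and that the row
    of interest is labelled "LOC:" (as on x86).
    """
    lines = interrupts_text.splitlines()
    if not lines:
        return {}
    cpus = lines[0].split()  # e.g. ["CPU0", "CPU1"]
    for line in lines[1:]:
        fields = line.split()
        if fields and fields[0] == "LOC:":
            counts = fields[1:1 + len(cpus)]
            return {cpu: int(n) for cpu, n in zip(cpus, counts)}
    return {}  # no LOC row found (e.g. a different architecture)


def timer_interrupts_during(before, after):
    """Per-CPU difference between two snapshots from local_timer_counts()."""
    return {cpu: after[cpu] - before[cpu] for cpu in before if cpu in after}
```

The intended use is to read /proc/interrupts once before starting the benchmark process and once after it exits, then pass both snapshots through these functions. A CPU whose count barely moves was effectively not ticking during the run.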

davmac
  • Couple of questions about your answer: 1) If my benchmark shares a CPU with several processes that are not ready to run (blocked on IO for example) does my benchmark receive ticks? Looking at one of our benchmark machines, all CPUs have many processes on them. Most are daemons sitting around waiting. 2) As I understand, ticks happen "between 100 and 1000 times a second" (see LWN article linked above), so if my benchmark *is* receiving ticks, I think it would be more than a couple, no? – Edd Barrett Jan 15 '16 at 12:03
  • 1) I don't think so, otherwise it's not really tickless... (as you note, there are generally processes assigned to each processor - some of these processes could be sleeping, waiting for network I/O etc - the whole point of a tickless system is to not have ticks when you don't need them). 2) it only receives ticks when there are 2 or more active processes scheduled on the processor, though, right? If that happens more than a few times, your system is loaded. Which it shouldn't be, if you are benchmarking. – davmac Jan 15 '16 at 12:06
  • I'm not sure if blocked processes cause ticks. It's not inconceivable that a tick is used to see if any of those blocked processes became ready. Really not sure though. – Edd Barrett Jan 15 '16 at 12:10
  • Let me put it this way - the way you are proposing that it might work would mean you only get *any* benefit from tickless operation when a processor has only a single process (active or otherwise) in its schedule queue. For this to be the case the number of running processes would have to be less than twice the number of processors. It would be worthless. – davmac Jan 15 '16 at 12:16
  • I think I agree with your last comment. – Edd Barrett Jan 15 '16 at 12:17
  • Good :) As it says in the first link you posted in your question, the "tickless" mode is described fully as "OMIT SCHEDULING-CLOCK TICKS FOR CPUs WITH ONLY ONE RUNNABLE TASK" (the key word here is *runnable* - a process isn't runnable if it's idle waiting). I think that's good evidence. – davmac Jan 15 '16 at 12:51
  • Yes, you are right. So our benchmark should only receive ticks if the system is loaded (it is not), or if our benchmark gets put on the boot (non-adaptive-ticking) processor (hopefully it should not). – Edd Barrett Jan 15 '16 at 14:15