In analyzing a program with VTune, I see that the libc function `sched_yield` is flagged as a significant hotspot.

Now I see that this function is roughly responsible for context switching; I say roughly because this is the first time I've encountered it, and my understanding is that it runs inside the OS scheduler to provide support for changing the active thread(s).

What does having `sched_yield` as a major hotspot mean for my program? Does it mean that I create more threads than I should, and the OS is constantly juggling context switches?

What would be a remedy for this situation? Should I resort to more centralized thread pools to avoid over-spawning threads?

What should I analyze next? Are there "typical" next steps in this situation? VTune already suggests running a "threading" analysis.

Lorah Attkins
  • The behavior of `sched_yield` largely depends on the OS. What OS are you running? – Andrey Semashev Jul 29 '21 at 12:08
  • The expected behavior of sched_yield is to let something else run. From the point of view of a profiler the time between calling the function and returning from it would be counted as spent inside it. So maybe it's supposed to look like a hotspot? –  Jul 29 '21 at 12:30
  • Empirical rule of thumb: code using `sched_yield` is usually wrong, very wrong or deep magic™. Take a very long pole and don't touch it. –  Jul 29 '21 at 12:31
  • @AndreySemashev Debian 9 – Lorah Attkins Jul 29 '21 at 12:59
  • @dratenik `sched_yield` is not used directly; the codebase never calls it. The internal implementation of `std::thread` ends up "abusing" it, so I'm trying to figure out what it actually means – Lorah Attkins Jul 29 '21 at 13:03
  • Is there a call graph that would indicate what feature of `std::thread` is behind it? Libc seems to only call yield on thread death. –  Jul 29 '21 at 14:41
  • @dratenik That's a very good observation, it would be my assumption as well. Instead of working with "long lived" thread pools, many threads are created to run a single task and then die. So instead of re-using resources, new ones are created and discarded – Lorah Attkins Jul 29 '21 at 15:34

1 Answer


What does having sched_yield as a major hotspot mean for my program?

On Linux, sched_yield does not necessarily switch to another thread. The kernel does not deschedule the calling thread if there are no other threads ready to run on the same CPU. The last part is important: the kernel will not migrate a ready-to-run thread between CPUs in response to this call. This is a design tradeoff, as sched_yield is supposed to be a low-cost hint to the kernel.

Since sched_yield may just return immediately without doing anything, your code may effectively busy-loop around this call, which shows up as a hotspot in your profile: the code loops around sched_yield a lot without doing much else. Such spinning burns CPU cycles that could be spent on other threads and applications running on the system.
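
For illustration, a spin-wait of this general shape tends to produce exactly this profile. This is a made-up sketch (the `wait_for_flag` function and the flag are invented for the example, not taken from your code):

```cpp
#include <atomic>
#include <sched.h>

void wait_for_flag(const std::atomic<bool>& ready)
{
    // Spin until the flag is set, yielding between checks. If no other
    // thread is runnable on this CPU, sched_yield() returns immediately,
    // so this loop burns CPU and the profiler attributes the time to
    // sched_yield.
    while (!ready.load(std::memory_order_acquire))
        sched_yield();
}
```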

What would be a remedy for this situation?

This depends a lot on your use case. Using sched_yield may be acceptable when you are willing to waste some CPU cycles in exchange for better latency. You have to be conscious about this decision, and even then I would recommend benchmarking against a different solution with proper thread blocking. The Linux thread scheduler is quite efficient, so blocking and waking threads is not as expensive as on some other systems.

Often sched_yield is used in custom spin lock algorithms. I would recommend replacing these with pthread primitives, in particular pthread_cond_t, which allows you to properly block and wake up threads. If you're using C++, there are equivalents in the standard library (e.g. std::condition_variable). In other cases it may be worth exploring other blocking APIs, such as select and epoll. The exact solution depends on your use case.
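
As a rough sketch of the blocking approach (the `Event` type and its members are invented for the example, not part of any library), a std::condition_variable based wait might look like this:

```cpp
#include <condition_variable>
#include <mutex>

// A minimal "event": the waiter sleeps inside wait() instead of spinning
// on sched_yield, and is woken by signal().
struct Event
{
    std::mutex m;
    std::condition_variable cv;
    bool ready = false;

    void wait()
    {
        std::unique_lock<std::mutex> lock(m);
        cv.wait(lock, [this] { return ready; });  // no CPU burned while waiting
    }

    void signal()
    {
        {
            std::lock_guard<std::mutex> lock(m);
            ready = true;
        }
        cv.notify_one();  // wake the waiting thread
    }
};
```

The waiting thread is descheduled until it is notified, so the time spent waiting no longer shows up as CPU time in the profile.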

Andrey Semashev