I have been reading the source code of gperftools (https://github.com/gperftools/gperftools/blob/f7c6fb6c8e99d6b1b725e5994373bcd19ffdf8fd/src/profile-handler.cc#:~:text=sevp.sigev_notify_thread_id%20%3D%20syscall(SYS_gettid)%3B), in particular this function:

static void StartLinuxThreadTimer(int timer_type, int signal_number,
                                  int32 frequency, pthread_key_t timer_key) {
  int rv;
  struct sigevent sevp;
  timer_t timerid;
  struct itimerspec its;
  memset(&sevp, 0, sizeof(sevp));
  sevp.sigev_notify = SIGEV_THREAD_ID;
  sevp.sigev_notify_thread_id = syscall(SYS_gettid);  // <-- the line in question
  sevp.sigev_signo = signal_number;
  clockid_t clock = CLOCK_THREAD_CPUTIME_ID;
  // other code
}

The code above suggests that SIGPROF will only be delivered to the current thread, i.e. the one that called the ProfilerStart function. So how does gperftools collect CPU profiles for other threads?

I have read the source code and searched online, but could not find an answer.

1 Answer

Note that this code is only used when per_thread_timer_enabled_ is set, and it is not set by default. When this mode is enabled, users are expected to call ProfileHandlerRegisterThread from each thread. There is a helper that "automagically" does that at https://github.com/alk/gperf-all-threads, but I am still unsure whether it can be depended on (there are lots of details around symbol interposition and the various linking modes).

In "stock" mode we simply use regular setitimer, which sends signals to the process as a whole. The kernel is then expected to pick a thread to deliver each signal to, and in practice, on all known OSes, it tends to pick the currently running thread. There is one issue though: when the process runs on many cores concurrently, the kernel's CPU accounting and itimer expiration code tends to heavily skew which thread gets chosen.

I see this skew in practice when multiple threads run for extended periods of time. Somehow, in Google's production (at least for the user-serving systems I dealt with), there was no skew. So it doesn't seem to be a big deal in practice, at least in some cases.

More details here: https://github.com/golang/go/issues/14434

In other words, the per_thread_timer_enabled_ mode (not enabled by default) was built specifically to try to deal with this problem, but there are difficulties in making it the stock behavior. (Any advice or other contributions are welcome.)
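For comparison, a rough sketch of the per-thread approach, modeled on the StartLinuxThreadTimer snippet from the question: each thread creates its own timer on CLOCK_THREAD_CPUTIME_ID and asks for the signal to be delivered to itself via SIGEV_THREAD_ID. Again, this is not the gperftools implementation; profile_this_thread, on_tick and thread_ticks are hypothetical names, it is Linux-only, and the fallback #define is an assumption for C libraries whose headers do not expose sigev_notify_thread_id.

```c
#define _GNU_SOURCE
#include <signal.h>
#include <string.h>
#include <sys/syscall.h>
#include <time.h>
#include <unistd.h>

/* Assumed fallback: some glibc versions only expose the raw union member. */
#ifndef sigev_notify_thread_id
#define sigev_notify_thread_id _sigev_un._tid
#endif

/* Hypothetical per-thread tick counter. */
static __thread volatile sig_atomic_t thread_ticks = 0;

static void on_tick(int signo) {
  (void)signo;
  thread_ticks++; /* runs on the thread that owns the timer */
}

/* Creates a CPU-time timer for the calling thread only, burns about
 * `cpu_ms` ms of that thread's CPU time, and returns the tick count
 * (or -1 on error). Every thread to be profiled must call this itself. */
int profile_this_thread(int hz, int cpu_ms) {
  struct sigaction sa;
  memset(&sa, 0, sizeof(sa));
  sa.sa_handler = on_tick;
  sa.sa_flags = SA_RESTART;
  sigemptyset(&sa.sa_mask);
  if (sigaction(SIGPROF, &sa, NULL) != 0) return -1;

  struct sigevent sevp;
  memset(&sevp, 0, sizeof(sevp));
  sevp.sigev_notify = SIGEV_THREAD_ID;
  sevp.sigev_notify_thread_id = (pid_t)syscall(SYS_gettid);
  sevp.sigev_signo = SIGPROF;

  timer_t timerid;
  if (timer_create(CLOCK_THREAD_CPUTIME_ID, &sevp, &timerid) != 0) return -1;

  long ns = 1000000000L / hz;
  struct itimerspec its;
  its.it_interval.tv_sec = ns / 1000000000L;
  its.it_interval.tv_nsec = ns % 1000000000L;
  its.it_value = its.it_interval;
  thread_ticks = 0;
  if (timer_settime(timerid, 0, &its, NULL) != 0) {
    timer_delete(timerid);
    return -1;
  }

  /* Spin until this thread has consumed cpu_ms of its own CPU time. */
  struct timespec start, now;
  clock_gettime(CLOCK_THREAD_CPUTIME_ID, &start);
  long elapsed_ms;
  do {
    clock_gettime(CLOCK_THREAD_CPUTIME_ID, &now);
    elapsed_ms = (now.tv_sec - start.tv_sec) * 1000L +
                 (now.tv_nsec - start.tv_nsec) / 1000000L;
  } while (elapsed_ms < cpu_ms);

  timer_delete(timerid);
  return (int)thread_ticks;
}
```

Because each timer is tied to one thread's CPU clock, threads get ticks in proportion to their own CPU use, but nothing happens for a thread that never registers. That is why the per-thread mode needs ProfileHandlerRegisterThread to be called from every thread. On older glibc versions this may need -lrt at link time.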