1

When my program receives a TCP connection request, it calls spawn_thread() to add the total thread number, and spawns a child thread to handle TCP connection.When the child thread complete the task, it closes the TCP connection, and calls delete_thread() to decrease the total thread number.

I use the following DTrace script to trace the total thread number, and the output includes time stamp, thread ID, and the total thread number:

#!/usr/sbin/dtrace -qs

pid$1:asg:delete_thread:entry,
pid$1:asg:delete_thread:return,
pid$1:asg:spawn_thread:entry,
pid$1:asg:spawn_thread:return
{
    printf("%Y [%d][%s][%s]:thread_pool.total_threads=%d\n", walltimestamp, tid, probefunc, probename, *(int32_t*)copyin(0x00634548, 4));
}  

But after running a while, I find the following strange output logs:

2014 Mar  3 02:13:49 [18355][delete_thread][entry]:thread_pool.total_threads=99
2014 Mar  3 02:13:49 [18355][delete_thread][return]:thread_pool.total_threads=98
2014 Mar  3 02:13:49 [57][spawn_thread][entry]:thread_pool.total_threads=98
2014 Mar  3 02:14:04 [57][spawn_thread][entry]:thread_pool.total_threads=98
2014 Mar  3 02:14:04 [57][spawn_thread][return]:thread_pool.total_threads=99

The 3rd and the 4th lines both output "[57][spawn_thread][entry]", I can't understand it and think there should be a "[57][spawn_thread][return]" between them. Could anyone explain it, or maybe is it a DTrace bug?

Nan Xiao
  • 16,671
  • 18
  • 103
  • 164
  • 1
    Note that your output won't necessarily be printed in time-order if you're using a multi-CPU system. With that in mind, do you see a line for [spawn_thread][return] at (or very close to) 02:13:49 _before_ the first line of your excerpt? – Robert Harris Mar 03 '14 at 08:00
  • @rmh: After printing the CPU ID, I got the idea: In multiple-CPU system, the DTrace log will be cached in CPU buffer. So Although the code of a thread is executed in sequence, but the DTrace log may disorder because the code is executed in different CPUs. Is it right? – Nan Xiao Mar 03 '14 at 08:52
  • 1
    Yes, that's almost exactly right (only the _results_ of the script, e.g. `walltimestamp` and `tid`, are cached in the CPU buffer; the log itself is generated by `dtrace(1)` asynchronously). – Robert Harris Mar 03 '14 at 09:28

0 Answers0