Pthread id from pthread_self() doesn't match data from dtrace script

Question

I'm using this dtrace script from here to try to find out when context switches occur for the threads of a Java program.

I'm trying to match the data gathered from the script with trace data gathered from the running program (things like method entry/exit). I get the pthread id of the running thread using a short JNI method that simply returns the value of pthread_self().

The problem I'm having is that the thread id I get from calling pthread_self() is completely different from any thread id I get in the dtrace script. I'm wondering if it's because I'm calling pthread_self() incorrectly, since it returns a pointer. However, it's been hard to find information about what pthread_t actually is on Mac OS X.

Obviously your answer tells why you couldn't match up `tid` and `pthread_self()`, but I believe my solution below handles the issue you were trying to figure out. Could you take a look? Thanks! — Dan, Jun 20 '13 at 07:19

score 3 · Answer 1 · answered Sep 15 '09 at 09:34

So I'll answer my own question: the curthread and tid variables in dtrace are the pointer values for the kernel thread structures. To compare dtrace data with user-space thread data, I had to create a kernel extension that exposes these internal values for threads to user space.

In general this is a bad idea, since it's non-portable, could easily break if the kernel changes, and is probably a security risk. Unfortunately I haven't found another way to achieve what I want.

+1 for finding a solution ... would be even better if you could post it :) — Nikolai Fetissov, Sep 15 '09 at 14:12

score 2 · Answer 2 · answered Sep 06 '09 at 22:58

From /usr/include/pthread.h:

typedef __darwin_pthread_t pthread_t;

then from /usr/include/sys/_types.h:

struct _opaque_pthread_t {
  long __sig;
  struct __darwin_pthread_handler_rec* __cleanup_stack;
  char __opaque[__PTHREAD_SIZE__];
};
typedef struct _opaque_pthread_t* __darwin_pthread_t;

Source code is your friend :)
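Since pthread_t is a plain pointer here, the value returned by pthread_self() can be logged as a pointer-sized integer and compared with hex addresses from other tools. A minimal sketch, assuming a platform where pthread_t is a pointer or integer type (POSIX treats it as opaque, so the cast is not portable); thread_handle_as_uint and format_thread_tag are hypothetical helper names:

```c
#include <pthread.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Return the calling thread's pthread_t as a pointer-sized integer.
 * Assumption: pthread_t is a pointer (Mac OS X) or an integer (Linux).
 * This would not compile on a platform where pthread_t is a struct. */
uintptr_t thread_handle_as_uint(void) {
    return (uintptr_t)pthread_self();
}

/* Hypothetical helper: write the handle as bare hex (no "0x" prefix)
 * into buf, so it can be used as a tag on trace records.
 * Returns the number of characters snprintf wanted to write. */
int format_thread_tag(char *buf, size_t len) {
    return snprintf(buf, len, "%jx", (uintmax_t)thread_handle_as_uint());
}
```

The value is only a user-space handle: it stays the same for the lifetime of a thread, but nothing guarantees that it matches an identifier the kernel uses for that thread.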

score 1 · Answer 3 · answered May 29 '13 at 23:01

How about something a bit more elegant using the pid provider, which deals with userland code?

# dtrace -n 'pid$target::pthread_self:return {printf("%p", arg1)}' -c 'java'
dtrace: description 'pid$target::pthread_self:return ' matched 1 probe
dtrace: pid 87631 has exited
CPU     ID                    FUNCTION:NAME
  0  90705              pthread_self:return 1053a7000
  0  90705              pthread_self:return 1054ad000
  2  90705              pthread_self:return 7fff7b479180
  2  90705              pthread_self:return 7fff7b479180
  2  90705              pthread_self:return 7fff7b479180
  2  90705              pthread_self:return 7fff7b479180
  2  90705              pthread_self:return 7fff7b479180
  4  90705              pthread_self:return 10542a000
  4  90705              pthread_self:return 10542a000

Huzzah!

arg1 refers to the return value in the probe, which in this case is a pointer. If you need the stuff it points to, use copyin(arg1, size_of_struct) and cast the result to whatever you think it is (see @Nikolai's post and don't forget you can use #include in DTrace scripts as long as you remember the -C option on the command line). The pid$target provider name expands to pid1234, where 1234 is the process id of the command executed with the -c option - in this case, java.

For more information, check out Brendan Gregg's blog (which is a great general source of dtrace info).

Note that you can interleave this probe with the in-kernel probes you were using before to track context switches - just use thread-local storage (`self->variable = ...`) to store the `pthread_self()` results and reference it later when the context switch is happening. — Dan, May 29 '13 at 23:04

score 0 · Answer 4 · answered Oct 02 '14 at 15:56

On Linux, the most reliable way I've found to identify process context switching is through the command:

pidstat -hluwrt  | grep "processname"

The 'tid' column (#3) is the same as 'gettid()', thus allowing the developer to directly correlate which thread is using CPU and context switching. I suggest that when a thread is spawned, the program print its gettid() value: printf("%d\n", (int)gettid()).
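For reference, a minimal sketch of obtaining that value in C. It assumes Linux, and it uses the raw system call because glibc versions before 2.30 do not provide a gettid() wrapper; current_tid and log_current_tid are hypothetical helper names:

```c
#define _GNU_SOURCE
#include <stdio.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <unistd.h>

/* Kernel thread ID of the calling thread (Linux only). */
pid_t current_tid(void) {
    return (pid_t)syscall(SYS_gettid);
}

/* Print the thread ID once, e.g. at the top of a thread's start routine,
 * so it can be matched against the tid column later. */
void log_current_tid(const char *label) {
    fprintf(stderr, "%s tid=%d\n", label, (int)current_tid());
}
```

Calling log_current_tid("worker") as the first statement of each thread's start routine gives one line per thread to match against the pidstat output.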

The last 2 columns, prior to the process command line, are the 'cswch/s' (voluntary) and 'nvcswch/s' (non-voluntary) context switch counts, per second.

When the 'cswch/s' is high (1000's), your process is cycling through 'wake' and 'sleep' excessively. You may want to consider some kind of buffer to supply the threads, allowing for longer awake and sleep times. For example: while the buffer is NOT full, the thread stays asleep. When the buffer becomes full, the thread stays awake until the buffer is empty.
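The buffering idea can be sketched roughly as follows. This is only an illustration under assumptions, not a tuned implementation: BATCH_CAPACITY, struct batch_queue and run_batched_demo are hypothetical names, and the consumer merely counts items where real code would process them.

```c
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

#define BATCH_CAPACITY 64   /* hypothetical batch size; tune per workload */

struct batch_queue {
    pthread_mutex_t lock;
    pthread_cond_t  ready;    /* buffer is full, or the queue was closed */
    pthread_cond_t  drained;  /* consumer has emptied the buffer */
    int    items[BATCH_CAPACITY];
    size_t count;
    bool   closed;
    long   consumed;          /* total items handled by the consumer */
    long   batches;           /* number of non-empty batches drained */
};

/* Consumer: sleeps until a whole batch is available, then drains it all. */
static void *consumer_main(void *arg) {
    struct batch_queue *q = arg;
    pthread_mutex_lock(&q->lock);
    for (;;) {
        while (q->count < BATCH_CAPACITY && !q->closed)
            pthread_cond_wait(&q->ready, &q->lock);
        if (q->count == 0 && q->closed)
            break;
        /* Real code would process q->items[0 .. count) here. */
        q->consumed += (long)q->count;
        q->count = 0;
        q->batches++;
        pthread_cond_signal(&q->drained);
    }
    pthread_mutex_unlock(&q->lock);
    return NULL;
}

/* Producer side: wake the consumer only once the buffer is full. */
static void batch_push(struct batch_queue *q, int value) {
    pthread_mutex_lock(&q->lock);
    while (q->count == BATCH_CAPACITY)
        pthread_cond_wait(&q->drained, &q->lock);
    q->items[q->count++] = value;
    if (q->count == BATCH_CAPACITY)
        pthread_cond_signal(&q->ready);
    pthread_mutex_unlock(&q->lock);
}

/* Push total_items values through the queue; returns the number consumed
 * (or -1 if the consumer thread could not be created). */
long run_batched_demo(int total_items, long *batches_out) {
    struct batch_queue q;
    pthread_t consumer;
    q.count = 0; q.closed = false; q.consumed = 0; q.batches = 0;
    pthread_mutex_init(&q.lock, NULL);
    pthread_cond_init(&q.ready, NULL);
    pthread_cond_init(&q.drained, NULL);
    if (pthread_create(&consumer, NULL, consumer_main, &q) != 0)
        return -1;
    for (int i = 0; i < total_items; i++)
        batch_push(&q, i);
    pthread_mutex_lock(&q.lock);
    q.closed = true;                  /* flush the final partial batch */
    pthread_cond_signal(&q.ready);
    pthread_mutex_unlock(&q.lock);
    pthread_join(consumer, NULL);
    pthread_cond_destroy(&q.drained);
    pthread_cond_destroy(&q.ready);
    pthread_mutex_destroy(&q.lock);
    if (batches_out) *batches_out = q.batches;
    return q.consumed;
}
```

With this shape the consumer blocks once per batch rather than once per item, which is the kind of change that should show up as a lower voluntary context switch rate. The trade-off is added latency for items waiting in a partially filled buffer.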

When the 'nvcswch/s' is high (1000's), this is a symptom that your system is heavily loaded and the individual thread is contending for CPU time. You may want to investigate the server load and the quantity of active processes and threads on the server: 'top' or 'htop' are your friends.

I find the following script useful for debugging/optimizing process threading (outputs every 20 seconds):

stdbuf -oL pidstat -hluwrt  20 | stdbuf -oL grep -e "processname" -e "^#"

Documentation for gettid: (doc here)
Documentation for pidstat: (doc here)
Documentation for stdbuf: (doc here)
