0

I would like to have a record of how much cpu time each function is using. The caveat here is that I only want to record the time used by the function itself, not the cpu time used by functions that are called by the function and which are also traced by my script. e.g. if both function foo and bar are traced by my script and the total cpu time for function foo is 2000, but function foo calls function bar three times, which costs 500 cpu time each, then I would like to see the following result:

function  cputime   call count
foo           500            1
bar          1500            3

Right now I have the following dtrace script to get the total cpu time per function, but I don't have any leads yet on how to change it such that I get the cpu time results as described above. (Note call count and output formatting is not yet in the script, but those are easy to add once I have the cpu time info that I am after.)

#!/usr/sbin/dtrace -s

pid$1:$2::entry
{
    ++self->call_depth[probefunc];

    self->start[probefunc, self->call_depth[probefunc]] = timestamp;
    self->vstart[probefunc, self->call_depth[probefunc]] = vtimestamp;
}

pid$1:$2::return
/self->start[probefunc, self->call_depth[probefunc]]/
{
    @function_walltime[probefunc] = sum(timestamp - self->start[probefunc, self->call_depth[probefunc]]);
    self->start[probefunc, self->call_depth[probefunc]] = 0;
    @function_cputime[probefunc] = sum(vtimestamp - self->vstart[probefunc, self->call_depth[probefunc]]);
    self->vstart[probefunc, self->call_depth[probefunc]] = 0;

    --self->call_depth[probefunc];
}
wich
  • 16,709
  • 6
  • 47
  • 72

2 Answers2

1

Hope the following script can help:

#!/usr/sbin/dtrace -qs

pid$1:$2::entry
{
    self->vstart[probefunc] = vtimestamp;
}

pid$1:$2::return
{
    this->cputime = vtimestamp - self->vstart[probefunc];
    /* Sub the caller function CPU time */
    @function_cputime[ufunc(ucaller)] = sum(-(this->cputime));
    /* Add the callee function (current function) CPU time */
    @function_cputime[ufunc(uregs[R_PC])] = sum(this->cputime); 
    /* Add the callee function (current function) count */
    @function_count[ufunc(uregs[R_PC])] = sum(1);
}
Nan Xiao
  • 16,671
  • 18
  • 103
  • 164
  • Thanks, this wasn't completely what I was looking for as it fails when I have a call stack A->B->C but am only tracing A and C. However I didn't know about ufunc and uregs[R_PC] which helped me greatly. – wich Apr 14 '14 at 08:44
  • @wich: Why it fails when having a call stack A->B->C? – Nan Xiao Apr 14 '14 at 08:48
  • Because if A, B, C take 2000 1000 500 inclusive the exclusive reported counts will become 2000 -500 and 500 respectively when only A and C are traced, – wich Apr 14 '14 at 08:50
0

This is the program I ended up using:

#!/usr/sbin/dtrace -s
#pragma option quiet

pid$1:$2::entry
/self->start[probefunc] == 0/
{
    this->call_depth = self->call_depth++;
    self->func_pcs[this->call_depth] = uregs[R_PC];

    self->start[probefunc] = timestamp;
    self->vstart[probefunc] = vtimestamp;

    @function_entry_count[ufunc(uregs[R_PC])] = count();
}

pid$1:$2::return
/self->start[probefunc]/
{
    this->call_depth = --self->call_depth;

    this->wall_elapsed = timestamp - self->start[probefunc];
    self->start[probefunc] = 0;
    this->cpu_elapsed = vtimestamp - self->vstart[probefunc];
    self->vstart[probefunc] = 0;

    @function_walltime_inc[ufunc(uregs[R_PC])] = sum(this->wall_elapsed);
    @function_walltime_exc[ufunc(uregs[R_PC])] = sum(this->wall_elapsed);
    @function_cputime_inc[ufunc(uregs[R_PC])] = sum(this->cpu_elapsed);
    @function_cputime_exc[ufunc(uregs[R_PC])] = sum(this->cpu_elapsed);
    @function_return_count[ufunc(uregs[R_PC])] = count();
}       

pid$1:$2::return
/this->call_depth > 0/
{
    this->caller_pc = self->func_pcs[this->call_depth - 1];
    @function_walltime_exc[ufunc(this->caller_pc)] = sum(-(this->wall_elapsed));
    @function_cputime_exc[ufunc(this->caller_pc)] = sum(-(this->cpu_elapsed));
}       

dtrace:::END
{       
    /* normalize to millisecons */
    normalize(@function_walltime_inc, 1000000);
    normalize(@function_walltime_exc, 1000000);
    normalize(@function_cputime_inc, 1000000);
    normalize(@function_cputime_exc, 1000000);

    printf("\n");
    printf("%-60s %21s %21s %25s\n", "", "INCLUSIVE", "EXCLUSIVE", "CALL COUNT");
    printf("%-60s %10s %10s %10s %10s %12s %12s\n",
            "MODULE`FUNCTION", "WALL [ms]", "CPU [ms]", "WALL [ms]", "CPU [ms]", "ENTRY", "RETURN");
    printa("%-60A %@10d %@10d %@10d %@10d %@12d %@12d\n",
            @function_walltime_inc, @function_cputime_inc,
            @function_walltime_exc, @function_cputime_exc,
            @function_entry_count, @function_return_count);
}

Note: I'm tracing both function entry count and return count as for some functions dtrace cannot instrument function returns properly, which completely messes up the call stack and hence the exclusive times. With both counts printed the problematic functions can be identified and if necessary removed from the tracing.

wich
  • 16,709
  • 6
  • 47
  • 72