Yes, there is special per-thread monitoring that lets you read perf counters from within userspace. See the manual page for perf_event_open(2).
Since perf's predefined cache events cover only L1i, L1d, and the last-level cache, you'll need to use PERF_TYPE_RAW mode with the event numbers from the manual for your CPU.
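As a rough illustration of the raw mode, here is a minimal sketch that counts one raw event on the calling thread. It assumes an x86 CPU, where the raw config is the event select in bits 0-7 and the unit mask in bits 8-15. The helper names (x86_raw_config, init_raw_counter, count_raw_event) are made up for this example, and the actual event/umask pair must come from your CPU's manual.

```c
#include <linux/perf_event.h>
#include <stdint.h>
#include <string.h>
#include <sys/ioctl.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <unistd.h>

/* x86 raw encoding (assumption: x86 PMU): event select in bits 0-7,
 * unit mask in bits 8-15. */
uint64_t x86_raw_config(uint8_t event_select, uint8_t umask)
{
    return (uint64_t)event_select | ((uint64_t)umask << 8);
}

/* Describe a raw event counted in user space only, initially disabled. */
void init_raw_counter(struct perf_event_attr *attr, uint64_t raw_config)
{
    memset(attr, 0, sizeof(*attr));
    attr->size = sizeof(*attr);
    attr->type = PERF_TYPE_RAW;
    attr->config = raw_config;
    attr->disabled = 1;
    attr->exclude_kernel = 1;
    attr->exclude_hv = 1;
}

/* Count the event while work() runs on the calling thread.
 * Returns 0 and stores the count in *count, or -1 if perf is unavailable. */
int count_raw_event(uint64_t raw_config, void (*work)(void), uint64_t *count)
{
    struct perf_event_attr attr;
    init_raw_counter(&attr, raw_config);

    /* glibc has no wrapper for this syscall.
     * pid = 0 and cpu = -1: the calling thread, on whatever CPU it runs. */
    int fd = (int)syscall(SYS_perf_event_open, &attr, 0, -1, -1, 0UL);
    if (fd < 0)
        return -1;

    ioctl(fd, PERF_EVENT_IOC_RESET, 0);
    ioctl(fd, PERF_EVENT_IOC_ENABLE, 0);
    work();
    ioctl(fd, PERF_EVENT_IOC_DISABLE, 0);

    /* With the default read_format, read() yields a single 64-bit count. */
    int ok = read(fd, count, sizeof(*count)) == (ssize_t)sizeof(*count);
    close(fd);
    return ok ? 0 : -1;
}
```

For example, count_raw_event(x86_raw_config(0x2E, 0x41), work, &n) would count Intel's architectural "LLC Misses" event. That pair is used here only because its encoding is well known; substitute the event you actually need. The call may fail with -1 if /proc/sys/kernel/perf_event_paranoid forbids it or the machine has no usable PMU (common in virtual machines).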
To implement profiling, you'll need to set sample_period (and request PERF_SAMPLE_IP in sample_type), poll/select the fd or wait for the SIGIO signal, and, when that happens, read the sample and its instruction pointer from the mmap'ed ring buffer. You may later resolve the returned instruction pointers to function names using a debugger such as GDB.
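The sampling steps might look roughly like the sketch below. It is not a complete profiler: the helper names, the 8-page buffer size, and the choice to drain the buffer only after the workload finishes are all assumptions made for brevity. A real tool would poll() in a loop from another thread or handle SIGIO, and would also handle PERF_RECORD_LOST records.

```c
#include <linux/perf_event.h>
#include <poll.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>
#include <sys/ioctl.h>
#include <sys/mman.h>
#include <sys/syscall.h>
#include <unistd.h>

#define DATA_PAGES 8 /* ring-buffer data pages; must be a power of two */

/* One sample, carrying the instruction pointer, every `period` events. */
void init_raw_sampler(struct perf_event_attr *attr, uint64_t raw_config,
                      uint64_t period)
{
    memset(attr, 0, sizeof(*attr));
    attr->size = sizeof(*attr);
    attr->type = PERF_TYPE_RAW;
    attr->config = raw_config;
    attr->sample_period = period;
    attr->sample_type = PERF_SAMPLE_IP;
    attr->wakeup_events = 1; /* make the fd readable after every sample */
    attr->disabled = 1;
    attr->exclude_kernel = 1;
    attr->exclude_hv = 1;
}

/* Copy `len` bytes out of the circular data area starting at offset `pos`. */
static void ring_copy(void *dst, const char *data, uint64_t size,
                      uint64_t pos, size_t len)
{
    for (size_t i = 0; i < len; i++)
        ((char *)dst)[i] = data[(pos + i) % size];
}

/* Print the IP of every pending sample; returns how many were consumed. */
unsigned drain_samples(struct perf_event_mmap_page *meta, size_t page_size)
{
    const char *data = (const char *)meta + page_size;
    uint64_t size = (uint64_t)DATA_PAGES * page_size;
    uint64_t head = __atomic_load_n(&meta->data_head, __ATOMIC_ACQUIRE);
    uint64_t tail = meta->data_tail;
    unsigned samples = 0;

    while (tail < head) {
        struct perf_event_header hdr;
        ring_copy(&hdr, data, size, tail, sizeof(hdr));
        if (hdr.size == 0)
            break; /* malformed record; avoid looping forever */
        if (hdr.type == PERF_RECORD_SAMPLE) {
            uint64_t ip; /* with PERF_SAMPLE_IP only, the body is just the IP */
            ring_copy(&ip, data, size, tail + sizeof(hdr), sizeof(ip));
            printf("ip=%#llx\n", (unsigned long long)ip);
            samples++;
        }
        tail += hdr.size;
    }
    /* Tell the kernel the space up to `tail` may be reused. */
    __atomic_store_n(&meta->data_tail, tail, __ATOMIC_RELEASE);
    return samples;
}

/* Sample `raw_config` on the calling thread while work() runs.
 * Returns the number of samples seen, or -1 if perf is unavailable. */
int sample_raw_event(uint64_t raw_config, uint64_t period, void (*work)(void))
{
    struct perf_event_attr attr;
    size_t page_size = (size_t)sysconf(_SC_PAGESIZE);
    size_t map_len = (1 + DATA_PAGES) * page_size; /* metadata + data pages */

    init_raw_sampler(&attr, raw_config, period);
    int fd = (int)syscall(SYS_perf_event_open, &attr, 0, -1, -1, 0UL);
    if (fd < 0)
        return -1;
    void *buf = mmap(NULL, map_len, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    if (buf == MAP_FAILED) {
        close(fd);
        return -1;
    }

    ioctl(fd, PERF_EVENT_IOC_ENABLE, 0);
    work();
    ioctl(fd, PERF_EVENT_IOC_DISABLE, 0);

    int samples = 0;
    struct pollfd pfd = { .fd = fd, .events = POLLIN };
    if (poll(&pfd, 1, 0) > 0) /* non-blocking: anything buffered? */
        samples = (int)drain_samples(buf, page_size);

    munmap(buf, map_len);
    close(fd);
    return samples;
}
```

The printed addresses are raw user-space instruction pointers; they still have to be mapped back to symbols, for instance with GDB's "info symbol" command or addr2line.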
Another option is to use SystemTap. You'll need empty implementations of start_profiling() and end_profiling(), just to switch SystemTap profiling on and off, with something like this:
global traceme, prof;

# Turn sampling on when the program enters start_profiling() ...
probe process("/path/to/your/executable").function("start_profiling") {
    traceme = 1;
}

# ... and off again when it enters end_profiling().
probe process("/path/to/your/executable").function("end_profiling") {
    traceme = 0;
}

probe perf.type(4).config(/* RAW value of perf event */).sample(10000) {
    # Ignore samples taken outside the start/end window.
    if (!traceme) next;
    prof[usymname(uaddr())] <<< 1;
}

probe end {
    foreach([sym+] in prof) {
        printf("%16s %d\n", sym, @count(prof[sym]));
    }
}
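On the program side, the marker functions could be as small as the sketch below. It assumes GCC or Clang (for the noinline attribute and the empty asm statement, which keep the calls from being optimized away); sum_strided and profiled_sum are hypothetical stand-ins for whatever code you want to profile. Building with debug information (-g) is likely needed so that SystemTap can resolve the function probes.

```c
#include <stddef.h>

/* Empty markers: they exist only so SystemTap has a function to probe.
 * noinline keeps them as real symbols, and the empty asm statement stops
 * the compiler from dropping calls to a function with no side effects. */
__attribute__((noinline)) void start_profiling(void)
{
    __asm__ volatile("" ::: "memory");
}

__attribute__((noinline)) void end_profiling(void)
{
    __asm__ volatile("" ::: "memory");
}

/* Hypothetical workload: sum every `stride`-th element of `data`. */
long sum_strided(const int *data, size_t n, size_t stride)
{
    long total = 0;
    for (size_t i = 0; i < n; i += stride)
        total += data[i];
    return total;
}

/* Only the code between the two markers is attributed in the profile. */
long profiled_sum(const int *data, size_t n, size_t stride)
{
    start_profiling();
    long total = sum_strided(data, n, stride);
    end_profiling();
    return total;
}
```

Keeping the markers empty means they add only the cost of a call when SystemTap is not attached.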