linux perf report inconsistent behavior

Question

I have an application I'm profiling using perf and I find the results when using perf report are not consistent, and I can't discern the pattern.

I start the application and profile it by pid for 60 seconds:

perf record -p <pid> -o <file> sleep 60

And when I load the results with perf report -i <file>, sometimes I see a "+" in the far left column that lets me drill down into the function call trees when I press ENTER, and sometimes that "+" is not there. It seems to depend on some property of the recorded file: I have a collection of recorded files, some of which allow this drill-down and some of which do not.

Any suggestions on how to get consistent behavior here would be appreciated.

The "+" in the leftmost column for drilling down into the call tree is not always there. I would like it to be. — John S, Nov 06 '17 at 20:39
I've tried different sample lengths from 10s to 400s and the results are inconsistent. Sometimes it's there, sometimes not. — John S, Nov 06 '17 at 20:52
A working hypothesis is: (1) This is a *sampling* profiler which will 'hit' different functions statistically, leading to different call trees if some functions are not 'hit' on a particular run (2) Sampling isn't starting at exactly the same point in each run. If there is a setting for changing the sampling frequency, try that (ideally choose a sampling frequency relatively prime to the time-slice frequency, to minimize the "stampeding herd" effect). — njuffa, Nov 06 '17 at 21:02

Bram · Answer 1 · 2017-11-07T01:00:32.610

The default event measured by perf record is cpu-cycles (or, depending on the machine, cpu-cycles:p or cpu-cycles:pp).

Are you sure your application is not sleeping a lot? Does it consume a lot of cpu cycles?

Try a perf measurement on something that stresses the CPU by doing a lot of computations:

$ sudo apt-get install stress
$ perf record -e cpu-cycles --call-graph fp stress --cpu 1 --timeout 5
$ perf report

Subsequent runs should then show more or less similar results.

If your program is CPU-intensive and the call stacks still differ a lot between runs, look at the --call-graph option, since perf can record call graphs with different methods:

fp (frame pointer)
lbr (last branch record)
dwarf

A different method may give better results.
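
To compare the call-graph methods on the same process, it may help to build the perf record command the same way every time, so that only the method changes between recordings. The following is a minimal sketch: perf_record_cmd is a hypothetical helper (not part of perf) that only prints the command line, and the pid, file name and duration in the usage note are placeholders. Note that the command in the question passes neither -g nor --call-graph; whether that explains the missing "+" is an assumption to test, not something confirmed here.

```shell
#!/bin/sh
# Hypothetical helper (not part of perf): print a perf record command line
# for one call-graph method, so every recording uses the same options.
# Usage: perf_record_cmd METHOD PID OUTFILE SECONDS
perf_record_cmd() {
    method=$1
    pid=$2
    out=$3
    secs=$4
    # Accept only the three methods listed above.
    case "$method" in
        fp|dwarf|lbr) ;;
        *) echo "unknown call-graph method: $method" >&2; return 1 ;;
    esac
    # --call-graph asks perf record to store call chains with the samples;
    # "sleep SECONDS" bounds the recording time, as in the question.
    echo "perf record --call-graph $method -p $pid -o $out sleep $secs"
}
```

For example, perf_record_cmd dwarf 1234 app.data 60 prints perf record --call-graph dwarf -p 1234 -o app.data sleep 60, which can then be run as-is and inspected with perf report -i app.data. Keep in mind that fp only produces useful call chains if the binary was built with frame pointers, and lbr needs hardware support.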
