I'm profiling an application with perf, and the output of perf report is inconsistent from one recording to the next in a way I can't find a pattern in.

I start the application and profile it by pid for 60 seconds:

perf record -p <pid> -o <file> sleep 60

When I load the results with perf report -i <file>, I sometimes see a "+" in the far-left column that lets me drill down into the function call tree by pressing ENTER, and sometimes that "+" is missing. It seems to depend on some property of the recorded file: I have a collection of recorded files, some of which allow this drill-down and some of which do not.

Any suggestions on how to get consistent behavior here would be appreciated.

John S
  • Not consistent how? – njuffa Nov 06 '17 at 20:39
  • The "+" in the leftmost column for drilling down into the call tree is not always there. I would like it to be. – John S Nov 06 '17 at 20:39
  • Try running for longer than 60 seconds, e.g. five minutes. – njuffa Nov 06 '17 at 20:43
  • I've tried different sample lengths from 10s to 400s and the results are inconsistent. Sometimes it's there, sometimes not. – John S Nov 06 '17 at 20:52
  • A working hypothesis is: (1) This is a *sampling* profiler, which will 'hit' different functions statistically, leading to different call trees if some functions are not 'hit' on a particular run. (2) Sampling isn't starting at exactly the same point in each run. If there is a setting for changing the sampling frequency, try that (ideally choose a sampling frequency relatively prime to the time-slice frequency, to minimize the "stampeding herd" effect). – njuffa Nov 06 '17 at 21:02

1 Answer

The default event measured by perf record is cpu-cycles (or, depending on the machine, cpu-cycles:p or cpu-cycles:pp).

Are you sure your application is not sleeping a lot? Does it actually consume a lot of CPU cycles? If it is mostly idle, a cycles-based profile will collect few samples.

Try a perf measurement on something that stresses the CPU by doing a lot of computations:

$ apt-get install stress
$ perf record -e cpu-cycles --call-graph fp stress --cpu 1 --timeout 5
$ perf report

Subsequent runs should then show more or less similar results.

If your program is CPU-intensive and the call stacks still differ a lot between runs, look at the --call-graph option, since perf can record call graphs with different methods:

  • fp (frame pointer)
  • lbr (last branch record)
  • dwarf

A different method may give better results for your binary.
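To compare the methods on an already running process, something like the following sketch could help. It records the same pid once per call-graph method into separate files, so each can be opened with perf report -i. The function name record_all, the output file names, and the DRY_RUN switch are made up for illustration; lbr needs hardware support and may fail on some machines. It also assumes that the "+" in perf report shows up when call chains were recorded, which I have not verified against your files.

```shell
#!/bin/sh
# Sketch only: record one process once per call-graph method.
# record_all, perf.<method>.data and DRY_RUN are hypothetical names,
# not part of perf itself.
# DRY_RUN defaults to 1, which only prints the commands; set DRY_RUN=0
# to actually run perf.

record_all() {
    pid=$1        # pid of the process to profile
    duration=$2   # seconds to record, passed to sleep

    for method in fp dwarf lbr; do
        cmd="perf record -e cpu-cycles --call-graph $method -p $pid -o perf.$method.data sleep $duration"
        if [ "${DRY_RUN:-1}" = "1" ]; then
            echo "$cmd"
        else
            # lbr in particular may be unsupported; report and continue.
            $cmd || echo "recording with --call-graph $method failed" >&2
        fi
    done
}
```

Calling record_all <pid> 60 prints the three commands; with DRY_RUN=0 it runs them, after which perf report -i perf.fp.data (and so on) can be compared side by side.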

Bram