
I'm fascinated by the ability of 'perf' to record call graphs and am trying to understand how to use it to understand a new code base.

I compiled the code in debug mode, and ran unit tests using the following command:

perf record --call-graph dwarf make test

This creates a 230 meg perf.data file. I then write out the call graph:

perf report --call-graph --stdio > callgraph.txt

This creates a 50 meg file.

Ideally, I would like to see only code belonging to the project, not kernel code, system calls, the C++ standard library, or even Boost and other third-party software. Currently I see items like __GI___dl_iterate_phdr, _Unwind_Find_FDE, etc.

I love the flamegraph project. However, that visualization isn't good for code comprehension. Are there any other projects, write-ups, ideas, which might be helpful?

Shahbaz
  • Try to filter your report by "dso" of the application. And any xref tool will be more useful to understand new code base (cscope, lxr, http://osxr.org, code.metager.de/source, GUI IDEs) – osgx Jan 09 '16 at 04:13
  • `perf report -g` for huge application should not be dumped to external file; it will work without redirection with interactive perf report TUI interface. Also try https://github.com/jrfonseca/gprof2dot script to visualize perf report call-graph output as picture (graph); and also Brendan D. Gregg's interactive svg/js [FlameGraphs](http://www.brendangregg.com/flamegraphs.html) (he often shows many megabyte raw dumps of report as lot of A4 pages) - instruction for the perf: http://www.brendangregg.com/FlameGraphs/cpuflamegraphs.html#perf – osgx May 30 '17 at 05:19

1 Answer


The perf report -g output for a huge application is too verbose to be worth dumping to an external file. A perf.data collected with -g works without file redirection in the interactive perf report TUI. To find the functions that took the most time, you can drop call-graph collection (perf record without -g) or report self time only with perf report --no-children.

The gprof2dot script (https://github.com/jrfonseca/gprof2dot) can visualize large perf call graphs as a compact picture (graph).
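As a rough sketch of how that pipeline can look, assuming gprof2dot is installed as a `gprof2dot` command (for example via pip; from a git checkout it would be `gprof2dot.py`) and that Graphviz provides `dot`. The function name `render_callgraph` and the threshold values are only illustrative:

```shell
# Render the call graph stored in a perf.data file as an SVG picture.
# Usage: render_callgraph [perf.data] [output.svg]
# The body runs in a subshell so the helper variables do not leak.
render_callgraph() (
    data="${1:-perf.data}"
    out="${2:-callgraph.svg}"
    # perf script dumps the raw samples, c++filt demangles C++ symbols,
    # gprof2dot builds the graph, and dot draws it.
    # -n / -e hide nodes and edges below the given percentage of samples;
    # 2 and 1 are arbitrary example thresholds, tune them for your project.
    perf script -i "$data" \
        | c++filt \
        | gprof2dot -f perf -n 2 -e 1 \
        | dot -Tsvg -o "$out"
)
```

Calling `render_callgraph perf.data callgraph.svg` should then leave a graph that can be opened in a browser. Raising the thresholds is the easiest way to make a big graph readable.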

There are also Brendan D. Gregg's interactive FlameGraphs in SVG/JS; he often notes in presentations that perf report -g output amounts to multi-megabyte raw dumps that would fill many A4 pages. Usage instructions for perf are at http://www.brendangregg.com/FlameGraphs/cpuflamegraphs.html#perf:

# git clone https://github.com/brendangregg/FlameGraph  # or download it from github
# cd FlameGraph
# perf record -F 99 -g -- ../command
# perf script | ./stackcollapse-perf.pl > out.perf-folded
# ./flamegraph.pl out.perf-folded > perf-kernel.svg
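Since the folded file is plain text (one `frame;frame;... count` line per unique stack), it can be filtered before rendering so that only project code remains. The following is a minimal sketch, not part of the FlameGraph project: `keep_project_frames` is a hypothetical helper, and it assumes the project's symbols can be recognized by a regular expression such as a namespace prefix (`myproject::` here is made up).

```shell
# Keep only the frames matching a regular expression in folded stacks.
# Reads folded stacks on stdin, writes filtered folded stacks to stdout.
# $1: extended regular expression for the frames to keep (hypothetical
#     example: '^myproject::')
keep_project_frames() {
    awk -v pat="$1" '
    {
        count = $NF                    # sample count is the last field
        stack = $0
        sub(/ [0-9]+$/, "", stack)     # strip the trailing count
        n = split(stack, frames, ";")
        kept = ""
        for (i = 1; i <= n; i++)
            if (frames[i] ~ pat)
                kept = (kept == "" ? frames[i] : kept ";" frames[i])
        if (kept != "")
            total[kept] += count       # merge stacks that became identical
    }
    END { for (s in total) print s, total[s] }
    ' | sort
}
```

For example, `keep_project_frames '^myproject::' < out.perf-folded > project.folded` followed by `./flamegraph.pl project.folded > project.svg`. Note that dropping the intermediate library frames makes project functions look like direct callers of each other even when the call went through third-party code.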

PS: Why are you profiling the make process? Try to select some tests and profile only those. Use a lower sampling frequency to get a smaller perf.data file. Also disable kernel-mode samples with the :u suffix on the default event "cycles": perf record -F 99 -g -e cycles:u -- ../command
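Put together with the "dso" filter suggested in the comments, this could look roughly like the sketch below. `profile_one_test` is a hypothetical wrapper, and it assumes the tests live in a standalone binary whose file name is also the DSO name perf reports for it:

```shell
# Profile a single test binary in user mode at a low sampling rate, then
# print a report restricted to samples that landed in that binary.
# Usage: profile_one_test /path/to/test_binary [test arguments...]
profile_one_test() (
    test_bin="$1"
    shift
    # -F 99      sample at 99 Hz to keep perf.data small
    # -e cycles:u  count user-mode cycles only, no kernel samples
    perf record -F 99 -g -e cycles:u -o perf.data -- "$test_bin" "$@" &&
    # --dsos keeps only samples whose symbol belongs to the listed DSOs;
    # --no-children reports self time instead of accumulated callee time.
    perf report -i perf.data --stdio --no-children \
        --dsos "$(basename "$test_bin")"
)
```

A call such as `profile_one_test ./build/unit_tests` (the path is made up) would then record and report in one step. If frame-pointer unwinding gives broken stacks, `-g` can be swapped for `--call-graph dwarf` as in the question, at the cost of a larger perf.data.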

osgx