11

I'm using perf record -a --call-graph dwarf -p XXX sleep 1 to record some function calls then perf report to view that data, however it would be very helpful if I could also see source line numbers to know exactly where each function call was made. For example:

-   4.18%  testbinary  testbinary                [.] malloc 
   - malloc 
      - 99.57% operator new(unsigned long)
         + 7.28% MyFunction()

I want to know where exactly in MyFunction() are those new operators called (without me guessing by looking at the whole function source code)

P.S.: Binary is compiled with -m64 -O2 -ggdb3

QuantumBlack
  • 1,549
  • 11
  • 27
  • 1
    How did you compile your code? Lines of source are reported in "annotate" mode, focus on the self time of MyFunction and press "a" in interactive perf report or use `perf annotate` command (with some symbol `-s=MyFunction`). – osgx Jul 02 '17 at 02:12
  • 1
    @osgx going into annotate mode looks like what I need, however it seems I can only annoatate the topmost function (in this case `malloc`). Even if I go with the cursor on the `MyFunction` (or even deeper), when I annotate it I get the annotation for `malloc` and NOT for the selected function. When I select my function and press the right arrow key, I see the option `Annotate MyFunction()`, but when I click on that option no luck, I get the same annotation, `malloc` one P.S. Compiling with `-m64 -O2 -ggdb3` – QuantumBlack Jul 02 '17 at 13:09
  • 1
    Shocker, just find the MyFunction as topmost; when it has the "self" events (you can use search with / button, press / then enter part of MyFunction name. Press / and enter to see all functions again). What is your kernel version (perf usually has the same version, and there can be some UI problem with older perf)? – osgx Jul 02 '17 at 16:38
  • @osgx that worked, however the result is not really the expected one. Checking the function I am interested in as topmost function will show me usage relative to total function time. What I need is to know where those `new operators` are happening in `MyFunction()`. If I check the topmost function as you explained, all I get is `anotherFunction("someString");` `0.12% lea -0xc0(%rbp),%rdx` `callq std::basic_string – QuantumBlack Jul 02 '17 at 17:37
  • 0.12% is relative of this instruction to this function time: annotate mode does not have info about child functions (callees). This new can be some temporary std::string allocation to call AnotherFunction (can you add to post assembly fragment and source fragment)? – osgx Jul 02 '17 at 18:14
  • That's exactly what it is, temporary string allocations, but the idea was to have a list or something of these locations without manually going through the code and find them by manual search. This was a simple example, but complex code will be much harder to trace – QuantumBlack Jul 03 '17 at 03:00

2 Answers2

8

Fragments (full lines) of source code are printed by perf in annotate mode (man page; relevant part of The Perf Tutorial). Use perf annotate -s=MyFunction or in perf report scroll down to the subtree where your MyFunction is root of tree (line where self time is reported; you can use / command to search for it) and then select a button (or Enter then Annotate "MyFunction").

Source code and its lines should be visible near assembly lines in Annotate mode. http://man7.org/linux/man-pages/man1/perf-annotate.1.html

This command reads the input file and displays an annotated version of the code. If the object file has debug symbols then the source code will be displayed alongside assembly code.

   -l, --print-line
       Print matching source lines (may be slow).
  --source
      Interleave source code with assembly code. Enabled by default,
       disable with `--no-source`.
   -s, --symbol=<symbol>
       Symbol to annotate.

Perf report may use srclines in sorting (--sort= option) but instructions are unclear. Its man page documents --source option too, but apparently it is used only in Annotate some_function mode: http://man7.org/linux/man-pages/man1/perf-report.1.html

  --source
       Interleave source code with assembly code. Enabled by default,
       disable with --no-source.
osgx
  • 90,338
  • 53
  • 357
  • 513
8

I accidentally found it loosely documented in perf script, but it applies to other commands as well: -F option accepts srcline. So you can do -F+srcline to add the line number to existing columns.

Example: perf report -g fractal -F+period,srcline

Samples: 22K of event 'cycles:u', Event count (approx.): 13031011295
  Children      Self        Period  Source:Line                           Command  Shared Object        Symbol
+   99.98%    38.76%    5051224000  test.cpp:7                            a        a                    [.] fib
+   96.42%     0.00%             0  _start+94372992700461                 a        a                    [.] _start
+   96.42%     0.00%             0  __libc_start_main+140304673091826     a        libc-2.29.so         [.] __libc_start_main
+   96.42%     0.00%             0  test.cpp:13                           a        a                    [.] main
+   21.47%    21.47%    2797741850  test.cpp:8                            a        a                    [.] fib
+   16.69%    16.69%    2174469736  test.cpp:4                            a        a                    [.] fib
+   16.37%    16.36%    2132462705  test.cpp:6                            a        a                    [.] fib
+    6.69%     6.69%     871128215  test.cpp:5                            a        a                    [.] fib
Hi-Angel
  • 4,933
  • 8
  • 63
  • 86
  • What version is that? `perf version 3.13.11-ckt39` doesn't have an -F option – QuantumBlack May 06 '19 at 13:43
  • @Shocker wow, 3.13 is very old, 5 years is. I'm using `5.0` which is latest stable, but I imagine the option appeared much earlier. **edit**: actually, as of yesterday latest stable is `5.1`, but whatever. – Hi-Angel May 06 '19 at 15:02
  • Max I can check on my current kernel version is 4.x something. While it has the option I still can't see the line numbers, but judging from your example this is the solution. Thank you. Btw, did you use any special compile flags? – QuantumBlack May 07 '19 at 08:59
  • Nope, I didn't. I asked on IRC when the option was added, will reply back when I get an answer. – Hi-Angel May 07 '19 at 09:35
  • While searching for it, I stumbled upon [this blog post](https://www.kdab.com/improved-handling-inlined-frames-linux-perf-report/), which describes one more way: `perf report -g srcline`. What it does with regard to the question, is showing source lines when you expand a call chain *(for some reason at top level source lines not shown)*. – Hi-Angel May 07 '19 at 09:43
  • 1
    "Btw, did you use any special compile flags": well, I didn't use anything besides the `-g` option, that is. It's no special, you just not gonna have debugging information otherwise, but I thought it may be worth to mention explicitly. I usually use `-g3` as it adds even more stuff, but for line numbers just `-g` is enough. – Hi-Angel May 07 '19 at 09:50
  • 1
    Sorry, you were correct the whole time, during these last tests I forgot about the `-g` compile option (I have it on my main app). I see lines too now... btw the exact version I was able to test was `4.4.90`, it was probably added in version 4? Either way – QuantumBlack May 07 '19 at 10:56