
There is a pprof utility in the google-perftools package. It converts profile files from the google-perftools CPU profiler and heap profiler into nice images like https://github.com/gperftools/gperftools/tree/master/doc/pprof-test-big.gif and https://github.com/gperftools/gperftools/tree/master/doc/heap-example1.png

The format of pprof's input file for CPU profiles is described here: https://github.com/gperftools/gperftools/tree/master/doc/cpuprofile-fileformat.html

but the format of heap profile input files is not described in the repository.

What is the heap profile format, and how can I generate such a file from my code? I can already generate the CPU profiler format, so I am interested in the differences between the two formats.

Mitch
osgx

1 Answer


It seems the format is not binary, as for the CPU profiler, but textual:

First line:

 heap profile:   1:   2 [ 3:  4] @ heapprofile

Regex (not complete):

 (\d+): (\d+) \[(\d+): (\d+)\] @ ([^/]*)(/(\d+))?

where

  • 1 & 2 are the "in-use" stats: the first number is the count of live allocations, the second the byte count
  • 3 & 4 are the "total allocated" stats, with the same meanings
  • heapprofile is the profile type
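
The header line above can be picked apart with a regular expression; here is a minimal sketch, where the sample numbers and the field names (`inuse_count`, etc.) are my own labels for illustration, not official ones:

```python
import re

# Hypothetical header line; the numbers are made up for illustration.
header = "heap profile:    102:  2954510 [  102:  2954510] @ heapprofile"

# Fields: in-use count, in-use bytes, total count, total bytes, profile type.
m = re.match(
    r"heap profile:\s+(\d+):\s+(\d+)\s+\[\s*(\d+):\s+(\d+)\]\s+@\s+(\S+)",
    header,
)
inuse_count, inuse_bytes, total_count, total_bytes = (int(g) for g in m.groups()[:4])
profile_type = m.group(5)
```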

Then the profile itself follows, in many lines of the form:

 1: 2 [ 3: 4] @ 0x00001 0x00002 0x00003

where "1: 2" and "3: 4" have the same meaning as in the first line, but count only memory malloc'ed from the given call site; 0x00001 0x00002 0x00003 is the call stack of the call site.
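
A per-call-site record line can be parsed the same way; again a sketch, with a made-up sample line:

```python
import re

# Hypothetical record line in the format sketched above.
line = "    12:     4096 [    30:    10240] @ 0x00001 0x00002 0x00003"

# Same four counters as the header, then the call-stack addresses.
m = re.match(r"\s*(\d+):\s+(\d+)\s+\[\s*(\d+):\s+(\d+)\]\s+@\s+(.*)", line)
inuse_count, inuse_bytes, total_count, total_bytes = (int(g) for g in m.groups()[:4])
call_stack = m.group(5).split()  # list of program-counter addresses
```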

Then comes an empty line and "MAPPED_LIBRARIES:". From the next line on, something very like a copy of /proc/<pid>/maps follows.
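
Putting the pieces together, a minimal sketch of emitting a file in this format from your own allocation statistics. The records and the MAPPED_LIBRARIES line here are hypothetical; on Linux the real section would be a copy of /proc/<pid>/maps, and the exact column widths may differ from what real heap profiles use:

```python
# Hypothetical per-call-site records:
# (inuse_count, inuse_bytes, total_count, total_bytes, call_stack)
records = [
    (1, 1024, 3, 3072, ["0x00001", "0x00002"]),
    (2, 256, 2, 256, ["0x00003"]),
]

# Header totals are the sums over all records.
inuse_count = sum(r[0] for r in records)
inuse_bytes = sum(r[1] for r in records)
total_count = sum(r[2] for r in records)
total_bytes = sum(r[3] for r in records)

lines = ["heap profile: %6d: %8d [%6d: %8d] @ heapprofile"
         % (inuse_count, inuse_bytes, total_count, total_bytes)]
for ic, ib, tc, tb, stack in records:
    lines.append("%6d: %8d [%6d: %8d] @ %s" % (ic, ib, tc, tb, " ".join(stack)))

# Empty line, then the mapped-libraries section (placeholder content here;
# a real generator would copy /proc/<pid>/maps).
lines.append("")
lines.append("MAPPED_LIBRARIES:")
lines.append("00400000-00452000 r-xp 00000000 08:02 173521  /usr/bin/example")

profile = "\n".join(lines) + "\n"
```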

osgx
    This is helpful but I'd love to get more details and examples. Is it one line for each allocation and free? Are the lines sorted by time? Are frees indicated somehow? Does the first line indicate the state at the beginning of the time period covered by the .heap file, or the end? I'm trying to find the highwater mark for my program, which seems like it should be easy, but is turning out to be quite difficult. – Bruce Dawson Jul 07 '14 at 17:58
  • Bruce Dawson, you can try to use the plain [glibc's `mtrace`](http://man7.org/linux/man-pages/man3/mtrace.3.html), http://www.gnu.org/s/hello/manual/libc/Allocation-Debugging.html and http://stackoverflow.com/questions/2593284/enable-mtrace-malloc-trace-for-binary-program (how to turn it on). Don't run `mtrace` perl script on output, but parse raw file (`MALLOC_TRACE=file`), `Interpreting the traces` section has example of raw trace, `+` is for malloc, `-` is for free, and `<` and `>` are reallocs. If you want, you can ask question about mtrace format and I'll try to describe it. – osgx Jul 07 '14 at 19:00
  • Thanks for the offer. I stepped through the code watching /proc/PID/smaps and eventually found that my 50 MB allocation was from dlopen. A bit more sleuthing and I found that it was the BSS section. Somebody declared an (unused!!!) 50 MB global variable. It is quite weird that the loader puts the BSS section away from the rest of the .so, not obviously associated with it. It is also weird that it fully commits that memory (straight to RSS). On the upside, it's an easy fix for a big win. – Bruce Dawson Jul 08 '14 at 18:05