3

We believe our go app has a memory leak.

In order to find out what's going on, we are trying with pprof.

We are having a hard time though understanding the reads. When connecting to go tool pprof http://localhost:6060/debug/pprof/heap?debug=1, a sample output is

Entering interactive mode (type "help" for commands, "o" for options)
(pprof) text
Showing nodes accounting for 17608.45kB, 100% of 17608.45kB total
Showing top 10 nodes out of 67
      flat  flat%   sum%        cum   cum%
12292.12kB 69.81% 69.81% 12292.12kB 69.81%  github.com/acct/repo/vendor/github.com/.../funcA /../github.com/acct/repo/vendor/github.com/../fileA.go
 1543.14kB  8.76% 78.57%  1543.14kB  8.76%  github.com/acct/repo/../funcB /../github.com/acct/repo/fileB.go
 1064.52kB  6.05% 84.62%  1064.52kB  6.05%  github.com/acct/repo/vendor/github.com/../funcC /../github.com/acct/repo/vendor/github.com/fileC.go
  858.34kB  4.87% 89.49%   858.34kB  4.87%  github.com/acct/repo/vendor/golang.org/x/tools/imports.init /../github.com/acct/repo/vendor/golang.org/x/tools/imports/zstdlib.go
  809.97kB  4.60% 94.09%   809.97kB  4.60%  bytes.makeSlice /usr/lib/go/src/bytes/buffer.go
  528.17kB  3.00% 97.09%   528.17kB  3.00%  regexp.(*bitState).reset /usr/lib/go/src/regexp/backtrack.go

(Please forgive the clumsy obfuscation)

We interpret funcA to be consuming nearly 70% of memory - but this being around 12MB.

Now top though shows:

top - 18:09:44 up  2:02,  1 user,  load average: 0,75, 0,56, 0,38
Tasks: 166 total,   1 running, 165 sleeping,   0 stopped,   0 zombie
%Cpu(s):  3,7 us,  1,6 sy,  0,0 ni, 94,3 id,  0,0 wa,  0,0 hi,  0,3 si,  0,0 st
KiB Mem : 16318684 total, 14116728 free,  1004804 used,  1197152 buff/cache
KiB Swap:        0 total,        0 free,        0 used. 14451260 avail Mem 

  PID USER      PR  NI    VIRT    RES    SHR S  %CPU %MEM     TIME+ COMMAND                                                                                                              
 4902 me     20   0  1,371g 0,096g 0,016g S  12,9  0,6   1:58.14 mybin

which suggests 1.371 GB of memory used....where is it gone???

Also, pprof docs are quite frugal. We are having difficulties even to understand how it should be used. Our binary is a daemon. For example:

  1. If we start a reading with go tool pprof http://localhost:6060/debug/pprof/heap, is this a one time reading at this particular time or an aggregate over time?
  2. Sometimes hitting text later again in interactive mode seems to report the same values. Are we actually looking at the same values? Do we need to restart go tool pprof... to get fresh values?
  3. Is it a reading of the complete heap, or of some specific go routine, of a specific point in the stack....???

Finally, is this interpretation correct (from http://localhost:6060/debug/pprof/):

/debug/pprof/

profiles:
0   block
64  goroutine
45  heap
0   mutex
13  threadcreate

The binary has 64 open go routines and a total of 45MB of heap memory?

Jonathan Hall
  • 75,165
  • 16
  • 143
  • 189
transient_loop
  • 5,984
  • 15
  • 58
  • 117
  • 1
    A memory leak is almost always created by goroutines that don't return, and profiling your allocations isn't going to tell you where goroutines didn't return. The first thing to look at would be a stack trace. – JimB Dec 21 '17 at 23:28
  • thanks @JimB, sounds like wise advice – transient_loop Dec 21 '17 at 23:33
  • 3
    1.371 GB is the virtual memory, not the real memory usage of your binary. For more info check out https://serverfault.com/questions/138427/top-what-does-virtual-memory-size-mean-linux-ubuntu – A.Essam Dec 22 '17 at 08:52
  • 1
    Note the line `Showing nodes accounting for 17608.45kB, 100% of 17608.45kB total`. A remote profile like this takes samples for a certain period, I believe the default is 30 seconds. It will only show memory allocations *during the monitoring period*; you won't see a total equal to the total memory usage of the application. If it kept a record of every allocation since the app started, memory usage would be much, much higher just to store it. – Adrian Dec 22 '17 at 14:10
  • this is a very important insight I get from this @Adrian, thank you very much – transient_loop Dec 22 '17 at 14:13
  • @Adrian do you know how I could get more samples over time? Is querying via `top` inside the `go tool pprof` process ok or should I restart `go tool pprof` several times? – transient_loop Dec 22 '17 at 14:19
  • 1
    Per `go tool pprof -help`, you can pass `-seconds Duration for time-based profile collection` to control how long it gathers samples for profiling. – Adrian Dec 22 '17 at 15:39

0 Answers0