My Go application is deployed in a Docker container with a 1 GB RAM limit, but over time the process gets OOM-killed.
I suspected a memory leak, but after analyzing the heap profile of the process with pprof, something seems off.
| runtime.MemStats (bytes) | Time T1 | Time T2 | Time T3 |
|---|---:|---:|---:|
| Sys | 293902584 | 432449784 | 570800376 |
| HeapAlloc | 47299656 | 63375376 | 68294696 |
| HeapSys | 263323648 | 397541376 | 531496960 |
| HeapIdle | 175882240 | 297140224 | 431710208 |
| HeapInuse | 87441408 | 100401152 | 99786752 |
| HeapReleased | 153509888 | 297140224 | 431677440 |
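For context, these fields can also be read directly with `runtime.ReadMemStats`; the sketch below is just one way to log the same values over time (the 30-second interval and log format are illustrative, not our actual setup):

```go
package main

import (
	"log"
	"runtime"
	"time"
)

// logMemStats periodically logs the same MemStats fields shown in the table above.
func logMemStats(interval time.Duration) {
	var m runtime.MemStats
	for range time.Tick(interval) {
		runtime.ReadMemStats(&m)
		log.Printf("Sys=%d HeapAlloc=%d HeapSys=%d HeapIdle=%d HeapInuse=%d HeapReleased=%d",
			m.Sys, m.HeapAlloc, m.HeapSys, m.HeapIdle, m.HeapInuse, m.HeapReleased)
	}
}

func main() {
	go logMemStats(30 * time.Second) // interval is arbitrary
	select {}                        // stand-in for the real server's main loop
}
```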
My understanding is as follows:
- HeapAlloc (heap memory currently allocated by the Go process) always stays below 70 MB.
- HeapSys is the heap memory obtained from the OS. Shouldn't this value decrease when heap memory is released by the GC?
- I guess HeapInuse is also okay here: it includes some extra memory beyond what is allocated to live objects, which the process can use for new allocations without asking the OS.
- HeapReleased: the amount of heap memory the runtime reports as released back to the OS (see the worked numbers in the sketch after this list).
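To make the relationship between these fields concrete, here is the arithmetic for the T3 snapshot (values copied from the table above; this is only a worked example, not a new measurement):

```go
package main

import "fmt"

func main() {
	// T3 values from the table above.
	const (
		heapSys      = 531496960
		heapIdle     = 431710208
		heapInuse    = 99786752
		heapReleased = 431677440
	)
	// HeapSys = HeapInuse + HeapIdle, so the idle memory not yet returned
	// to the OS is HeapIdle - HeapReleased.
	fmt.Println("idle but not released:", heapIdle-heapReleased) // 32768 (~32 KiB)
	// Heap memory the runtime has not marked as returned to the OS.
	fmt.Println("HeapSys - HeapReleased:", heapSys-heapReleased) // 99819520 (~95 MiB), roughly HeapInuse
}
```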
In our case, HeapSys keeps growing until the 1 GB limit is reached and the container is OOM-killed.
My question is: shouldn't HeapSys shrink after memory is released by the GC? That is not happening here. The top command output is in sync with HeapSys and shows the process using almost as much memory as HeapSys reports. Or am I missing something?
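For reference, the "top is in sync with HeapSys" observation can be reproduced from inside the process on Linux by reading VmRSS (what top shows as RES) from /proc/self/status next to HeapSys. This is a minimal sketch of that comparison, not part of the original debugging setup:

```go
package main

import (
	"bufio"
	"fmt"
	"os"
	"runtime"
	"strings"
)

func main() {
	var m runtime.MemStats
	runtime.ReadMemStats(&m)

	// VmRSS in /proc/self/status is the resident set size top reports (in kB).
	f, err := os.Open("/proc/self/status")
	if err != nil {
		panic(err)
	}
	defer f.Close()

	sc := bufio.NewScanner(f)
	for sc.Scan() {
		if strings.HasPrefix(sc.Text(), "VmRSS:") {
			fmt.Printf("%s | HeapSys = %d bytes | HeapReleased = %d bytes\n",
				sc.Text(), m.HeapSys, m.HeapReleased)
		}
	}
}
```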
Note: We are using Go 1.13
Edit: kernel logs from the OOM kill:
2022-06-10T08:03:19.679701-06:00 myserver kernel: rbace-server invoked oom-killer: gfp_mask=0xd0, order=0, oom_score_adj=0
2022-06-10T08:03:19.679824-06:00 myserver kernel: CPU: 11 PID: 3622 Comm: rbace-server Kdump: loaded Tainted: G W ------------ T 3.10.0-1160.59.1.el7.x86_64 #1
2022-06-10T08:03:19.679865-06:00 myserver kernel: Hardware name: Dell Inc. PowerEdge R740xd/0DY2X0, BIOS 2.10.0 11/12/2020
2022-06-10T08:03:19.679905-06:00 myserver kernel: Call Trace:
2022-06-10T08:03:19.679945-06:00 myserver kernel: [<ffffffffa37865b9>] dump_stack+0x19/0x1b
2022-06-10T08:03:19.679986-06:00 myserver kernel: [<ffffffffa3781658>] dump_header+0x90/0x229
2022-06-10T08:03:19.680027-06:00 myserver kernel: [<ffffffffa329da38>] ? ep_poll_callback+0xf8/0x220
2022-06-10T08:03:19.680068-06:00 myserver kernel: [<ffffffffa31c1fe6>] ? find_lock_task_mm+0x56/0xc0
2022-06-10T08:03:19.680108-06:00 myserver kernel: [<ffffffffa323d2d8>] ? try_get_mem_cgroup_from_mm+0x28/0x60
2022-06-10T08:03:19.680146-06:00 myserver kernel: [<ffffffffa31c254d>] oom_kill_process+0x2cd/0x490
2022-06-10T08:03:19.680185-06:00 myserver kernel: [<ffffffffa32416cc>] mem_cgroup_oom_synchronize+0x55c/0x590
2022-06-10T08:03:19.680222-06:00 myserver kernel: [<ffffffffa3240b30>] ? mem_cgroup_charge_common+0xc0/0xc0
2022-06-10T08:03:19.680261-06:00 myserver kernel: [<ffffffffa31c2e34>] pagefault_out_of_memory+0x14/0x90
2022-06-10T08:03:19.680301-06:00 myserver kernel: [<ffffffffa377fb95>] mm_fault_error+0x6a/0x157
2022-06-10T08:03:19.680337-06:00 myserver kernel: [<ffffffffa37948d1>] __do_page_fault+0x491/0x500
2022-06-10T08:03:19.680379-06:00 myserver kernel: [<ffffffffa3794975>] do_page_fault+0x35/0x90
2022-06-10T08:03:19.680420-06:00 myserver kernel: [<ffffffffa3790778>] page_fault+0x28/0x30
2022-06-10T08:03:19.680717-06:00 myserver kernel: Memory cgroup out of memory: Kill process 85344 (rbace-server) score 1035 or sacrifice child
2022-06-10T08:03:19.680770-06:00 myserver kernel: Killed process 3518 (rbace-server), UID 0, total-vm:2506864kB, anon-rss:1115028kB, file-rss:0kB, shmem-rss:0kB
2022-06-10T08:03:19.906133-06:00 myserver kernel: XFS (dm-46): Unmounting Filesystem
2022-06-10T08:03:19.939645-06:00 myserver kernel: device-mapper: ioctl: remove_all left 100 open device(s)