We want to pull stack traces from a running process. Pulling them directly with gstack is not an option, and using gdbserver works but is quite slow over the network. We were curious whether we could instead pull a core dump of the process with
gdb --ex "attach $PID" --ex "gcore core_dump_file" --ex "q"
copy the core dump to a container that has the compiled executable, and then analyze the core dump inside the container with
gdb $PATH_TO_EXECUTABLE core_dump_file --ex "thread apply all bt" --ex "q"
The core dump can reach sizes of hundreds of gigabytes in production, so we would like to filter the coredump as much as possible.
The memory sections written to the core dump can be filtered by changing the coredump_filter
file at /proc/$PID/coredump_filter
(see the man page for core). As long as bit 0 (dump anonymous private mappings) is set, everything works fine and we get the expected backtraces. It can be set with
echo 0x1 > "/proc/$PID/coredump_filter"
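The steps so far (set a mask, dump with gcore, put the mask back) can be sketched as one bash function. This is only a minimal sketch: the function name dump_filtered_core and the GDB_BIN and PROC_ROOT override variables are made up here for illustration, and it assumes that the proc file reports the current mask as bare hex digits without a 0x prefix.

```shell
#!/usr/bin/env bash
# Sketch: dump a core of a running process with a temporary coredump_filter
# mask, then restore the previous mask.
# GDB_BIN and PROC_ROOT are hypothetical knobs added for illustration only.
dump_filtered_core() {
    local pid=$1 mask=$2 out=$3
    local filter="${PROC_ROOT:-/proc}/${pid}/coredump_filter"
    local old rc

    # Remember the current mask (assumed to be bare hex digits, e.g. 00000033).
    old=$(cat "$filter") || return 1

    # Apply the requested mask, e.g. 0x1 for anonymous private mappings only.
    echo "$mask" > "$filter" || return 1

    # Attach, write the core file, and let gdb exit (batch mode).
    "${GDB_BIN:-gdb}" --batch -p "$pid" -ex "gcore ${out}"
    rc=$?

    # Put the original mask back, whether or not gdb succeeded.
    echo "0x${old}" > "$filter"
    return "$rc"
}
```

It would be called as, for example, dump_filtered_core "$PID" 0x1 core_dump_file. Restoring the old mask matters because the setting also applies to any core the kernel writes if the process crashes later.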
Unfortunately, the anonymous private mappings are very large and make up about 75% of the unfiltered coredump. For example, in one of our test cases, the default coredump with 0x33
was 5 GB, and the filtered coredump with 0x1
was still 3.8 GB.
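For reference, the two masks compared here can be decoded bit by bit. The following is a small sketch, assuming bash and the bit meanings listed in the man page for core (bits 0 to 8); the function name describe_coredump_filter is made up for illustration.

```shell
#!/usr/bin/env bash
# Sketch: print which mapping types a coredump_filter mask selects.
# Bit names follow the list in core(5); the function name is hypothetical.
describe_coredump_filter() {
    local mask=$(( $1 ))   # accepts hex such as 0x33 or decimal
    local names=(
        "anonymous private mappings"
        "anonymous shared mappings"
        "file-backed private mappings"
        "file-backed shared mappings"
        "ELF headers"
        "private huge pages"
        "shared huge pages"
        "private DAX pages"
        "shared DAX pages"
    )
    local bit
    for bit in "${!names[@]}"; do
        # Report every bit that is set in the mask.
        if (( mask & (1 << bit) )); then
            echo "bit ${bit}: ${names[bit]}"
        fi
    done
}

describe_coredump_filter 0x33
describe_coredump_filter 0x1
```

The default 0x33 has bits 0, 1, 4 and 5 set, while 0x1 keeps only bit 0, so the 1.2 GB saved in the test case comes from dropping anonymous shared mappings, ELF headers and private huge pages.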
If we filter as much as possible by setting all bits to zero with
echo 0x0 > "/proc/$PID/coredump_filter"
the resulting coredump becomes very small, only about 1 MB (down from several gigabytes). However, using thread apply all bt
fails with
Thread 1 (LWP 2653):
#0 0x00007fae7d4218fd in ?? ()
Backtrace stopped: Cannot access memory at address 0x7ffe21387490
With the fully filtered coredump, I can see that gdb does not know which shared libraries to load:
(gdb) info shared
No shared libraries loaded at this time.
What does gdb actually need to get the backtraces? Is there a way to filter out everything gdb does not need for the backtraces? Is there a better way to get the backtraces (it does not necessarily have to use gdb)?