I am running rake test
in a Docker container, where I need to limit the container's max-memory. This has been achieved by passing the -m 1800m
flag to docker run
command. Unfortunately, the rake
process seems to keep growing wrt memory usage, and is finally killed mid-way due to some sort of OOM killer. I've tried putting the following in my test_helper.rb
file...
class ActiveSupport::TestCase
teardown :perform_gc
def perform_gc
puts "==> Before GC: #{ObjectSpace.count_objects}"
GC.start(full_mark: true)
puts "==> Before GC: #{ObjectSpace.count_objects}"
end
end
...and am getting this output during my test run.
==> Before GC: {:TOTAL=>1067513, :FREE=>798, :T_OBJECT=>146231, :T_CLASS=>13450, :T_MODULE=>3350, :T_FLOAT=>10, :T_STRING=>448040, :T_REGEXP =>3951, :T_ARRAY=>156744, :T_HASH=>32722, :T_STRUCT=>2162, :T_BIGNUM=>8, :T_FILE=>266, :T_DATA=>113834, :T_MATCH=>20339, :T_COMPLEX=>1, :T_RATIONAL=>59, :T_NODE=>115807, :T_ICLASS=>9741}
==> After GC: {:TOTAL=>1067920, :FREE=>304019, :T_OBJECT=>92774, :T_CLASS=>13431, :T_MODULE=>3350, :T_FLOAT=>10, :T_STRING=>328707, :T_REGEXP=>3751, :T_ARRAY=>107523, :T_HASH=>25206, :T_STRUCT=>2023, :T_BIGNUM=>7, :T_FILE=>11, :T_DATA=>112605, :T_MATCH=>11, :T_COMPLEX=>1, :T_RATIONAL=>59, :T_NODE=>64713, :T_ICLASS=>9719}
... test result of test #1 ....
==> Before GC: {:TOTAL=>1598233, :FREE=>338182, :T_OBJECT=>111209, :T_CLASS=>15057, :T_MODULE=>3481, :T_FLOAT=>10, :T_STRING=>570289, :T_REGEXP=>4836, :T_ARRAY=>219746, :T_HASH=>54358, :T_STRUCT=>12047, :T_BIGNUM=>8, :T_FILE=>12, :T_DATA=>138031, :T_MATCH=>2600, :T_COMPLEX=>1, :T_RATIONAL=>389, :T_NODE=>117993, :T_ICLASS=>9984}
==> After GC: {:TOTAL=>1598233, :FREE=>653201, :T_OBJECT=>103708, :T_CLASS=>14275, :T_MODULE=>3426, :T_FLOAT=>10, :T_STRING=>418825, :T_REGEXP=>3773, :T_ARRAY=>137405, :T_HASH=>39734, :T_STRUCT=>7444, :T_BIGNUM=>7, :T_FILE=>12, :T_DATA=>128923, :T_MATCH=>12, :T_COMPLEX=>1, :T_RATIONAL=>59, :T_NODE=>77590, :T_ICLASS=>9828}
... test result of test #2 ....
==> Before GC: {:TOTAL=>1598233, :FREE=>269630, :T_OBJECT=>114406, :T_CLASS=>14815, :T_MODULE=>3470, :T_FLOAT=>10, :T_STRING=>611637, :T_REGEXP=>4352, :T_ARRAY=>248693, :T_HASH=>58757, :T_STRUCT=>12208, :T_BIGNUM=>8, :T_FILE=>25, :T_DATA=>139671, :T_MATCH=>2288, :T_COMPLEX=>1, :T_RATIONAL=>83, :T_NODE=>108278, :T_ICLASS=>9901}
==> After GC: {:TOTAL=>1598233, :FREE=>635044, :T_OBJECT=>105028, :T_CLASS=>14358, :T_MODULE=>3427, :T_FLOAT=>10, :T_STRING=>429137, :T_REGEXP=>3775, :T_ARRAY=>140654, :T_HASH=>41626, :T_STRUCT=>8085, :T_BIGNUM=>7, :T_FILE=>12, :T_DATA=>129507, :T_MATCH=>15, :T_COMPLEX=>1, :T_RATIONAL=>59, :T_NODE=>77631, :T_ICLASS=>9857}
... test result of test #3 ....
... and so on ....
The value of ObjectSpace.count_objects[:TOTAL]
is constantly growing after each test!
1067920 (after GC is run at the end of 1st test)
5250321 (after GC is run at the end of 18th test)
8631313 (after GC is run at the end of last, but ten, test)
8631313 (this number remains the same for the next 10 tests)
8631313 (after GC is run at the end of last, but three, test)
8631313 (after GC is run at the end of last, but two, test)
8631313 (after GC is run at the end of last, but one, test)
8631721 (after GC is run at the end of last test, after which rake aborts)
I'm also monitoring the process's memory usage via docker stats
and ps aux --sort -rss
. The memory consumption stabilises around the 1.77 GB mark, which is dangerously close to the 1800 mb
limit set for the container. This is also validated by the value of ObjectSpace.count_object[:TOTAL]
not changing for the last 10-15 test before rake is killed/aborted.
The error message in docker logs
at the time of crash/abort/kill is:
PANIC: could not write to log file 00000001000000000000000A at offset 14819328, length 16384: Cannot allocate memory
LOG: unexpected EOF on client connection with an open transaction
LOG: WAL writer process (PID 262) was terminated by signal 6: Aborted
LOG: terminating any other active server processes
WARNING: terminating connection because of crash of another server process
DETAIL: The postmaster has commanded this server process to roll back the current transaction and exit, because another server process exited abnormally and possibly corrupted shared memory.
HINT: In a moment you should be able to reconnect to the database and repeat your command.
WARNING: terminating connection because of crash of another server process
DETAIL: The postmaster has commanded this server process to roll back the current transaction and exit, because another server process exited abnormally and possibly corrupted shared memory.
HINT: In a moment you should be able to reconnect to the database and repeat your command.
LOG: all server processes terminated; reinitializing
LOG: database system was interrupted; last known up at 2018-02-26 06:35:38 UTC
How do I get to the bottom of this and ensure that my tests run in constant memory?