I have a node (v0.10.45) app that has been working well for a long time without problems (no memory leak, code logic clean).
It recently started behaving weirdly, pausing in arbitrary locations. The pauses are apparently NOT connected to any particular line of code/data. So while the pauses happen a lot, they happen at different places, making them very hard to reproduce exactly!
Some requests now take >10,000ms and time out because of this, even though normal response time is <50ms. The app is unusable.
The app creates a lot of small objects, so my hunch was these the random spikes are somehow connected to garbage collection. So I added --trace_gc --trace_gc_verbose
to node CLI params, and see this around the laggy bits:
[24379] All spaces, used: 1328265 KB, available: 73223 KB, committed: 1419167 KB
[24379] Total time spent in GC : 27640 ms
[24379] 283287 ms: Mark-sweep 1363.9 (1458.1) -> 1335.8 (1458.1) MB, 4208 ms [allocation failure] [promotion limit reached].
[24379] Memory allocator, used: 1493104 KB, available: 6032 KB
[24379] New space, used: 1 KB, available: 16382 KB, committed: 32768 KB
[24379] Old pointers, used: 693370 KB, available: 55840 KB, committed: 749270 KB
[24379] Old data space, used: 482911 KB, available: 11581 KB, committed: 494997 KB
[24379] Code space, used: 775 KB, available: 497 KB, committed: 1992 KB
[24379] Map space, used: 333 KB, available: 802 KB, committed: 1135 KB
[24379] Cell space, used: 25 KB, available: 102 KB, committed: 128 KB
[24379] Large object space, used: 190467 KB, available: 4991 KB, committed: 191377 KB
[24379] All spaces, used: 1367884 KB, available: 85208 KB, committed: 1471668 KB
[24379] Total time spent in GC : 31848 ms
[24379] 287770 ms: Mark-sweep 1363.6 (1470.2) -> 1345.4 (1470.2) MB, 4313 ms [Runtime::PerformGC] [scavenge might not succeed].
[24379] Memory allocator, used: 1505472 KB, available: 0 KB
[24379] New space, used: 1 KB, available: 16382 KB, committed: 32768 KB
[24379] Old pointers, used: 690545 KB, available: 58281 KB, committed: 749270 KB
[24379] Old data space, used: 483262 KB, available: 11367 KB, committed: 494997 KB
[24379] Code space, used: 775 KB, available: 1216 KB, committed: 1992 KB
[24379] Map space, used: 333 KB, available: 802 KB, committed: 1135 KB
[24379] Cell space, used: 25 KB, available: 102 KB, committed: 128 KB
[24379] Large object space, used: 202755 KB, available: 0 KB, committed: 203730 KB
[24379] All spaces, used: 1377698 KB, available: 88154 KB, committed: 1484021 KB
[24379] Total time spent in GC : 36161 ms
[24379] 292175 ms: Mark-sweep 1373.2 (1482.3) -> 1354.9 (1482.3) MB, 4238 ms [Runtime::PerformGC] [scavenge might not succeed].
[24379] Memory allocator, used: 1517840 KB, available: 0 KB
[24379] New space, used: 1 KB, available: 16382 KB, committed: 32768 KB
[24379] Old pointers, used: 687696 KB, available: 61354 KB, committed: 749270 KB
[24379] Old data space, used: 483549 KB, available: 11228 KB, committed: 494997 KB
[24379] Code space, used: 759 KB, available: 1232 KB, committed: 1992 KB
[24379] Map space, used: 333 KB, available: 802 KB, committed: 1135 KB
[24379] Cell space, used: 25 KB, available: 102 KB, committed: 128 KB
[24379] Large object space, used: 215043 KB, available: 0 KB, committed: 216083 KB
[24379] All spaces, used: 1387407 KB, available: 91103 KB, committed: 1496374 KB
[24379] Total time spent in GC : 40399 ms
[24379] 296685 ms: Mark-sweep 1370.7 (1482.3) -> 1355.4 (1482.3) MB, 4359 ms [Runtime::PerformGC] [scavenge might not succeed].
[24379] Memory allocator, used: 1517840 KB, available: 0 KB
[24379] New space, used: 1 KB, available: 16382 KB, committed: 32768 KB
[24379] Old pointers, used: 687918 KB, available: 60586 KB, committed: 749270 KB
[24379] Old data space, used: 483840 KB, available: 11097 KB, committed: 494997 KB
[24379] Code space, used: 764 KB, available: 1227 KB, committed: 1992 KB
[24379] Map space, used: 333 KB, available: 802 KB, committed: 1135 KB
[24379] Cell space, used: 25 KB, available: 102 KB, committed: 128 KB
[24379] Large object space, used: 215043 KB, available: 0 KB, committed: 216083 KB
[24379] All spaces, used: 1387926 KB, available: 90199 KB, committed: 1496374 KB
[24379] Total time spent in GC : 44758 ms
[24379] 301230 ms: Mark-sweep 1383.1 (1494.3) -> 1364.9 (1494.3) MB, 4375 ms [Runtime::PerformGC] [scavenge might not succeed].
Any idea how to read this?
What is the problem, how to fix it?