
This question is about memory allocation. I am using Array in this example because it reproduces the behavior of a more complex problem involving database code in a third-party library. I need to understand why memory allocation behaves this way in a "node in container" environment.

I am running node.js (12.18.4, lts-stretch) in a Docker Desktop Community (2.3.0.5, Windows) container. The container has a limit of 2GB of memory. (However, I see the same behavior regardless of how much memory I assign to Docker.)

In node, this statement works as expected:

```js
var a = Array(32 * 1024 * 1024).fill(0);
```

However, when this statement executes, node starts allocating memory without limit, as if it were stuck in an infinite loop:

```js
var a = Array(32 * 1024 * 1024 + 1).fill(0);
```

I do not see the above behavior when running node.exe from a Windows PowerShell prompt -- only when running a node container (https://hub.docker.com/_/node).

Why does the memory allocation fail to work correctly at 32MB + 1 elements when running node in the container?
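
To reproduce and roughly measure this, here is a minimal script (`process.memoryUsage()` is a standard Node API; exact timings and numbers will vary by machine):

```js
// Time the fill and report heap usage afterwards.
const n = 32 * 1024 * 1024 + 1; // one element past the threshold

console.time('fill');
var a = Array(n).fill(0);
console.timeEnd('fill');

const mb = process.memoryUsage().heapUsed / (1024 * 1024);
console.log(`heapUsed: ${mb.toFixed(0)} MB`);
```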

  • Hm, does this also happen if you use `new Array` rather than `Array`? (the general JS object construction algorithm differs depending on whether the `new` keyword is used or not) – Mike 'Pomax' Kamermans Sep 29 '20 at 18:29
  • Same behavior with or without the new keyword. – disentropic Sep 29 '20 at 18:33
  • Also, `uname -a` is: Linux 38d3a147695e 4.19.76-linuxkit #1 SMP Tue May 26 11:42:35 UTC 2020 x86_64 GNU/Linux – disentropic Sep 29 '20 at 18:35
  • just to confirm, when you test this on Windows, that's with the same version? (e.g. you used [nvm for windows](https://github.com/coreybutler/nvm-windows) to ensure exact version matching?) – Mike 'Pomax' Kamermans Sep 29 '20 at 18:59
  • (Also, there is no v12.8.4, did you mean v12.18.4?) – Mike 'Pomax' Kamermans Sep 29 '20 at 19:02
  • I can reproduce this on Windows with 12.18.4 without issue - you're creating a filled Array of size `2**25`, which is right at the limit of what Node can allocate for an untyped array: using `2**26` will crash V8, so I suspect that `2**25` is a magic threshold above which Node suddenly needs to do a hell of a lot of memory management, because it never expected to need a contiguous stretch of untyped memory that large. (So whoever's library this is, you probably want to file an issue for them to start using typed arrays instead.) – Mike 'Pomax' Kamermans Sep 29 '20 at 19:10
  • 12.18.4 now fixed in question. Same versions compared. Thanks for the help - confirmation of the 2^25 threshold in the answer below gives me what I need for a high-confidence workaround. – disentropic Sep 29 '20 at 20:31

1 Answer


V8 developer here. In short: Mike 'Pomax' Kamermans' guess is spot on.

`32 * 1024 * 1024 == 2**25` is the limit up to which `new Array(n)` will allocate a contiguous ("C-like") backing store of length `n`. Filling such a backing store with zeroes is relatively fast, and requires no further allocations.

With a larger length, V8 will create the array in "dictionary mode"; i.e. its backing store will be an (initially empty) dictionary. Filling this dictionary is slower, firstly because dictionary accesses are a bit slower, and secondly because the dictionary's backing store needs to be grown a couple of times, which means copying over all existing elements. Frankly, I'm surprised that the array stays in dictionary mode; in theory it should switch to flat-array mode when it reaches a certain density. Interestingly, when I run `var a = new Array(32 * 1024 * 1024 + 1); for (var i = 0; i < a.length; i++) a[i] = 0;` then that's what happens. Looks like the implementation of `fill` could be improved there; on the other hand I'm not sure how relevant this case is in practice...
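
To see the elements-kind difference directly, one option is V8's test intrinsics behind `--allow-natives-syntax` (a debugging sketch; `%HasDictionaryElements` is an internal V8 helper, not a stable API):

```js
// Run with: node --allow-natives-syntax elements.js
var a = new Array(32 * 1024 * 1024 + 1).fill(0);
console.log(%HasDictionaryElements(a)); // true: fill() leaves the array in dictionary mode

var b = new Array(32 * 1024 * 1024 + 1);
for (var i = 0; i < b.length; i++) b[i] = 0;
console.log(%HasDictionaryElements(b)); // false: the element-by-element loop transitions
                                        // back to a flat backing store
```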

Side notes:

> at 32MB + 1 elements

we're not talking about 32 MB here. Each entry takes 32 or 64 bits (depending on platform and pointer compression), so it's either 128 or 256 MB for 2**25 entries.
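
In concrete numbers (a quick sanity check of those sizes):

```js
const n = 2 ** 25;                     // 33,554,432 elements
console.log(n * 4 / 2 ** 20 + ' MiB'); // 128 MiB at 4 bytes/entry (pointer compression)
console.log(n * 8 / 2 ** 20 + ' MiB'); // 256 MiB at 8 bytes/entry (full 64-bit pointers)
```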

> node starts allocating memory without limit, as if it were stuck in an infinite loop

The operation actually does terminate after a while (about 7 seconds on my machine), and memory allocation peaks at a little over 900 MB. The reason is that if you actually use all entries, then dictionary mode is significantly less space-efficient than a flat backing store, because each entry needs to store its index, its attributes, and the value itself, and dictionaries by their nature need some unused capacity to avoid too many hash collisions.
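
A back-of-the-envelope check on that figure (the three-words-per-entry layout is an assumption about V8 internals, not a documented guarantee):

```js
const entries = 2 ** 25 + 1;
const bytesPerEntry = 3 * 8; // index + attributes + value, one 8-byte word each (assumed)
console.log((entries * bytesPerEntry / 2 ** 20).toFixed(0) + ' MiB');
// ≈ 768 MiB before the dictionary's unused capacity -- in the right
// ballpark for the ~900 MB peak observed above.
```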

> I am using Array in this example because it replicates the behavior of another, more complex problem

Given how specific the behavior seen here is to arrays, I do wonder how accurately this simplified case reflects the behavior you're seeing elsewhere. If your real code does not allocate and fill huge arrays, then whatever is going on there is probably something else.
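
That said, if the real code does turn out to need a huge zero-filled buffer, the typed-array route suggested in the comments sidesteps the elements-kind machinery entirely (a sketch; `Float64Array` is just one possible element type):

```js
// Typed arrays are backed by a single out-of-line ArrayBuffer rather than
// V8's element-kind storage, and they are zero-initialized on allocation,
// so no fill() is needed.
const a = new Float64Array(32 * 1024 * 1024 + 1);
```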

jmrk
  • Thanks - understanding there is a threshold at 2^25 helps. I used MB as shorthand for (* 1024 * 1024); by "elements" I was referring to data elements, not bytes. I am guessing I am seeing the memory exhaustion in Docker because of inefficiencies in the virtualization. – disentropic Sep 29 '20 at 20:25