
I'm writing a script that appends JSON to a write stream using Node's archiver package. archiver then tarballs and gzips the write stream to a local file on disk.

When I run this script, however, the process's memory consumption can reach 2GB! Without the `gzip: true` flag in archiver (that is, using archiver only to append to the stream and tar it), the script uses only about 300MB of memory.
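For reference, a simplified sketch of the kind of setup I'm describing (the path and JSON data are placeholders, not the real script):

```js
const fs = require('fs');
const archiver = require('archiver');

// Placeholder data standing in for the real JSON payloads
const records = [{ id: 1, value: 'a' }, { id: 2, value: 'b' }];

const output = fs.createWriteStream('./backup.tar.gz'); // placeholder path
const archive = archiver('tar', { gzip: true });        // dropping gzip: true keeps memory ~300MB

archive.pipe(output);

// Each JSON entry is appended as a named file inside the tarball
for (const record of records) {
  archive.append(JSON.stringify(record), { name: `record-${record.id}.json` });
}

archive.finalize();
```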

I worked around this by spawning a child process to gzip the tarball (a sketch follows the questions below), but I'd love to know two things:

  1. Why are archiver and, by extension, zlib moving the whole tarball into memory for compression?
  2. Is there a way to work around this without exec-ing to the shell?
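The workaround looks roughly like this (a sketch, not the real script; it assumes `gzip` is available on the PATH and the paths are placeholders):

```js
const { spawn } = require('child_process');
const fs = require('fs');
const archiver = require('archiver');

// Write an uncompressed tarball with archiver (no gzip: true)
const tarPath = './backup.tar'; // placeholder path
const output = fs.createWriteStream(tarPath);
const archive = archiver('tar');

archive.pipe(output);
archive.append(JSON.stringify({ hello: 'world' }), { name: 'data.json' });
archive.finalize();

// Once the tarball is fully written, gzip it in a child process
// instead of compressing in-process
output.on('close', () => {
  const gzip = spawn('gzip', [tarPath]);
  gzip.on('exit', (code) => {
    if (code !== 0) console.error(`gzip exited with code ${code}`);
  });
});
```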

Thanks!

  • How do you know that the whole tarball is in memory? Did you test to see if the memory usage is proportional to the size of the data being archived? – Mark Adler Feb 25 '23 at 20:41
  • @MarkAdler The unzipped archive is 1.7G, and the node process without setting the gzip flag in `archiver('tar', { gzip: true })` takes around 350M. Together that's about the 2.1G of memory the process uses with gzip enabled. – brendandevwork Feb 26 '23 at 17:51
  • That evidence is circumstantial. You need to try an archive twice the size, and see if the additional memory consumption approximately doubles. – Mark Adler Feb 26 '23 at 18:33
  • I tried it with a tar archive of 3.7G and the memory consumption went up proportionally. Again, when I remove the `gzip` flag from `archiver` and spawn a child process to gzip in the shell, the memory consumption drops to roughly a third. Do you know why this might be? – brendandevwork Feb 27 '23 at 18:16
  • That is odd. The zlib library used by archiver is designed to be streaming, and there is nothing obviously wrong in the [archiver code](https://github.com/archiverjs/node-archiver/blob/master/lib/plugins/tar.js). You should file an issue on the [archiver github site](https://github.com/archiverjs/node-archiver/issues). – Mark Adler Feb 27 '23 at 18:58
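For anyone hitting the same thing: since the zlib bindings are streaming, one way to stay in-process (avoiding exec) might be to pipe an uncompressed tar archive through Node's built-in `zlib.createGzip()` transform instead of using archiver's gzip option. This is a sketch with placeholder paths; it is untested whether it avoids the memory growth described above.

```js
const fs = require('fs');
const zlib = require('zlib');
const archiver = require('archiver');

const output = fs.createWriteStream('./backup.tar.gz'); // placeholder path
const gzip = zlib.createGzip();                          // streaming gzip transform
const archive = archiver('tar');                         // note: no gzip: true here

// archive (tar bytes) -> gzip transform -> file on disk
archive.pipe(gzip).pipe(output);

archive.append(JSON.stringify({ hello: 'world' }), { name: 'data.json' });
archive.finalize();
```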

0 Answers