
I'm writing a script that appends JSON to a write stream using Node's archiver package. archiver then tarballs and gzips the write stream to a local file on disk.

When I run this script, however, the process's memory consumption can reach 2GB! Without the `gzip: true` flag in archiver (that is, using archiver only to append to the stream and tar it), the script uses only about 300MB of memory.
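For reference, a simplified sketch of the kind of setup I'm describing (the path and JSON data are placeholders, not the real script):

```js
const fs = require('fs');
const archiver = require('archiver');

// Placeholder data standing in for the real JSON payloads
const records = [{ id: 1, value: 'a' }, { id: 2, value: 'b' }];

const output = fs.createWriteStream('./backup.tar.gz'); // placeholder path
const archive = archiver('tar', { gzip: true });        // dropping gzip: true keeps memory ~300MB

archive.pipe(output);

// Each JSON entry is appended as a named file inside the tarball
for (const record of records) {
  archive.append(JSON.stringify(record), { name: `record-${record.id}.json` });
}

archive.finalize();
```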

I worked around this by spawning a child process to gzip the tarball (a sketch follows the questions below), but I'd love to know two things:

  1. Why are archiver and, by extension, zlib moving the whole tarball into memory for compression?
  2. Is there a way to work around this without exec-ing to the shell?
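The workaround looks roughly like this (a sketch, not the real script; it assumes `gzip` is available on the PATH and the paths are placeholders):

```js
const { spawn } = require('child_process');
const fs = require('fs');
const archiver = require('archiver');

// Write an uncompressed tarball with archiver (no gzip: true)
const tarPath = './backup.tar'; // placeholder path
const output = fs.createWriteStream(tarPath);
const archive = archiver('tar');

archive.pipe(output);
archive.append(JSON.stringify({ hello: 'world' }), { name: 'data.json' });
archive.finalize();

// Once the tarball is fully written, gzip it in a child process
// instead of compressing in-process
output.on('close', () => {
  const gzip = spawn('gzip', [tarPath]);
  gzip.on('exit', (code) => {
    if (code !== 0) console.error(`gzip exited with code ${code}`);
  });
});
```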

Thanks!

  • How do you know that the whole tarball is in memory? Did you test to see if the memory usage is proportional to the size of the data being archived? – Mark Adler Feb 25 '23 at 20:41
  • @MarkAdler The unzipped archive is 1.7G, and the node process without setting the gzip flag in `archiver('tar', { gzip: true })` takes around 350M. Together that's about the 2.1G of memory the process uses with gzip enabled. – brendandevwork Feb 26 '23 at 17:51
  • That evidence is circumstantial. You need to try an archive twice the size, and see if the additional memory consumption approximately doubles. – Mark Adler Feb 26 '23 at 18:33
  • I tried it with a tar archive of 3.7G and the memory consumption went up proportionally. Again, when I remove the `gzip` flag from `archiver` and spawn a child process to gzip in the shell, the memory consumption drops to roughly a third. Do you know why this might be? – brendandevwork Feb 27 '23 at 18:16
  • That is odd. The zlib library used by archiver is designed to be streaming, and there is nothing obviously wrong in the [archiver code](https://github.com/archiverjs/node-archiver/blob/master/lib/plugins/tar.js). You should file an issue on the [archiver github site](https://github.com/archiverjs/node-archiver/issues). – Mark Adler Feb 27 '23 at 18:58
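For anyone hitting the same thing: since the zlib bindings are streaming, one way to stay in-process (avoiding exec) might be to pipe an uncompressed tar archive through Node's built-in `zlib.createGzip()` transform instead of using archiver's gzip option. This is a sketch with placeholder paths; it is untested whether it avoids the memory growth described above.

```js
const fs = require('fs');
const zlib = require('zlib');
const archiver = require('archiver');

const output = fs.createWriteStream('./backup.tar.gz'); // placeholder path
const gzip = zlib.createGzip();                          // streaming gzip transform
const archive = archiver('tar');                         // note: no gzip: true here

// archive (tar bytes) -> gzip transform -> file on disk
archive.pipe(gzip).pipe(output);

archive.append(JSON.stringify({ hello: 'world' }), { name: 'data.json' });
archive.finalize();
```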

0 Answers