I have an incredibly large JSON file (several gigabytes, too large to fit in a JS string) that I'm trying to gzip and upload to S3.
Currently I have the following code:
import { stringifyStream } from '@discoveryjs/json-ext';
import zlib from 'zlib';
import { Readable } from 'stream';

export async function safeStringify({
  content,
  gzipCompress,
}: {
  content: any;
  gzipCompress?: boolean;
}) {
  // stringifyStream emits the JSON text in chunks, so the full document
  // never has to exist as a single JS string.
  let json: Readable | null = stringifyStream(content);
  if (gzipCompress && json !== null) {
    json = json.pipe(zlib.createGzip());
  }
  return json;
}
const stringifiedStream = await safeStringify({ content, gzipCompress: true });
const output = await myAwsClient.upload({
  Body: stringifiedStream,
  Bucket: 'my-bucket',
  Key: 'my-key',
}).promise();
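For debugging, one thing I can do is swap S3 out for a local file and inspect the result on disk, roughly like this (a sketch only; the output path is just a placeholder):

```
import { createWriteStream } from 'fs';
import { pipeline } from 'stream/promises';

// Debug-only sketch: write the gzipped output to a local file instead of S3.
const debugStream = await safeStringify({ content, gzipCompress: true });
if (debugStream) {
  await pipeline(debugStream, createWriteStream('/tmp/debug-output.json.gz'));
}
```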
The upload works with small JSON (like `{"hello": "world"}`) but fails on my super large JSON.
I'm wondering if there are any obvious mistakes I'm making here, things to try to avoid this null upload, or even tips to help debug, since I've tried a few things to no avail:
- I tried `.read(100)`'ing the output of `stringifiedStream` and got `null`.
- I tried waiting for the readable to fire a `'readable'` event by adding a `.on('readable', () => stringifiedStream.read(100))` and still got `null`.
- Sometimes I'll get a second `'readable'` event fire where it is possible to pull data, which is kind of weird (I'll actually get up to 3), so any advice on how to poll from the readable when it actually has data would be appreciated too; a simplified version of the polling I'm aiming for is sketched right after this list.
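For reference, the polling pattern I'm trying to move to (based on my reading of the Node stream docs, which say `read(size)` returns `null` when fewer than `size` bytes are buffered and the stream hasn't ended) looks roughly like this; the chunk size of 100 is arbitrary:

```
if (stringifiedStream) {
  stringifiedStream.on('readable', () => {
    // Drain whatever is buffered each time 'readable' fires, instead of
    // reading exactly once: read(100) comes back null when fewer than
    // 100 bytes are currently buffered, so loop until it returns null
    // and then wait for the next 'readable' event.
    let chunk: Buffer | null;
    while ((chunk = stringifiedStream.read(100)) !== null) {
      console.log(`pulled ${chunk.length} bytes`);
    }
  });
  stringifiedStream.on('end', () => console.log('stream ended'));
}
```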
Thanks!
Update: I'm seeing now that multiple `'readable'` events get fired; in the first event, when I call `.read(100)` I get a result of `null`, but in later events I get data. I think this might be the root cause here.
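In case it's useful, this is the kind of self-contained check I can run to confirm the stream does eventually produce data end to end (a debug-only sketch; it fully consumes the stream, so the real upload would need a fresh one):

```
import { Readable } from 'stream';

// Debug-only sketch: drain the whole stream with async iteration and count
// the gzipped bytes, to confirm data is actually being produced.
async function countBytes(stream: Readable): Promise<number> {
  let total = 0;
  for await (const chunk of stream) {
    total += (chunk as Buffer).length;
  }
  return total;
}

const debugStream = await safeStringify({ content, gzipCompress: true });
if (debugStream) {
  console.log('total gzipped bytes:', await countBytes(debugStream));
}
```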