
I am trying to read a file of around 1 GB from an S3 bucket. My objective is to read the data from the file and send it over to another server.

At the moment, when I try to read a large file (1 GB), my system hangs up / the server crashes. I am able to console out the data of a 240 MB file with the following segment of code:

var bucketParams = {
    Bucket: "xyzBucket",
    Key: "input/something.zip"
};

router.get('/getData', function(req, res) {
    s3.getObject(bucketParams, function(err, data) {
        if (err) {
            console.log(err, err.stack); // an error occurred
        }
        else {
            console.log(data); // successful response
        }
    });
    // Send data over to another server
});

How would this work when it comes to reading large files from S3?

RRP

2 Answers


To answer the question of reading large files from S3, I would recommend using the Range header to get the object one part at a time:

https://docs.aws.amazon.com/AmazonS3/latest/API/RESTObjectGET.html

Getting it part by part will keep you from exceeding the limits of your framework and from excessive RAM consumption.

You can also leverage Range support to improve bandwidth utilization with multipart / multithreaded downloads.
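
For example, here is a minimal sketch of that approach using the AWS SDK for JavaScript v2. The 10 MB part size and the downloadInParts helper are assumptions for illustration; getObject does accept a Range parameter in the standard bytes=start-end form:

const AWS = require('aws-sdk');
const s3 = new AWS.S3();

const PART_SIZE = 10 * 1024 * 1024; // 10 MB per ranged GET (arbitrary choice)

// Hypothetical helper: fetch the object part by part, so only one
// part is held in memory at a time.
async function downloadInParts(params, onPart) {
    // Find out how large the object is before issuing ranged GETs
    const head = await s3.headObject(params).promise();
    const total = head.ContentLength;

    for (let start = 0; start < total; start += PART_SIZE) {
        const end = Math.min(start + PART_SIZE, total) - 1;
        const part = await s3.getObject({
            Bucket: params.Bucket,
            Key: params.Key,
            Range: 'bytes=' + start + '-' + end // standard HTTP Range syntax
        }).promise();
        await onPart(part.Body, start, end); // part.Body is a Buffer
    }
}

// Usage: forward each part instead of buffering the whole 1 GB object
downloadInParts(
    { Bucket: 'xyzBucket', Key: 'input/something.zip' },
    function(chunk) { /* send this chunk to the other server */ }
).catch(console.error);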

qkhanhpro

You are hitting V8's maximum string length limit, which was recently raised to 1 GB from 512 MB.

I'd bet the error you get is:

Invalid String Length

This is a non-configurable limit. Upping --max_old_space_size has no effect on it.

You should look into downloading, processing and sending the processed file as a stream to the other server.
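
For example, here is a minimal sketch of the streaming approach with the AWS SDK v2 and Node's built-in https module, assuming the Express router and bucketParams from the question; the target hostname, path, and headers are placeholders:

const AWS = require('aws-sdk');
const https = require('https');

const s3 = new AWS.S3();
var bucketParams = {
    Bucket: "xyzBucket",
    Key: "input/something.zip"
};

router.get('/getData', function(req, res) {
    // Outbound request to the other server (hostname/path are placeholders)
    var upload = https.request({
        hostname: 'other-server.example.com',
        path: '/upload',
        method: 'POST',
        headers: { 'Content-Type': 'application/zip' }
    }, function(uploadRes) {
        res.sendStatus(uploadRes.statusCode); // report the outcome to the caller
    });

    // Stream the object from S3 straight into the outbound request,
    // so the 1 GB file is never materialized as a single string/Buffer.
    s3.getObject(bucketParams)
        .createReadStream()
        .on('error', function(err) {
            console.log(err, err.stack);
            res.sendStatus(500);
        })
        .pipe(upload);
});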

nicholaswmin
  • Thank you!! I did come across that error. Would it be possible for you to show a code snippet/pseudocode of how the data could be piped over to another server? All the examples seem to pipe it to another file. – RRP Mar 07 '18 at 02:10
  • Just use `res.write` for each chunk you process. – nicholaswmin Mar 07 '18 at 02:11
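
A minimal sketch of what that suggestion looks like, assuming the same bucketParams as in the question and a placeholder transformChunk function for the per-chunk processing:

router.get('/getData', function(req, res) {
    var stream = s3.getObject(bucketParams).createReadStream();

    stream.on('data', function(chunk) {
        var processed = transformChunk(chunk); // placeholder for your processing
        res.write(processed); // write each processed chunk as it arrives
    });

    stream.on('end', function() {
        res.end(); // close the response once the whole object has streamed through
    });

    stream.on('error', function(err) {
        console.log(err, err.stack);
        res.end(); // headers may already be sent, so just terminate the response
    });
});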