I am using RStudio on an AWS EC2 instance to read a 35 GB CSV file from S3 and perform analyses. I chose an m4.4xlarge instance with 62 GB of memory, but I keep getting the following error while reading the data, before any analysis is performed: "Error: cannot allocate vector of size 33.0 Gb". The code I used is:
library("aws.s3")
Sys.setenv("AWS_ACCESS_KEY_ID" = "xxxxxxx",
"AWS_SECRET_ACCESS_KEY" = "yyyyyyy")
obj <-get_object("s3://xxx/yyy.csv")
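For context, this is the in-memory path I was planning to follow once get_object() returned (a rough sketch of the usual aws.s3 pattern as I understand it; the rawConnection() parsing step is my assumption, since the call never gets that far):

# get_object() returns the whole object body as a raw vector, so the full
# 35 GB file has to fit in RAM before parsing even starts.
obj <- get_object("s3://xxx/yyy.csv")

# Parse the in-memory bytes as CSV through a raw connection.
con <- rawConnection(obj, open = "r")
df <- read.csv(con)
close(con)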
When I used the following code instead,

aws.s3::s3read_using(read.csv, object = "s3://xxx/yyyy.csv")

the error message became:
Error in curl::curl_fetch_disk(url, x$path, handle = handle) :
Failed writing body (4400 != 16360)
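My understanding (an assumption on my part, not something I have verified in the package source) is that s3read_using() first downloads the object to a temporary file on disk and then calls the supplied reader on it, which would explain why the failure comes from curl_fetch_disk(). The explicit two-step version would look roughly like this:

# Assumed equivalent of the s3read_using() call above: download the object to
# a local temp file first, then read it. This needs roughly 35 GB of free disk
# space on the instance in addition to the memory read.csv() requires.
tmp <- tempfile(fileext = ".csv")
save_object("s3://xxx/yyyy.csv", file = tmp)
df <- read.csv(tmp)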
I am not familiar with Linux, and I used Louis Aslett's RStudio AMI (http://www.louisaslett.com/RStudio_AMI/). Is there any setting I should change? Thank you!
I suspect my question is related to the following two questions, but no clear answer has been posted to either:
Reading large JSON files from S3 in RStudio EC2 instance (Louis Aslett's AMI)
Trouble Uploading Large Files to RStudio using Louis Aslett's AMI on EC2