We are using AWS Glue to process big data workloads. I have a requirement wherein around 500,000 records are processed in Glue using G.2X workers. We have partitioned our S3 destination bucket with certain prefixes where this data has to be saved from Glue. I tried a normal boto3 put_object for each record and got the following error:

"AmazonS3Exception: Please reduce your request rate. (Service: Amazon S3;
            Status Code: 503; Error Code: SlowDown)"

This AWS doc suggests using S3DistCp from EMR to resolve this error: https://aws.amazon.com/premiumsupport/knowledge-center/s3-troubleshoot-503-cross-region/. But I want to know whether there is a way to do this from AWS Glue to S3 using S3DistCp without decreasing the number of executors or slowing down the upload process. Note that I have between 1,000 and 70,000 files to upload per prefix on average.

Vijeth Kashyap

0 Answers