I'm partway through uploading about 200,000 files (each at most ~1 MB) to an S3 bucket from an EC2 instance (both in Europe West).
From monitoring the EC2 instance with CloudWatch (looking at the NetworkOut metric), there seems to be a drop-off in the upload throughput over time:
I'm uploading the files in several tranches and the drop-off seems consistent, usually after four or five hours (but it sometimes occurs more quickly).
The files are uploaded with a Python script, which:
- Downloads a .zip from a third party server
- Extracts about 25 files from the .zip and gzips each file
- Uploads the .gz files to the bucket
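To make the steps above concrete, here is a minimal sketch of the extract, gzip and upload stages with per-stage timing, which is one way to see whether a slowdown comes from compression (CPU) or from the upload itself. The names `process_archive` and `upload` are hypothetical, not taken from the actual script; the download step is left out and the archive is assumed to already be in memory as bytes.

```python
import gzip
import io
import time
import zipfile
from typing import Callable, Dict


def process_archive(
    zip_bytes: bytes, upload: Callable[[str, bytes], None]
) -> Dict[str, float]:
    """Gzip every file in the archive and pass it to `upload`.

    Returns the total seconds spent in each stage, so a slowdown can be
    attributed to extraction/compression or to the upload.
    """
    timings = {"extract_gzip": 0.0, "upload": 0.0}
    with zipfile.ZipFile(io.BytesIO(zip_bytes)) as archive:
        for info in archive.infolist():
            if info.is_dir():
                continue  # skip directory entries, only files are uploaded

            # Stage 1: read the member out of the zip and gzip it in memory.
            start = time.perf_counter()
            compressed = gzip.compress(archive.read(info))
            timings["extract_gzip"] += time.perf_counter() - start

            # Stage 2: hand the compressed bytes to the caller's uploader.
            start = time.perf_counter()
            upload(info.filename + ".gz", compressed)
            timings["upload"] += time.perf_counter() - start
    return timings
```

With boto3, `upload` could be something like `lambda key, data: s3.put_object(Bucket=bucket, Key=key, Body=data)`, where `s3 = boto3.client("s3")` is created once and reused. Logging the returned timings for each archive alongside the CloudWatch charts should show which stage degrades after a few hours.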
I've tried two ways of uploading the .gz files...
- Sequentially, using boto3:
boto3.client("s3").upload_file("file.gz", bucket, "file.gz")
- Running the AWS CLI as a subprocess to upload 25 .gz files at a time
...but I saw the same drop-off with each method.
What could be causing this? Or what information should I collect to debug it?
Edit
Here's a chart for the same period, showing the BurstBalance metric (the EC2 instance is a t2.small):
Here's CPUCreditBalance: