
I have been trying to upload something small to S3 all day: about 20k files in 500 directories, totaling about 3 GB. That seems absolutely reasonable for a service called Simple Storage Service. To other destinations I can upload at about 500 KB/s to 1 MB/s on average (between 1.8 and 3.6 GB/h). To S3, after a full day of trying, my aggregate rate has been dismal (something like 100 MB/h).

I have tried:

  • the S3 web console, with a variety of browsers on several operating systems
  • boto, using a variety of scripts that I wrote or found online (mainly here on SO).

My problems, which I hope you can help me diagnose, are the following:

  • Dragging and dropping into the S3 console takes about an hour just to count the 20k files. Why? Unless I can solve this, the web console is mostly useless to me.
  • The upload itself is extremely slow, seldom faster than 100 KB/s.
  • After uploading all day I noticed a simple problem with the filenames. Not wanting to spend all night uploading again, I used the script from "Amazon S3 boto: How do you rename a file in a bucket?", which everyone claims works really fast. It manages to rename about one 200 KB file every 2-3 seconds. Why?
  • After uploading, making all the files public (using the web console) has taken about 4 hours and still has not finished.

It is really frustrating; there must be something I am doing wrong. I expect everything to work about 10x faster, and it doesn't. I've read that S3 runs faster if I split the files, and I've read that the region (I'm in NYC) is really important. What change will give me the biggest increase in upload speed?

carlosdc
  • You should look into [IOPS EBS volumes](http://aws.amazon.com/ebs/details/) – Nir Alfasi Jul 22 '14 at 01:04
  • @alfasin that is not relevant to S3. – Michael - sqlbot Jul 22 '14 at 01:12
  • @carlosdc, in which region did you create the bucket? Also, there have been some long-standing allegations of "unexplained" throttling of AWS traffic by one of the major broadband (and, ahem, video) providers in the northeastern US. S3, however, is anything but slow. – Michael - sqlbot Jul 22 '14 at 01:20
  • @Michael-sqlbot I created the bucket a while back and it was US Standard. My ISP is TWC. – carlosdc Jul 22 '14 at 01:27
  • @Michael-sqlbot S3 was not designed for high throughput. IOPS EBS was designed for intensive I/O - which is the OP's requirement. – Nir Alfasi Jul 22 '14 at 01:29
  • What OP describes isn't even approaching high throughput. For me, S3 transfer speeds in excess of 10-20 Mbit/s on large uploads and downloads are not uncommon, and thousands of PUT/HEAD/GET requests per hour from a single (multithreaded) client are also not uncommon on S3. It sounds like connectivity, or even DNS, is the slowdown here, not the nature of S3's design. – Michael - sqlbot Jul 22 '14 at 01:45
  • @carlosdc You mentioned `boto`; which version are you using? – emesday Jul 22 '14 at 01:50
  • @mskimm: I'm using version 2.29.1 – carlosdc Jul 22 '14 at 01:57
  • Have you tried `s3cmd` with `--acl-public`? Uploading or copying many small files has per-file overhead. If the files do not need to be served individually, I recommend archiving them first and then uploading the archive. – emesday Jul 22 '14 at 02:04
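
The comments above raise two ways to cut per-file overhead: pack the files into one archive before uploading, or issue many requests in parallel from a multithreaded client. A minimal sketch of both follows, using only the Python standard library. The function names are hypothetical, and `upload_one` is a placeholder for whatever per-file upload call you already have (for example, a boto wrapper); it is not part of any library.

```python
import os
import tarfile
from concurrent.futures import ThreadPoolExecutor


def archive_directory(src_dir, archive_path):
    """Pack src_dir into one gzip-compressed tarball (a single object to upload)."""
    with tarfile.open(archive_path, "w:gz") as tar:
        # arcname keeps paths inside the archive relative to the directory name.
        tar.add(src_dir, arcname=os.path.basename(os.path.normpath(src_dir)))
    return archive_path


def list_files(src_dir):
    """Return every regular file under src_dir, sorted for a stable order."""
    found = []
    for root, _dirs, files in os.walk(src_dir):
        for name in files:
            found.append(os.path.join(root, name))
    return sorted(found)


def upload_concurrently(paths, upload_one, workers=16):
    """Call upload_one(path) for each path on a thread pool.

    upload_one is an assumption: any callable that uploads a single file.
    Results are returned in the same order as paths.
    """
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(upload_one, paths))
```

Threads suit this case because each upload spends most of its time waiting on the network. The worker count of 16 is an arbitrary starting point, not a tuned value.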

1 Answer


Maybe the slow upload can be fixed by changing the AWS region of the bucket.

I just figured out what the problem was in my case. Upload duration for a 35 MB file:

  • Oregon, US (us-west-2): 5-6 minutes
  • Frankfurt, Germany (eu-central-1): 1 minute! (that's about the maximum of my connection)

I'm based in Vienna, not in the US, so check which AWS region your bucket is in.
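
To put the timings above in terms of throughput, here is a small worked calculation. It uses only the figures quoted in this answer (35 MB, 5-6 minutes, 1 minute); the helper name is hypothetical.

```python
def throughput_mb_per_s(size_mb, seconds):
    """Average transfer rate in MB/s for a transfer of size_mb taking `seconds`."""
    return size_mb / seconds


SIZE_MB = 35  # upload size reported in the answer

oregon_slow = throughput_mb_per_s(SIZE_MB, 6 * 60)  # 6-minute upload
oregon_fast = throughput_mb_per_s(SIZE_MB, 5 * 60)  # 5-minute upload
frankfurt = throughput_mb_per_s(SIZE_MB, 1 * 60)    # 1-minute upload

print(f"Oregon:    {oregon_slow:.2f} to {oregon_fast:.2f} MB/s")
print(f"Frankfurt: {frankfurt:.2f} MB/s")
print(f"Speedup:   {frankfurt / oregon_fast:.0f}x to {frankfurt / oregon_slow:.0f}x")
```

In other words, the nearer region was roughly five to six times faster for this single measurement.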

electrobabe