
We run a service whereby we provide the 'work' of a web-based (PHP) app, while the images, JS, CSS etc. are hosted on the clients' own Amazon S3 accounts.

This is so that they get a consolidated bill for their S3 usage (the app makes use of S3 itself) and so that we don't have to subsidise their bandwidth (there is no monthly charge, so as usage grows it would just cost us more and more money).

We have over 1,000 customers right now, and pushing an update to them (an update to the JS, for instance) takes an incredibly long time. This number will grow exponentially over the coming months.

I had considered a source S3 bucket owned by us, and then issuing COPY requests between our bucket and theirs instead of uploads. This would still take time, but it would be MUCH faster than uploading as we do now. However, I have heard there is no way to copy between two wholly separate S3 accounts without using a go-between server (which obviously defeats the object and would actually double the time).

Is that true? Can anyone think of an alternative method for doing this?

Marc Fowler

1 Answer


That's a really good question.

Last time I checked, even COPY between different regions did not work. I know CloudBerry's Explorer app has a feature to copy data between S3 accounts; can you do a test with it? I haven't tried it myself, as it's Windows-only.

I guess if it works, it's just a matter of doing the same thing through the API.

Are all your customers in the same region? Because if COPY between accounts didn't work, I'd boot an EC2 instance (or several) to speed up the process. If everyone is in the same region, no bandwidth charges should apply.

This is not ideal, but I guess with multiple instances you could get a lot of work done for less than 10 bucks. And it should be possible to automate the setup too.

Update

So, to elaborate on EC2: an EC2 instance is just like another server. I suggested it originally so you could download the file once and then upload it to the other S3 accounts from within AWS, saving money on bandwidth (traffic is free if the bucket and the instance are in the same region).

Anyway, an EC2 instance being like a server, it would require a little bit of setup to bootstrap it -- e.g. a custom AMI, or any stock AMI plus some user-data (a shell script passed to the instance and executed on first boot). You'd probably need to install PHP, the AWS SDK, etc. -- all of which can be automated.
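As a very rough sketch of what such an instance might run -- the bucket names and files are placeholders, and this assumes the AWS SDK for PHP (1.x) with its get_object()/create_object() calls -- the idea is simply to fetch each file once and re-upload it into every customer bucket:

```php
<?php
require_once 'sdk.class.php'; // AWS SDK for PHP

// Placeholder names -- substitute your own buckets and files.
$source_bucket    = 'our-master-assets';
$customer_buckets = array('customer1-assets', 'customer2-assets');
$file             = 'js/app.js';

// Download the file once to the instance (free if bucket and instance share a region).
$us   = new AmazonS3(); // our own credentials, e.g. from the SDK config file
$body = (string) $us->get_object($source_bucket, $file)->body;

foreach ($customer_buckets as $bucket) {
    // Re-upload into each customer's bucket. In practice you'd authenticate with
    // whatever credentials have write access to that bucket.
    $customer = new AmazonS3(/* that customer's credentials */);
    $customer->create_object($bucket, $file, array('body' => $body));
}
```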

The thing is, I'm not entirely sure if this is necessary in your case.

Check out the following example code: http://docs.amazonwebservices.com/AmazonS3/latest/dev/index.html?CopyingObjectUsingPHP.html

It shows how to copy data from one bucket to another. Since bucket names are unique across all of S3, this shouldn't be a problem. I think all you'd need to do is grant read access on the files to everyone (at least temporarily) on your own AWS account, and then loop through and copy the files into your customers' AWS accounts.

I think you can run that code from anywhere and not have to worry about bandwidth charges, since the COPY should be handled entirely within S3 -- no download of the actual file is required.
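As a rough sketch of that loop -- the bucket names, keys and credential handling below are placeholders, and the exact constructor arguments depend on which version of the AWS SDK for PHP you're on -- it could look something like this:

```php
<?php
require_once 'sdk.class.php'; // AWS SDK for PHP

// Placeholder data: the customer buckets and the credentials we hold for each account.
$customers = array(
    array('key' => 'CUSTOMER1_KEY', 'secret' => 'CUSTOMER1_SECRET', 'bucket' => 'customer1-assets'),
    array('key' => 'CUSTOMER2_KEY', 'secret' => 'CUSTOMER2_SECRET', 'bucket' => 'customer2-assets'),
);

$files         = array('js/app.js', 'css/app.css'); // objects to push out
$source_bucket = 'our-master-assets';               // must be readable by the copying account

foreach ($customers as $customer) {
    // Authenticate as the account that owns the destination bucket.
    $s3 = new AmazonS3($customer['key'], $customer['secret']);

    foreach ($files as $file) {
        // Server-side COPY: S3 reads the (readable) source object itself,
        // so no file data passes through the machine running this script.
        $response = $s3->copy_object(
            array('bucket' => $source_bucket,      'filename' => $file),
            array('bucket' => $customer['bucket'], 'filename' => $file)
        );

        if (!$response->isOK()) {
            error_log("Copy of {$file} to {$customer['bucket']} failed");
        }
    }
}
```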

Not sure if you reviewed the documentation, but it seems that AWS just requires a source header (x-amz-copy-source) on the request and then takes care of the rest.
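In raw HTTP terms (bucket and key names here are just placeholders), the copy boils down to a PUT against the destination with an empty body and the copy-source headers -- the SDK builds this request for you:

```http
PUT /js/app.js HTTP/1.1
Host: customer1-assets.s3.amazonaws.com
Content-Length: 0
x-amz-copy-source: /our-master-assets/js/app.js
x-amz-metadata-directive: COPY
Authorization: <signature from the account writing to the destination bucket>
```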

HTH

Till
  • Theirs does it, absolutely, and that works fine. But I can't automate 1000+ accounts surely. Cost isn't an issue to be truthful - if it costs us $50 a deployment but that's it then it's worth it because the time it takes to do it currently is exponentially higher! I got an example from their API but I'm not sure of the language etc and not had elaboration yet. Any other ideas? What do you mean re. multiple instances? Of what? EC2 or something? –  May 28 '11 at 13:41
  • Can you share the example? I'm not sure what you're referring to. And yeah, when I said _instances_, I meant EC2 instances to e.g. download the file and re-upload it, etc.. I suggested EC2 because traffic between an EC2 instance and S3 (in the same region) is free. – Till May 28 '11 at 23:30
  • Is there an example of uploading a file from EC2 to Amazon that you know of? I've never used EC2 before at all. Here is the thread in question: https://forums.aws.amazon.com/thread.jspa?threadID=67785&tstart=0 Specifically "All you need to do is to use PUT, set in Content-Length header to 0 (instead of the file size) and specify headers: x-amz-metadata-directive: COPY x-amz-copy-source: /bucketName/folderPath/SourceFileName.ext <-- note that this one also should be HTTP encoded" –  May 29 '11 at 10:11
  • extended my answer – Till May 29 '11 at 14:32
  • The COPY request should only require READ access to the source object and appropriate create/write access to the bucket and/or destination object. As such, you don't need to grant access to everyone - just to the account doing the copy. – bdonlan May 29 '11 at 14:36
  • Also note that there is no hard limit to how many COPY requests can be issued simultaneously (although there may be limits to how fast you can ramp up), so with appropriate exponential-backoff-retry logic you could issue a very large number of COPY requests simultaneously from the same program. – bdonlan May 29 '11 at 14:37
  • Yeah, I didn't want to complicate it for him. I think he can assign granular read access to all his customers' AWS accounts. – Till May 29 '11 at 14:40
  • Or assign write access to the customer accounts to a central copy account – bdonlan May 29 '11 at 18:10
  • If the source files are publicly readable (i.e. by everyone, which they currently are anyway) then I can just do a standard COPY using my bucket as the source, connected to the customer account which obviously has write permissions to its own bucket? That's AWESOME! I'll definitely check this out in the next couple of days and come back with what happens! –  May 30 '11 at 09:53
  • Tried this, and I get 'Access Denied'. The entire source bucket, in Account #1, is set to public and so is the individual file. The exact same code works when specifying a target bucket owned by the same credentials :( –  May 30 '11 at 10:59
  • Nope! I was wrong. It DOES work! I forgot to change the code a bit from testing. That's awesome - thanks! –  May 30 '11 at 11:10
  • Accepted the above answer and huge thanks to everyone for general discussion etc. –  May 30 '11 at 11:11