CDN Rerouting on 404 (file not yet in synch with original storage)

Question

Here is the problem. I've set up my app (on EC2) to store uploaded images directly on Amazon S3. I'd like to be able to serve static files (as a CDN) from my 'home' server, so I wrote a script that syncs from S3. But there is a window of at least one minute before a newly uploaded file has been synced.

I see two solutions to the problem of pictures not yet being available on the 'home' server:

1. Write a script on EC2 (where the app resides) that fetches from the DB the pictures with the status "not-yet-synch", which is the default state when a user uploads a picture. The script then pings each picture and, if it gets an OK response, updates the DB from "not-yet-synch" to "synch".

2. The preferred solution would be to let Apache (in this case) redirect a request for an image to S3 if it sees a 404 (i.e. it doesn't find the requested image). This way I wouldn't need the script from solution 1.
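
A rough sketch of the status-updating script from solution 1, in Python. The `pictures` table, its `id`, `filename` and `status` columns, the use of SQLite, and the idea that the "ping" is an HTTP HEAD request against the 'home' server are all assumptions made for illustration; only the two status values come from the description above.

```python
import sqlite3
import urllib.error
import urllib.request


def is_available(url):
    """Return True if a HEAD request for url answers with HTTP 200."""
    request = urllib.request.Request(url, method="HEAD")
    try:
        with urllib.request.urlopen(request, timeout=5) as response:
            return response.status == 200
    except (urllib.error.URLError, OSError):
        # 404s, timeouts and connection errors all mean "not there yet".
        return False


def mark_synced(conn, home_base_url, check=is_available):
    """Flip 'not-yet-synch' to 'synch' for every picture that check() finds.

    Table and column names are hypothetical. Returns the number of rows updated.
    """
    rows = conn.execute(
        "SELECT id, filename FROM pictures WHERE status = 'not-yet-synch'"
    ).fetchall()
    updated = 0
    for picture_id, filename in rows:
        if check(home_base_url.rstrip("/") + "/" + filename):
            conn.execute(
                "UPDATE pictures SET status = 'synch' WHERE id = ?",
                (picture_id,),
            )
            updated += 1
    conn.commit()
    return updated
```

Run from cron, something like this would keep the app's view of each picture roughly one polling interval behind the actual sync.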

So which approach do you suggest I take to solve this redundancy problem? Or what is the usual practice in production environments?

To clarify further: I'd like to serve images from the 'home' server first and, if that fails, serve them from S3.
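
The lookup order described here (local copy first, S3 as the fallback) can be written down as a small Python function. This is only a sketch of the decision, not Apache configuration; `LOCAL_ROOT`, `S3_BASE_URL` and the function name are hypothetical placeholders.

```python
import os
from urllib.parse import quote

# Hypothetical values: neither the directory nor the bucket comes from the question.
LOCAL_ROOT = "/var/www/images"
S3_BASE_URL = "https://example-bucket.s3.amazonaws.com"


def resolve_image(name, local_root=LOCAL_ROOT, s3_base_url=S3_BASE_URL):
    """Return ("local", path) if the file is on the home server,
    otherwise ("redirect", url) pointing at the same key on S3."""
    root = os.path.realpath(local_root)
    candidate = os.path.realpath(os.path.join(root, name.lstrip("/")))
    # Refuse names that escape the image directory (e.g. "../../etc/passwd").
    if os.path.commonpath([root, candidate]) != root:
        raise ValueError("invalid image name: %r" % name)
    if os.path.isfile(candidate):
        return ("local", candidate)
    return ("redirect", s3_base_url.rstrip("/") + "/" + quote(name.lstrip("/")))
```

A handler would send the file for a `"local"` result and answer with a temporary redirect (HTTP 302) for a `"redirect"` result, so that the browser asks the 'home' server again once the sync has caught up.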

Tnx, Alan

score 0 · Answer 1 · answered Feb 11 '10 at 10:11

The way I handled this a few years ago (it might not be the best way) was to not serve the images from Apache directly but to use a PHP script (use .htaccess to rewrite the URLs if you like).

The PHP script would check whether the image exists locally and, if not, pull it from the other server (S3 in your case) and then return it to the browser - the web browser would never see that S3 is being used. Subsequent requests could then just be served the local copy without asking S3 again. The side benefit of this is that you only store images locally that have actually been requested.
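
A minimal sketch of that check-then-fetch logic, written here in Python rather than PHP. The function names, the flat cache directory and the injectable `fetch` parameter are assumptions for illustration, not the script I actually used.

```python
import os
import tempfile
import urllib.request


def fetch_from_origin(url):
    """Download the image bytes from the origin (S3 in the question)."""
    with urllib.request.urlopen(url, timeout=10) as response:
        return response.read()


def get_image(name, cache_dir, origin_base_url, fetch=fetch_from_origin):
    """Return image bytes, filling the local cache from the origin on a miss.

    Assumes a flat namespace: only the base name of `name` is used.
    """
    filename = os.path.basename(name)
    local_path = os.path.join(cache_dir, filename)
    if os.path.isfile(local_path):
        # Cache hit: the origin is not contacted at all.
        with open(local_path, "rb") as f:
            return f.read()
    data = fetch(origin_base_url.rstrip("/") + "/" + filename)
    # Write to a temp file and rename it into place, so a concurrent
    # request never reads a half-written image.
    fd, tmp_path = tempfile.mkstemp(dir=cache_dir)
    with os.fdopen(fd, "wb") as f:
        f.write(data)
    os.replace(tmp_path, local_path)
    return data
```

Only the first request for a given image reaches the origin; later requests are answered from `cache_dir`.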

One thing you need to consider though is what happens when the S3 version of the image changes. How do you detect this?

Well, deciding that in PHP (on a per-image basis) seems a bit slow to me, so in that case I'd rather go with a background job that updates the DB on sync success, so the app "knows" and sets the right URL instantly. The caching mechanism is a good idea, but I want to have all of the images on the 'home' server (so I won't bother with this for now). As for S3, image names are built from user IDs, so basically an image replacement just does an overwrite plus a DB update to "not-yet-synch", and the process starts all over again. — Alan Ristić, Feb 11 '10 at 13:36
