14

I have been using s3boto's S3BotoStorage as my static files backend and syncing files to my AWS S3 buckets (staging and production) using ./manage.py collectstatic. It works fine, but it is painfully slow. In addition to my own static files (just a few) and the Django admin, I have a few third-party packages with a great many static files (grappelli, django-redactor). collectstatic can take upwards of 15 minutes each time I run it, depending on my internet connection. When I'm syncing with my staging bucket, things aren't quite right, and I have to tweak something and re-sync, it's a big time killer. Are there any good, fast, scriptable alternatives for syncing static files to S3?

B Robster
  • just found a very related question: http://stackoverflow.com/questions/6618013/django-staticfiles-and-amazon-s3-how-to-detect-modified-files – B Robster May 29 '13 at 17:57

2 Answers

49

I wrote a pluggable Django app, based on a djangosnippet, that caches the ETag of the remote file and compares the cached checksum instead of performing a lookup every time. It took me from about 1m30s to around 10s per call to manage.py collectstatic for a few hundred static files. Check it out here: https://github.com/antonagestam/collectfast
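To illustrate the idea of comparing against a cached ETag, here is a minimal sketch. It is not collectfast's actual code: `local_etag`, `needs_upload`, and the plain dict used as `etag_cache` are hypothetical names, and it assumes the remote objects were uploaded in a single part without KMS encryption, in which case the S3 ETag is the quoted MD5 hex digest of the content.

```python
import hashlib


def local_etag(path, chunk_size=8192):
    """Return the MD5 of a local file, quoted the way S3 reports an ETag.

    Assumption: the remote object was a simple (non-multipart) upload,
    so its ETag equals the MD5 hex digest of its content.
    """
    digest = hashlib.md5()
    with open(path, "rb") as fh:
        # Read in chunks so large files are not loaded into memory at once.
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return '"%s"' % digest.hexdigest()


def needs_upload(path, remote_name, etag_cache):
    """Return True when the local file differs from the cached remote ETag.

    etag_cache is a hypothetical dict mapping remote names to ETags; a real
    setup would likely use a persistent cache instead.
    """
    return etag_cache.get(remote_name) != local_etag(path)
```

A file whose cached ETag matches the local checksum can be skipped without any request to S3, which is where the time saving comes from.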

antonagestam
  • Just saw this. Tested it out and went from 10-15 minutes to 2:45. Going to set up a bounty for you, since this is going to be a big productivity boost. Thanks for taking the time to answer and to create the collectfast plugin! Awesome. – B Robster Jul 20 '13 at 16:47
  • Really glad I could help you out @BenRoberts! Thanks for the contributions. – antonagestam Jul 22 '13 at 08:40
  • With gunicorn, it throws me an error: [Errno 104] Connection reset by peer – avances123 Oct 19 '13 at 13:21
  • @avances123 What do you mean? I can't see how calling management commands has anything to do with gunicorn. – antonagestam Oct 19 '13 at 23:16
  • Yeah, sorry, it's slow and, for me at least (normal Django app on EC2), unusable. File uploads are interrupted too because it's so slow. – avances123 Oct 20 '13 at 17:44
  • @avances123 It sounds like you're having other problems. File transfers between EC2 and S3 should be fast and rarely time out in my experience. You should still be able to gain performance from using collectfast though. – antonagestam Oct 21 '13 at 08:39
  • @antonagestam Thanks for creating the collectfast plugin. My third party plugin's static folders are HUGE in numbers. It takes me hours to upload them from my laptop. Now thanks to your plugin. collectstatic only reuploads my files and not other plugins. Thanks for the plugin. – Ishan Jun 18 '16 at 13:06
6

Set AWS_PRELOAD_METADATA to True in your settings so the storage backend preloads the metadata of all files on S3 before syncing and uploads only the ones that are missing or have changed.
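As a rough sketch, the relevant part of settings.py might look like this, assuming the s3boto backend from django-storages mentioned in the question (bucket name and credentials are omitted here and would still be needed):

```python
# settings.py (fragment)

# Assumption: static files are stored with django-storages' s3boto backend,
# as described in the question.
STATICFILES_STORAGE = "storages.backends.s3boto.S3BotoStorage"

# Fetch metadata for the bucket's files up front, so collectstatic can
# compare against it instead of querying S3 once per file.
AWS_PRELOAD_METADATA = True
```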

ojii