
I am doing a data migration in which images, videos, and similar files are downloaded and then sent to Dropbox through its API.

I'm using Python and Django for the entire web app, but I imagine this will take a lot of bandwidth, and there could be many failures along the way. A failure to save one image shouldn't stop the entire migration.

Thus, is Celery a good idea? Or Twisted?

I'm a bit confused about how this would help me. What I have in mind is to spawn a server/thread to handle a single image or a small set of images, so that the work can run on multiple threads.

Hick

1 Answer


The short answer to your question "is Celery a good idea?" is "Yes". I've used Celery for a similar process: when a user submits a form, it triggers, among other things, asynchronous calls to the Twitter API, whose results are then written back to saved objects in my database. I've found Celery outstanding for this task (no pun intended).

Celery would allow you to initiate pre-defined tasks (which, in part, can be thought of as "normal" Python functions with a @task decorator added to them) each time a user indicates they'd like to download an image or images. Celery gives you granular, per-task control over errors and retries, and tasks can be submitted singly or as chains, chords, or groups. All of this means you can definitely achieve your requirement that the migration continues even when a single image fails to download.
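To make the "one failure shouldn't stop the rest" idea concrete, here is a minimal sketch that uses only the standard library's `concurrent.futures`, not Celery itself, so it can run without a broker. The names `migrate_one`, `migrate_all`, `download`, and `upload` are hypothetical; `download` and `upload` stand in for the real HTTP fetch and the Dropbox API call, neither of which is shown here.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed


def migrate_one(url, download, upload, max_attempts=3):
    """Download one file and upload it, retrying up to max_attempts times.

    `download` and `upload` are hypothetical callables standing in for
    the real HTTP fetch and the Dropbox API call.
    """
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    last_error = None
    for _ in range(max_attempts):
        try:
            upload(url, download(url))
            return
        except Exception as exc:  # broad on purpose: any error triggers a retry
            last_error = exc
    # Every attempt failed; report the last error to the caller.
    raise last_error


def migrate_all(urls, download, upload, workers=4, max_attempts=3):
    """Migrate all URLs concurrently and return (succeeded, failed).

    `failed` maps each URL that could not be migrated to its last error,
    so one bad file never aborts the rest of the run.
    """
    succeeded, failed = [], {}
    with ThreadPoolExecutor(max_workers=workers) as pool:
        futures = {
            pool.submit(migrate_one, url, download, upload, max_attempts): url
            for url in urls
        }
        for future in as_completed(futures):
            url = futures[future]
            try:
                future.result()
                succeeded.append(url)
            except Exception as exc:
                failed[url] = exc
    return succeeded, failed
```

With Celery, `migrate_one` would roughly correspond to a task and the hand-written retry loop would be replaced by Celery's own retry handling, with workers running in separate processes rather than threads inside the web app.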

I would recommend spending some time with the Celery tutorial and the Celery-Django tutorial, which will give you an introduction to the basic workflow with Celery and Django.

I can't speak to the merits of Twisted, but if you are looking for opinions on the relative strengths and weaknesses of each, these look like a good start:

Benjamin White