
I'm working on a project that will have multiple celery workers on machines in different locations in the US that will communicate over the internet.

Am I better off distributing my Django project to each machine and configuring them with the database credentials to my database host, or should I have a "main" Django/database host that presents a REST API for remote celery tasks and workers to hit for the database access?

Mostly looking for pros/cons and any factors I haven't thought of.

I can provide single simple API endpoints that provide all the data my tasks need to query and simple POST API endpoints that can create all the database entries my tasks need to create.
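For illustration, the worker side of the API option could look roughly like the sketch below. It uses only the standard library; the host name, URL paths, token, and payload shape (`values`, `total`) are all assumptions made up for the example, not part of the actual project. In practice `process_job` would be the body of a Celery task.

```python
import json
from urllib import request

API_BASE = "https://main.example.com/api"  # hypothetical main Django host
API_TOKEN = "change-me"                    # hypothetical per-worker credential


def build_fetch_request(job_id):
    """GET request for the one endpoint that returns a task's input data."""
    return request.Request(
        f"{API_BASE}/jobs/{job_id}/input/",
        headers={"Authorization": f"Token {API_TOKEN}"},
        method="GET",
    )


def build_result_request(job_id, result):
    """POST request that asks the main host to create the database entry."""
    body = json.dumps({"job": job_id, "result": result}).encode("utf-8")
    return request.Request(
        f"{API_BASE}/results/",
        data=body,
        headers={
            "Authorization": f"Token {API_TOKEN}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def call(req, timeout=10):
    """Send a request over the network and decode the JSON response."""
    with request.urlopen(req, timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))


def process_job(job_id, send=call):
    """Fetch input, do the work, post the result. `send` is injectable for tests."""
    payload = send(build_fetch_request(job_id))
    result = {"total": sum(payload["values"])}  # stand-in for the real processing
    return send(build_result_request(job_id, result))
```

With this shape the remote machines only hold an API token and never the database credentials, which is one of the security differences between the two options.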

I'll have no more than, say, 10 remote workers, which altogether would be doing maybe 1 request per minute.

I'm thinking this probably means my concerns are not so much about the request/response overhead but more about maintainability, architecture, security...

Dustin Wyatt

1 Answer


The answer depends on too many variables, concerns, forces and whatnots to be anything other than "it depends...".

I assume you already thought of the following pros and cons, but anyway:

Using an API will make for longer request/response cycles (obviously) and put extra load on the Django side (front server, app server, etc.). It also means that your tasks won't be able to use all of the database features (complex queries, aggregations, whatever), only what the API exposes.

OTOH, adding an API layer will isolate the workers from the inner db schema, which can make migrations (on the Django side) and deployment easier, as you won't have to stop all the workers, deploy to everyone and restart the workers. It even makes it possible to change the API-side technology without impacting the workers (not that I see much reason to do so, but anyway...). But it also means you have a whole API to maintain, and chances are that model changes, or at least part of them, will impact your API and/or your task code anyway (if the changes are about adding features that the workers should use, etc.).
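A minimal sketch of that isolation, in plain Python rather than real Django code: the API side maps internal rows to a fixed payload, so a column rename stays invisible to the workers. The column names (`customer_name`, `client_name`) and payload keys are hypothetical.

```python
def serialize_job_v1(row):
    """Map an internal row (a dict of column values) to the public v1 payload."""
    # Assumed migration: `customer_name` was renamed to `client_name`.
    # Only this function has to know; the payload keys stay the same.
    if "client_name" in row:
        customer = row["client_name"]
    else:
        customer = row["customer_name"]
    return {"id": row["id"], "customer": customer, "status": row["status"]}


# Rows as they might look before and after the hypothetical migration.
old_row = {"id": 1, "customer_name": "ACME", "status": "queued"}
new_row = {"id": 1, "client_name": "ACME", "status": "queued"}
```

Both rows serialize to the same payload, so workers reading `customer` keep running untouched. A change that adds a field the workers must actually use would still require redeploying them.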

IOW, it really depends (yeah, I already said so, didn't I?) on your project's needs and constraints, and only you/your team know which solution will best match your project.

bruno desthuilliers
  • This answer is good as I'm just looking for pros/cons of various methods and anything I haven't thought of. – Dustin Wyatt Jan 05 '17 at 18:33
  • @DustinWyatt glad if it helped in any way - but you should perhaps tell more about what you already thought of, as well as about your project itself - not necessarily what the project is about, but more on what kind of processing happens in your tasks etc. – bruno desthuilliers Jan 05 '17 at 18:38