
I am new to Celery. I know how to install it and run one server, but I need to distribute tasks to multiple machines. My project uses Celery to assign user requests coming in through a web framework to different machines and then return the results. I read the documentation, but it doesn't mention how to set up multiple machines. What am I missing?

Noufal Ibrahim
Evan Gui

2 Answers


My understanding is that your app will push requests into a queueing system (e.g. RabbitMQ), and then you can start any number of workers on different machines (with access to the same code as the app which submitted the task). They will pick tasks out of the message queue and get to work on them. Once they're done, they will update the tombstone database (the result backend).

The upshot of this is that you don't have to do anything special to start multiple workers. Just start them on separate identical (same source tree) machines.

The server which has the message queue need not be the same as the one with the workers and needn't be the same as the machines which submit jobs. You just need to put the location of the message queue in your celeryconfig.py and all the workers on all the machines can pick up jobs from the queue to perform tasks.
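A minimal celeryconfig.py along these lines might look like the following sketch; the broker host, guest credentials, and result backend are placeholders you would replace with your own (older Celery versions spell these settings in uppercase, e.g. BROKER_URL):

```python
# celeryconfig.py -- shared by every machine that runs a worker or submits tasks.
# "broker-host" and the guest credentials are placeholders; point them at the
# machine actually running RabbitMQ.
broker_url = "amqp://guest:guest@broker-host:5672//"

# Where task results (the "tombstones") are stored; also a placeholder.
result_backend = "rpc://"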

Noufal Ibrahim
  • You are right! But how does the main server know about all the workers? – Evan Gui Apr 21 '12 at 17:57
  • What do you mean by "main server"? – Noufal Ibrahim Apr 21 '12 at 18:07
  • 2
    @I'mTravelerClown i think your confusing a distributed environment with a master - slaves one. There's no main server in a celery based environment but many nodes with workers that do stuffs. You might need to explain your problem better. – FlaPer87 Apr 21 '12 at 19:14
  • Sorry, this is my problem. My "main server" means a node that runs the queuing system, with all the workers set up on other machines. I think the workers need to be configured to know which node that is. Is that right? – Evan Gui Apr 22 '12 at 02:49
  • @I'mTravelerClown you use remote control commands on workers. http://celery.readthedocs.org/en/latest/userguide/workers.html#remote-control – Marconi Oct 01 '12 at 10:27
  • @Noufal Ibrahim let me know if I am wrong, but my environment is: aws-webserver, aws-redis, aws-celery-1, aws-celery-2 (the aws-celery-* machines run the aws-webserver code as well). If I configure the broker on aws-webserver to point to aws-redis, and run Celery on aws-celery-* pointing to aws-redis, is that the distributed system with remote workers I need, where aws-redis is what knows where tasks are received and where they are sent to be done, because I connected everything to the broker? Thank you in advance. – panchicore Feb 22 '13 at 12:51
  • @NoufalIbrahim Could you explain what "same source tree" refers to? Are you saying that as long as I put the location of the message queue in celeryconfig.py on each worker, they can fetch from the queue automatically? Thanks – user2372074 Oct 10 '14 at 07:16
  • It's been a while so things might be different. The basic thing is that the machine that actually sends a task to be executed is often different from the machine that actually executes the task. Both of them need access to the actual code that will be run. So, the same "source tree" (i.e. code) should be available on the machine that's running celeryd and the machine which is sending tasks to the queue. – Noufal Ibrahim Oct 11 '14 at 02:24
  • The "sender" machine doesn't need access to the actual task code: http://celery.readthedocs.org/en/latest/faq.html#can-i-call-a-task-by-name – lajarre Feb 19 '15 at 18:18
  • 3
    @lajarre are you sure? Surely the worker would need to know what `tasks.add` (in the example you link to) refers to, no? – Chris Jan 14 '16 at 16:24
  • @Chris I am very late here, I think. But for anyone else wondering: the workers would need to know that, which is why you put the same codebase on all servers. So you have 3 servers (let's say A, B, C) with the same codebase. You just don't start the workers on server A, which means workers on servers B and C will pick up tasks automatically. If you want A to do tasks as well, then start workers on A too. – Mazhar Ali Aug 14 '23 at 09:22

The way I deployed it is like this:

  1. Clone your Django project onto a Heroku instance (this will run the frontend).
  2. Add RabbitMQ as an add-on and configure it.
  3. Clone your Django project into another Heroku instance (call it something like "worker") where you will run the Celery tasks.
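The split above can be expressed in each instance's Procfile, roughly like this (`proj` is a hypothetical project name; substitute your own module):

```
# Procfile on the frontend instance
web: gunicorn proj.wsgi

# Procfile on the worker instance
worker: celery -A proj worker --loglevel=info
```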
user2471214