Celery : understanding the big picture

Question

Celery seems to be a great tool, but I have a hard time understanding how the various Celery components work together:

The workers
The apps
The tasks
The message Broker (like RabbitMQ)

From what I understand, the command line:

celery -A not-clear-what-this-option-is worker

should run some sort of celery "worker server" which would itself need to connect to a broker server (I'm not so sure why so many servers are needed).

Then in any python code, some task may be sent to the worker by instantiating an app:

app = Celery('my_module', broker='pyamqp://guest@localhost//')

and then by decorating functions with this app in the following way:

@app.task
def my_func():
    ...

so that "my_func()" can now be called as "my_func.delay()" to be run asynchronously.

Here are my questions:

What happens when my_func.delay() is called? Which server talks to which first, and what is sent where?
What should go after the "-A" option of the celery command? Is it really needed?
Suppose I have a process X which instantiates a Celery app to launch task A, and another process Y which wants to know the status of task A launched by X. I assume there is a way for Y to do so, but I don't know how. I suppose that Y should create its own instance of a Celery app. But then:
- Which function should Y call on its Celery app to get this information (and what is the "identifier" of task A inside process Y)?
- How does this work in terms of communication, that is, when does the request go through the broker, and when does it go to the worker(s)?

If anyone has some information about these questions, I would be grateful. I intend to use Celery in a Django project, where some requests to the server can trigger various time-consuming tasks, and/or inquire about the status of previously launched tasks (pending, finished, error, etc.).

As this is 3+ years old, you have probably figured out your own answers, but I thought it was important to leave an answer on record - Guilherme, Feb 26 '21 at 21:59

score 1 · Answer 1 · answered Feb 26 '21 at 21:53

About the broker:

The main role of the broker is to mediate communication between the client and the worker.

The client publishes a message describing each task call (the task name and its arguments), and the broker holds these messages in a queue until a worker picks them up.

Keeping track of these messages is the broker's role.

E.g. you can configure Redis with persistence so that queued messages survive a restart of the server.

The worker:

You can think of the worker as a process independent of your application, which only executes the tasks that you delegate to it.
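To make the roles above concrete, here is a toy model written with the standard library only. It is not how Celery is implemented: the in-process queue stands in for the broker, the dict stands in for a result backend, and all the names (send_task, worker_loop, TASKS) are made up for illustration. Storing a PENDING entry up front is also a simplification.

```python
import queue
import threading
import uuid

broker = queue.Queue()   # stands in for RabbitMQ/Redis
results = {}             # stands in for a result backend

def add(x, y):
    return x + y

# Registry mapping task names to functions; only the name travels in the message.
TASKS = {"add": add}

def send_task(name, *args):
    """Client side: publish a message describing the call and return its id."""
    task_id = str(uuid.uuid4())
    results[task_id] = {"state": "PENDING", "result": None}
    broker.put({"id": task_id, "task": name, "args": args})
    return task_id

def worker_loop():
    """Worker side: fetch messages from the broker and execute them."""
    while True:
        message = broker.get()
        if message is None:          # sentinel used here to stop the worker
            break
        try:
            value = TASKS[message["task"]](*message["args"])
            results[message["id"]] = {"state": "SUCCESS", "result": value}
        except Exception as exc:
            results[message["id"]] = {"state": "FAILURE", "result": exc}
        finally:
            broker.task_done()

worker = threading.Thread(target=worker_loop, daemon=True)
worker.start()

task_id = send_task("add", 2, 3)   # returns immediately, like delay()
broker.join()                      # wait until the worker has processed the message
print(results[task_id])
broker.put(None)                   # tell the worker to stop
```

The point of the sketch is that the client never calls the worker directly: it only writes to the broker, and the worker only reads from it.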

About the state of a task:

There are ways to ask Celery for the status of a task, but I would not recommend making your application logic depend on it.

If you want to take the output of one process and turn it into the input of another, using tasks, I would recommend that you use a queue:

run task A, and before it finishes, put its result objects in the queue
task B listens to the queue and processes whatever comes up

The command:

On the terminal you can see in more detail what each argument means by running celery -h or celery --help.

The -A argument specifies which Celery app instance you intend to run. So normally this argument is the module (or module:attribute, e.g. proj.celery:app) where the instance you have configured can be found.

usage: celery [-h] [-A APP] [-b BROKER] [--result-backend RESULT_BACKEND]
              [--loader LOADER] [--config CONFIG] [--workdir WORKDIR]
              [--no-color] [--quiet]

I hope this can provide an initial overview for those who get here

score 1 · Answer 2 · answered Mar 01 '21 at 11:20

Celery is used to run functions in the background. Imagine you have a web API that does a job and returns a response, and you know that the job would seriously affect the API's response time. You hand that particular job over to Celery, and your API responds right away. Examples of jobs that affect the performance of an API are:

Routing to email servers
Routing to SMS Gateways
Database backup
Chained database operations
File conversion

Now, let's cover each component of Celery.

The workers: Celery workers execute the job (function). They run separately from your application, so from its point of view the work happens asynchronously. By default a worker starts as many child processes as the machine has CPU cores (adjustable with --concurrency). You can give a worker a name (-n) and specific queues to consume from (-Q).
The apps: The app is the Celery instance your project creates. The name you pass to it is normally the name of the module or project you're working on.
The tasks: The functions you need executed in the background. Every task Celery executes has a task id, a state, and more. You can get these by inspecting a particular task.
The message broker: The tasks to be executed in the background have to get from your Python project to the Celery workers. Message brokers act as the medium here. The task's name and its arguments are sent to the broker, and the workers fetch them from the broker to execute.

Some commands

celery -A project_name worker
celery -A project_name inspect active

More in the documentation: docs.celeryproject.org
