
I need some help with Celery workers. In particular, I'm not able to understand which directory the `celery worker` command needs to be fired from, the concept behind that, and a few things around imports.

So say I have the following directory structure:

.
├── __init__.py
├── entry.py
├── state1
│   ├── __init__.py
│   ├── family1
│   │   ├── __init__.py
│   │   ├── task1.py
│   │   ├── task2.py
│   │   └── task3.py
│   └── family2
│       ├── __init__.py
│       └── task1.py
└── state2
    ├── __init__.py
    ├── family1
    │   ├── __init__.py
    │   ├── task1.py
    │   └── task2.py
    └── family2
        ├── __init__.py
        ├── task1.py
        └── task2.py

The `.` at the root is the current working directory, named `project`.

Each of the taskn.py files (task1.py, task2.py, etc.) is an individual task. Each task file looks something like this:

from celery import Celery
from celery.result import AsyncResult
from kombu import Queue
import time

__name__ = "project_x"
celapp = Celery(backend='redis://localhost:6379/0', broker='amqp://a:b@localhost/a_vhost')
CELERY_CONFIG = {
    'CELERY_DEFAULT_QUEUE': 'default',
    'CELERY_QUEUES': (Queue('q1'), Queue('q2'),),
    'CELERY_TASK_SERIALIZER': 'pickle',
    'CELERY_ACCEPT_CONTENT': ['json', 'pickle']
}

celapp.conf.update(**CELERY_CONFIG)

@celapp.task()
def t1():
    print("starting task")
    time.sleep(5)
    print("Finished task")

Below is the content of entry.py:

import json
from flask_cors import CORS
from flask import Flask, Response, render_template
from flask import request, jsonify, redirect
from functools import wraps
# <what would be the import statement to import all the tasks>

app = Flask("project_x")

@app.route("/api1", methods=['POST'])
def api1():
    req = request.get_json()
    if not req:
        return jsonify(success=False, msg="Missing request parameters", code="1")
    else:
        param1 = req.get('p1')
        param2 = req.get('p2')
        tId = startTask()
        return jsonify(success=True, msg="All Good", taskId=tId)


def startTask():
    tId = "abcd123"
    created_task = state1.family1.task1.subtask(queue='q1')
    created_task.delay()
    return tId


if __name__ == '__main__':
    app.run(debug=True, host="192.168.1.7", port=4444)

entry.py is the Flask app from which api1 would be triggered; depending on the parameters, I would then want to start a specific task.

Now here are my questions:

  1. What would be the import statement to import all the tasks in the entry.py file?
  2. From which directory should I start the worker, i.e. where should I run the `celery -A <directory name> worker -l info` command from, and why?
  3. In many examples I saw a clear segregation between the tasks and the Celery app file. Could someone please suggest a better way to arrange my tasks, Celery configs, etc., and how the above two questions would align with that new structure?
qre0ct

2 Answers


Ok, hope this might help. I will respond in reverse order.

In many examples I saw a clear segregation between the tasks and the Celery app file. Could someone please suggest a better way to arrange my tasks, Celery configs, etc., and how the above two questions would align with that new structure?

The first problem I see with the snippets you added is that every taskn.py has its own instance of Celery. You need to share one instance between all the taskn.py files. What I recommend is to create a celery_app.py:

my_app
├── __init__.py
├── entry.py
├── celery_app.py
├── ...

In this file you create the Celery instance:

from celery import Celery
from celery.result import AsyncResult
from kombu import Queue

__name__ = "project_x"
celapp = Celery(backend='redis://localhost:6379/0', broker='amqp://a:b@localhost/a_vhost')
CELERY_CONFIG = {
    'CELERY_DEFAULT_QUEUE': 'default',
    'CELERY_QUEUES': (Queue('q1'), Queue('q2'),),
    'CELERY_TASK_SERIALIZER': 'pickle',
    'CELERY_ACCEPT_CONTENT': ['json', 'pickle']
}

celapp.conf.update(**CELERY_CONFIG)
celapp.conf.imports = [
    'state1.family1.task1',
    'my_app.state1.family1.task2',  # or maybe, depending on where the worker is started
    ...
]

Then in every taskn.py you can import this instance, and every task will be registered under the same Celery application:

from my_app.celery_app import celapp
import time

@celapp.task()
def t1():
    print("starting task")
    time.sleep(5)
    print("Finished task")

From which directory should I start the worker, i.e. where should I run the `celery -A <directory name> worker -l info` command from, and why?

Then you can simply run `celery -A my_app.celery_app worker -l info`, because your Celery instance lives in the module `my_app`, submodule `celery_app`.

what would be the import statement to import all the tasks in the entry.py

Finally, from entry.py you can do `from state1.family1.task1 import t1` and call `t1.delay()`, or the same for any registered task.

Patricio
  • This sounded ok, but when I tried it, I get an import error in `taskn.py`: ```ImportError: No module named my_app.celery_app```. Any comments? – qre0ct May 25 '19 at 06:58
  • So I could fix that import error by doing `from celery_app import celapp` inside `taskn.py` instead of `from my_app.celery_app import celapp`. However, now when I try getting the task started I get the following Celery error: ```Received unregistered task of type state1.family1.t1``` followed by ```KeyError: 'state1.family1.t1'``` – qre0ct May 25 '19 at 07:22
  • Now here's the thing that makes me believe I don't understand any of it clearly: when I start `celery -A my_app.celery_app worker -l info` from the parent directory of my_app, that's when I get all the errors mentioned in the comments above. However, running it as `celery -A state1.family1.task1 worker -l info` works like a charm. And I have no clue why! – qre0ct May 25 '19 at 07:36
  • Also, if it helps, with `celery -A my_app.celery_app worker -l info` no registered tasks show up in the info produced by the command, while with `celery -A state1.family1.task1 worker -l info` the task t1 shows on the console as a registered task. – qre0ct May 25 '19 at 08:16
  • Ok, probably every problem is an import error. You should use `celery -A my_app.celery_app worker -l info` because that is the Celery application; maybe you also need to configure the imports (in `celery_app.py`) as `celapp.conf.imports = ['state1.family1.task1', 'state1.family1.task2', ...]` – Patricio May 27 '19 at 16:58

So, taking the advice from @Patricio, it seems it was indeed an import error. My new directory structure looks like the below:

.
├── __init__.py
├── celeryConfig
│   ├── __init__.py
│   └── celeryApp.py
├── entry.py
├── state1
│   ├── __init__.py
│   ├── family1
│   │   ├── __init__.py
│   │   ├── task1.py
│   │   ├── task2.py
│   │   └── task3.py
│   └── family2
│       ├── __init__.py
│       └── task1.py
└── state2
    ├── __init__.py
    ├── family1
    │   ├── __init__.py
    │   ├── task1.py
    │   └── task2.py
    └── family2
        ├── __init__.py
        ├── task1.py
        └── task2.py

while the contents of celeryConfig/celeryApp.py are as below:

from celery import Celery
from celery.result import AsyncResult
from kombu import Queue

__name__ = "project_x"
celapp = Celery(backend='redis://localhost:6379/0', broker='amqp://a:b@localhost/a_vhost', include=['state1.family1.task1'])
CELERY_CONFIG = {
    'CELERY_DEFAULT_QUEUE': 'default',
    'CELERY_QUEUES': (Queue('q1'), Queue('q2'),),
    'CELERY_TASK_SERIALIZER': 'pickle',
    'CELERY_ACCEPT_CONTENT': ['json', 'pickle']
}

celapp.conf.update(**CELERY_CONFIG)

and the contents of taskn.py are something like:

from celeryConfig.celeryApp import celapp
import time

@celapp.task()
def t1():
    print("starting task")
    time.sleep(5)
    print("Finished task")

while entry.py remains as is, with just one change:

from state1.family1.task1 import t1

And now, when Celery is started as `celery -A celeryConfig.celeryApp worker -l info` from the root directory, `project`, it all runs fine. In the output of the above command I get the message

.
.
.
[tasks]
  . state1.family1.task1.t1

.
.
.

indicating that Celery has started correctly and that the task has indeed been registered. So now, in order to get all the tasks registered, I can read through the directories and build the include list in celeryApp.py dynamically. (Will post more about it once done)
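One way that dynamic discovery could look (a stdlib-only sketch; the helper name and the convention that task modules are named task*.py are my assumptions, not something from the thread):

```python
import os

def discover_task_modules(root):
    """Walk `root` and return dotted module paths (e.g. 'state1.family1.task1')
    for every task*.py found, suitable for Celery's include= list.
    Assumes the worker is started from `root`, so those paths are importable."""
    modules = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for fname in filenames:
            if fname.startswith('task') and fname.endswith('.py'):
                # Turn a path like state1/family1/task1.py into state1.family1.task1
                rel = os.path.relpath(os.path.join(dirpath, fname), root)
                modules.append(rel[:-3].replace(os.sep, '.'))
    return sorted(modules)

# In celeryApp.py this could then feed the include= argument, e.g.:
# celapp = Celery(..., include=discover_task_modules(project_root))
```

Since `__init__.py` and the config files don't match the `task*` prefix, only the task modules end up in the list.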

Thanks @Patricio

qre0ct