
I have two kinds of tasks: Type1 - a few small, high-priority tasks. Type2 - lots of heavy tasks with lower priority.

Initially I had a simple configuration with default routing and no routing keys. It was not sufficient - sometimes all workers were busy with Type2 tasks, so Type1 tasks were delayed. I've added routing keys:

CELERY_DEFAULT_QUEUE = "default"
CELERY_QUEUES = {
    "default": {
        "binding_key": "task.#",
    },
    "highs": {
        "binding_key": "starter.#",
    },
}
CELERY_DEFAULT_EXCHANGE = "tasks"
CELERY_DEFAULT_EXCHANGE_TYPE = "topic"
CELERY_DEFAULT_ROUTING_KEY = "task.default"

CELERY_ROUTES = {
        "search.starter.start": {
            "queue": "highs",
            "routing_key": "starter.starter",
        },
}

So now I have two queues - one for high-priority and one for low-priority tasks.

The problem is: how do I start two celeryd instances with different concurrency settings?

Previously celery was used in daemon mode (according to this), so only running /etc/init.d/celeryd start was required, but now I have to run two different celeryd instances with different queues and concurrency settings. How can I do it?

Andrew

4 Answers


Based on the celeryd-multi answer below, I formulated the following /etc/default/celeryd file (originally based on the configuration described in the docs here: http://ask.github.com/celery/cookbook/daemonizing.html), which works for running two celery workers on the same machine, each worker servicing a different queue (in this case the queue names are "default" and "important").

Basically this answer is just an extension of that one: it shows how to do the same thing, but for celery in daemon mode. Please note that we are using django-celery here:

# Names of the worker nodes to start.
CELERYD_NODES="w1 w2"

# Where to chdir at start.
CELERYD_CHDIR="/home/peedee/projects/myproject/myproject"

# Python interpreter from environment.
#ENV_PYTHON="$CELERYD_CHDIR/env/bin/python"
ENV_PYTHON="/home/peedee/projects/myproject/myproject-env/bin/python"

# How to call "manage.py celeryd_multi"
CELERYD_MULTI="$ENV_PYTHON $CELERYD_CHDIR/manage.py celeryd_multi"

# How to call "manage.py celeryctl"
CELERYCTL="$ENV_PYTHON $CELERYD_CHDIR/manage.py celeryctl"

# Extra arguments to celeryd: node w1 consumes the "default" queue and node w2
# the "important" queue, each with a concurrency of 2.
# Longest task: 10 hrs (as of writing this, the UpdateQuanitites task takes 5.5 hrs)
CELERYD_OPTS="-Q:w1 default -c:w1 2 -Q:w2 important -c:w2 2 --time-limit=36000 -E"

# Name of the celery config module.
CELERY_CONFIG_MODULE="celeryconfig"

# %n will be replaced with the nodename.
CELERYD_LOG_FILE="/var/log/celery/celeryd.log"
CELERYD_PID_FILE="/var/run/celery/%n.pid"

# Name of the projects settings module.
export DJANGO_SETTINGS_MODULE="settings"

# celerycam configuration
CELERYEV_CAM="djcelery.snapshot.Camera"
CELERYEV="$ENV_PYTHON $CELERYD_CHDIR/manage.py celerycam"
CELERYEV_LOG_FILE="/var/log/celery/celerycam.log"

# Where to chdir at start.
CELERYBEAT_CHDIR="/home/peedee/projects/cottonon/cottonon"

# Path to celerybeat
CELERYBEAT="$ENV_PYTHON $CELERYBEAT_CHDIR/manage.py celerybeat"

# Extra arguments to celerybeat.  This is a file that will get
# created for scheduled tasks.  It's generated automatically
# when Celerybeat starts.
CELERYBEAT_OPTS="--schedule=/var/run/celerybeat-schedule"

# Log level. Can be one of DEBUG, INFO, WARNING, ERROR or CRITICAL.
CELERYBEAT_LOG_LEVEL="INFO"

# Log file locations
CELERYBEAT_LOGFILE="/var/log/celerybeat.log"
CELERYBEAT_PIDFILE="/var/run/celerybeat.pid"
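
Once this file is in place, the workers are still driven through the generic celeryd init script from the daemonizing cookbook linked above. A rough sketch of day-to-day usage follows (the status check assumes your celery/django-celery version provides the celeryctl status subcommand):

# Start or restart both nodes (w1 and w2) defined in CELERYD_NODES
sudo /etc/init.d/celeryd start
sudo /etc/init.d/celeryd restart

# Confirm that both workers are online
/home/peedee/projects/myproject/myproject-env/bin/python \
    /home/peedee/projects/myproject/myproject/manage.py celeryctl status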
eedeep
  • You just saved my life, in the whole web it's so difficult to find full proper configuration of this beast. You need to create a blog article, thousands would be happy! – holms Jan 29 '15 at 04:46
  • replace http://ask.github.com/celery/cookbook/daemonizing.html with http://ask.github.io/celery/cookbook/daemonizing.html please, because the existing domain is deprecated and no longer working. – Ashok Joshi Nov 16 '21 at 05:25

It seems the answer - celeryd-multi - is currently not well documented.

What I needed can be done by the following command:

celeryd-multi start 2 -Q:1 default -Q:2 starters -c:1 5 -c:2 3 --loglevel=INFO --pidfile=/var/run/celery/${USER}%n.pid --logfile=/var/log/celeryd.${USER}%n.log

What this does is start two workers which listen to different queues (-Q:1 is default, -Q:2 is starters) with different concurrency settings (-c:1 5 and -c:2 3).
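
If your celery version provides them, the show and stop subcommands of celeryd-multi are handy for sanity-checking and tearing down exactly this setup (a sketch, reusing the options from the command above):

# Print the full celeryd command lines this would run, without starting anything
celeryd-multi show 2 -Q:1 default -Q:2 starters -c:1 5 -c:2 3 --loglevel=INFO --pidfile=/var/run/celery/${USER}%n.pid --logfile=/var/log/celeryd.${USER}%n.log

# Stop both nodes again (the pidfile pattern tells it where to find the PIDs)
celeryd-multi stop 2 --pidfile=/var/run/celery/${USER}%n.pid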

Andrew

Another alternative is to give each worker process a unique name, using the -n argument.

I have two Pyramid apps running on the same physical hardware, each with its own celery instance (within its own virtualenv).

Supervisor controls both of them, each with its own supervisord.conf file.

app1:

[program:celery]                                            
autorestart=true                                            
command=%(here)s/../bin/celery worker -n ${HOST}.app1 --app=app1.queue -l debug
directory=%(here)s     

[2013-12-27 10:36:24,084: WARNING/MainProcess] celery@maz.local.app1 ready.

app2:

[program:celery]                                 
autorestart=true                                 
command=%(here)s/../bin/celery worker -n ${HOST}.app2 --app=app2.queue -l debug
directory=%(here)s                               

[2013-12-27 10:35:20,037: WARNING/MainProcess] celery@maz.local.app2 ready.
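
Because each app has its own supervisord.conf, the two workers can also be started, stopped and inspected independently. A minimal sketch (the config file paths are assumptions; the program name celery matches the [program:celery] sections above):

# Restart only app1's worker
supervisorctl -c /path/to/app1/supervisord.conf restart celery

# Restart only app2's worker
supervisorctl -c /path/to/app2/supervisord.conf restart celery

# Check the state of app1's worker
supervisorctl -c /path/to/app1/supervisord.conf status celery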
maz
  • I'd like to do something similar where I have multiple worker instances. I've tried to read the routing tasks page, but don't quite understand. How do you route tasks to each particular worker? – Raj Jan 21 '14 at 22:16
  • The tasks automatically route to the correct worker because the workers for 'maz' are in one virtual environment and the workers for 'clockworkelves' are in another virtual environment. – maz Jan 22 '14 at 22:42
  • I'm assuming this setup could work if I want both instances to pull from the same queue? – ChrisC Feb 03 '14 at 22:55
  • I'm not sure. Though if you take a look at my github project. You can set up the nginx to have two virtual hosts and that way you can find out for sure. https://github.com/mazzaroth/initpyr – maz Feb 04 '14 at 21:40

An update:

In Celery 4.x, the following works properly:

celery multi start 2 -Q:1 celery -Q:2 starters -A $proj_name

Or, if you want to give the instances specific names, you could use:

celery multi start name1 name2 -Q:name1 celery -Q:name2 queue_name -A $proj_name

However, I find that it does not print detailed logs on screen if we use celery multi, since it seems to be only a script shortcut for booting up these instances.

I guess it would also work if we started these instances one by one manually, giving them different node names but the same -A $proj_name, though that is a bit of a waste of time.
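
A minimal sketch of that manual approach, reusing the node and queue names from the commands above (run each in its own terminal to keep the full log output on screen):

# Worker 1: consumes the default "celery" queue
celery -A $proj_name worker -n name1@%h -Q celery -l info

# Worker 2: consumes queue_name
celery -A $proj_name worker -n name2@%h -Q queue_name -l info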

By the way, according to the official documentation, you can kill all celery workers simply with:

ps auxww | grep 'celery worker' | awk '{print $2}' | xargs kill -9

Alexww