I am working on porting an application I wrote from Golang (using Redis) to Python, and I would love to use Celery for my task queuing, but I have a question regarding routing...
My application receives "events" via REST POSTs, where each "event" can be of a different type. I then want to have workers in the background wait for events of certain types. The caveat here is that ONE event could result in MORE than ONE task handling the event. For example:
some/lib/a/tasks.py

from celery.task import task

@task
def handle_event_typeA(a, b, c):
    # handles event...
    pass

@task
def handle_event_typeB(a, b, c):
    # handles other event...
    pass

some/lib/b/tasks.py

from celery.task import task

@task
def handle_event_typeA(a, b, c):
    # handles event slightly differently... but still same event...
    pass
In summary... I want to be able to run N workers (across X machines), where each worker has Y tasks registered, such as a.handle_event_typeA, b.handle_event_typeA, etc., and I want to be able to insert a task into a queue and have one worker pick it up and route it to more than one task in that worker (i.e. to both a.handle_event_typeA and b.handle_event_typeA).
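To illustrate the fan-out I'm after, here is what it looks like when done by hand with Celery's group primitive (just a sketch, assuming Celery 3.x; the module paths are the ones from above). This works, but it hard-codes the subscriber list at the call site, which is exactly what I want the routing layer to handle for me:

from celery import group
from some.lib.a import tasks as a_tasks
from some.lib.b import tasks as b_tasks

# Fan one event out to both handlers explicitly. I want this dispatch
# to happen via queue/routing configuration instead of at the call site.
group(a_tasks.handle_event_typeA.s(1, 2, 3),
      b_tasks.handle_event_typeA.s(1, 2, 3)).apply_async()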
I have read over Kombu's documentation here and Celery's routing documentation here, and I can't seem to figure out how to configure this correctly.
I have been using Celery for some time now for more traditional workflows and I am very happy with its feature set, performance, and stability. I would implement what I need using Kombu directly or some homebrew solution, but I would like to use Celery if at all possible.
Thanks guys! I hope I'm not wasting anyone's time with this question.
Edit 1
After some more time thinking about this issue I have come up with a workaround that implements what I want with Celery. It's not the most elegant solution, but it works well. I am using Django and its caching abstraction (you could use something like memcached or Redis directly instead). Here's the snippet I came up with:
from django.core.cache import cache
from celery.execute import send_task

SUBSCRIBERS_KEY = 'task_subscribers.{0}'

def subscribe_task(key, task):
    # get current list of subscribers
    cache_key = SUBSCRIBERS_KEY.format(key)
    subscribers = cache.get(cache_key) or []
    # accept either a task object or a task name string
    if hasattr(task, 'delay'):
        name = task.name
    else:
        name = task
    # add to list if not already subscribed
    if name not in subscribers:
        subscribers.append(name)
    # write the list back (note: the default cache timeout applies)
    cache.set(cache_key, subscribers)

def publish_task(key, *args):
    # get current list of subscribers
    cache_key = SUBSCRIBERS_KEY.format(key)
    subscribers = cache.get(cache_key) or []
    # iterate through all subscribers and send each one the task by name
    for task_name in subscribers:
        send_task(task_name, args=args, kwargs={})
What I then do is subscribe to tasks in different modules by doing the following:
subscribe_task('typeA', 'some.lib.b.tasks.handle_event_typeA')
Then I can call publish_task when handling the REST events.
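For example, a hypothetical Django view handling the POST might look like the following (the view name, payload fields, and the import path for the snippet above are all illustrative):

import json

from django.http import HttpResponse

from some.lib.event_bus import publish_task  # wherever the snippet above lives

def receive_event(request):
    payload = json.loads(request.body)
    # fan the event out to every task subscribed under its type;
    # each subscriber runs as its own Celery task on whichever worker picks it up
    publish_task(payload['type'], payload['a'], payload['b'], payload['c'])
    return HttpResponse(status=202)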