If I write a celery task that calls other celery tasks, can I release the parent task/worker without waiting for the downstream tasks to finish?

The situation: I am working with an API that returns some data and the arguments for the next API call. I want to put all the data behind the API into a database. My current method is to query the API for the batch to work on, start some downstream processors, then recursively re-call the API+processing chain. I fear this will lock up workers waiting for all the recursive API calls to finish, when the workers do not care about the results of their children.

Pseudocode:

import json

from celery import chain, task

@task
def apiPing(start=None):
    """Return a dict of 5 elements, starting at the *start* element, or at the
    beginning of the list if start is not specified. The dict also contains
    'remaining', indicating how many elements are left in the API's list."""
    return json.loads(api(start))

@task
def processList(data):
    """Take a result from apiPing, start a task to store each element, and
    queue a chain that re-calls the API and processes that result."""
    for element in data:
        store.delay(element)

    if data['remaining'] != 0:
        # Distinct name so the imported chain() is not shadowed.
        next_step = chain(apiPing.s(data['last']), processList.s())
        next_step.delay()

I understand from here that the above is very close to bad practice: I do not want workers handling processList() to stay locked up until all of the data in the API has been handled. Is there a way to start the downstream tasks and release the parent worker, or to refactor the above so that it does not lock up workers?
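
To make the shape of the hand-off concrete, here is a minimal sketch in plain Python, not Celery. A queue.Queue stands in for the broker, and fake_api, process_list, run_worker and the 'items' key are hypothetical names invented for illustration. Each step stores its batch, enqueues the next step, and returns without reading any result, which is the behaviour I am after:

```python
import queue

# Hypothetical stand-in for the remote API: 12 items, served 5 at a time.
ITEMS = list(range(12))

def fake_api(start=None):
    """Return up to 5 items from index `start`, plus paging info."""
    begin = 0 if start is None else start
    batch = ITEMS[begin:begin + 5]
    last = begin + len(batch)
    return {"items": batch, "last": last, "remaining": len(ITEMS) - last}

def process_list(start, work, stored):
    """Store one batch, then enqueue the next step instead of waiting for it."""
    data = fake_api(start)
    stored.extend(data["items"])
    if data["remaining"] != 0:
        work.put(data["last"])  # hand off the next cursor and return

def run_worker(work, stored):
    """Drain the queue; each step finishes before the next one is picked up."""
    steps = 0
    while not work.empty():
        process_list(work.get(), work, stored)
        steps += 1
    return steps

work = queue.Queue()
stored = []
work.put(None)  # the first call starts at the beginning of the list
steps = run_worker(work, stored)
```

No step ever holds a reference to the step it scheduled, so nothing is left waiting on a child. Whether Celery behaves the same way depends on how the tasks are invoked and whether anything reads their results.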

Testing reveals that workers are in fact locked this way:

from celery import task
from time import sleep

@task
def parent():
    print("In parent")
    child.apply_async()  # queue the child; the return value is not used
    print("Out of parent")

@task
def child():
    print("In child")
    sleep(10)
    print("Out of child")

[2013-08-05 18:37:29,264: WARNING/PoolWorker-4] In parent
[2013-08-05 18:37:31,278: WARNING/PoolWorker-2] In child
[2013-08-05 18:37:41,285: WARNING/PoolWorker-2] Out of child
[2013-08-05 18:37:41,298: WARNING/PoolWorker-4] Out of parent
bwarren2
  • Catching the result of my call to parent() above was what was keeping things open. Invoking parent.s().delay() and not catching the result caused the parent task to close in 3/100 of a second rather than waiting for child to finish. – bwarren2 Aug 06 '13 at 17:58
  • Maybe you should follow http://docs.celeryproject.org/en/latest/userguide/canvas.html#chains best practice? – adam Dec 05 '13 at 16:13
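
The difference described in the first comment can be sketched without a broker. This is an analogy using concurrent.futures from the standard library rather than Celery itself, and parent, child, and the release event are illustrative names. The parent submits the child and returns; it would only block if something called .result() on the child's future inside it:

```python
import threading
from concurrent.futures import ThreadPoolExecutor

release = threading.Event()  # lets the example control when child finishes

def child():
    release.wait(timeout=5)  # stands in for slow work
    return "child done"

def parent(pool):
    # Fire and forget: submit the child but never call .result() on it here.
    return pool.submit(child)

with ThreadPoolExecutor(max_workers=2) as pool:
    parent_future = pool.submit(parent, pool)
    # Waits only for parent, which returns as soon as the child is submitted.
    child_future = parent_future.result(timeout=5)
    child_running_after_parent = not child_future.done()
    release.set()  # now let the child finish
    child_result = child_future.result(timeout=5)
```

Here child_running_after_parent records that the parent completed while the child was still working. Adding a .result() call on the child inside parent would instead tie up the parent's worker thread until the child finished.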
