
At first glance I really liked the "Batches" feature in Celery, because I need to group a number of IDs before calling an API (otherwise I may get kicked out).

Unfortunately, when testing it a little, batch tasks don't seem to play well with the rest of the Canvas primitives, in this case chains. For example:

from celery import Celery, chain
from celery.contrib.batches import Batches

# app setup; broker taken from the log output below, result backend assumed to be the same Redis
a = Celery('tasks', broker='redis://localhost:6379/0', backend='redis://localhost:6379/0')

@a.task(base=Batches, flush_every=10, flush_interval=5)
def get_price(requests):
    for request in requests:
        # mark each buffered request as done manually; batch tasks don't do this themselves
        a.backend.mark_as_done(request.id, 42, request=request)
        print("filter_by_price " + str([r.args[0] for r in requests]))

@a.task
def completed():
    print("complete")

So, with this simple workflow:

chain(get_price.s("ID_1"), completed.si()).delay()

I see this output:

[2015-07-11 16:16:20,348: INFO/MainProcess] Connected to redis://localhost:6379/0
[2015-07-11 16:16:20,376: INFO/MainProcess] mingle: searching for neighbors
[2015-07-11 16:16:21,406: INFO/MainProcess] mingle: all alone
[2015-07-11 16:16:21,449: WARNING/MainProcess] celery@ultra ready.
[2015-07-11 16:16:34,093: WARNING/Worker-4] filter_by_price ['ID_1']

After 5 seconds, get_price() gets triggered just as expected (the "filter_by_price" line is printed). The problem is that completed() never gets invoked.

Any ideas about what could be going on here? If not using batches, what would be a decent approach to solve this problem?

PS: I have set CELERYD_PREFETCH_MULTIPLIER=0 like the docs say.
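
For reference, that setting lives in the worker configuration; a minimal sketch, assuming it is applied to the same app object a as above and Celery 3.x-style uppercase setting names:

a.conf.CELERYD_PREFETCH_MULTIPLIER = 0  # disable prefetching, as recommended for Batches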

  • Just for the record, I needed the batching so badly that I ended up using RabbitMQ + Pika alone, with a very simple worker template that buffers messages. If anyone is interested, I have the source code available, cheers. – ninja.user Aug 02 '15 at 23:04
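
For anyone curious about that route, here is a minimal sketch of such a buffering consumer. It is not the commenter's actual code: the queue name "prices", the JSON payloads, the flush() helper, and the batch-size/interval values are all assumptions, and the consume()/inactivity_timeout signature may vary between pika versions.

import json
import time

import pika

BATCH_SIZE = 10        # flush after this many buffered messages...
FLUSH_INTERVAL = 5.0   # ...or after this many seconds, whichever comes first

def flush(batch):
    # placeholder: a single API call covering every buffered ID
    print("calling API with", [payload for _, payload in batch])

connection = pika.BlockingConnection(pika.ConnectionParameters("localhost"))
channel = connection.channel()
channel.queue_declare(queue="prices", durable=True)

buffer = []
last_flush = time.time()

# consume() with inactivity_timeout periodically yields (None, None, None),
# so the loop can still flush on the interval while the queue is idle
for method, properties, body in channel.consume("prices", inactivity_timeout=1.0):
    if body is not None:
        if not buffer:
            last_flush = time.time()  # start the timer at the first buffered message
        buffer.append((method.delivery_tag, json.loads(body)))
    if buffer and (len(buffer) >= BATCH_SIZE
                   or time.time() - last_flush >= FLUSH_INTERVAL):
        flush(buffer)
        # acknowledge everything up to the newest buffered message in one go
        channel.basic_ack(delivery_tag=buffer[-1][0], multiple=True)
        buffer = []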

1 Answer


It looks like the behaviour of batch tasks is significantly different from that of normal tasks. Batch tasks don't even emit signals like task_success.

Since you need to call the completed task after get_price, you can call it directly from get_price itself:

@a.task(base=Batches, flush_every=10, flush_interval=5)
def get_price(requests):
    for request in requests:
        # do something per request, e.g. mark_as_done() as in the question
        a.backend.mark_as_done(request.id, 42, request=request)
    # trigger the follow-up task once per flushed batch
    completed.delay()

Chillar Anand
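
With that workaround, the workflow would be enqueued without a chain, since the follow-up is now dispatched from inside the batch task itself; a minimal sketch, assuming the same task names as above:

# enqueue only the batch task; completed() is triggered from inside
# get_price() once a batch of up to flush_every requests is flushed
get_price.delay("ID_1")
get_price.delay("ID_2")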