
I'm intermittently (about 20% of the time) getting an IOError exception from Celery when I attempt to retry a failed task.

Here is my task:

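import urllib2

from celery.task import task  # assumed import; not shown in the question

# PK is a model from the asker's project; its import is not shown.
@task
def update_data(pk_id):
    try:
        pk = PK.objects.get(pk=pk_id)
        results = pk.get_update()
        return results
    except urllib2.HTTPError, exc:
        # Re-queue this task to run again in 600 seconds (10 minutes).
        print "Let's retry in a few minutes."
        update_data.retry(exc=exc, countdown=600)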

The exception:

[2011-10-07 11:35:53,594: ERROR/MainProcess] Task report.tasks.update_data[1babd4e3-45eb-4fa3-a497-68b67bb4a6df] raised exception: IOError()
Traceback (most recent call last):
  File "/home/prj/prj_env/lib/python2.6/site-packages/celery/execute/trace.py", line 36, in trace
    return cls(states.SUCCESS, retval=fun(*args, **kwargs))
  File "/home/prj/prj_env/lib/python2.6/site-packages/celery/app/task/__init__.py", line 232, in __call__
    return self.run(*args, **kwargs)
  File "/home/prj/prj_env/lib/python2.6/site-packages/celery/app/__init__.py", line 172, in run
    return fun(*args, **kwargs)
  File "/home/prj/prj/report/tasks.py", line 109, in update_data
    update_data.retry(exc=exc, countdown=600)
  File "/home/prj/prj_env/lib/python2.6/site-packages/celery/app/task/__init__.py", line 520, in retry
    self.name, options["task_id"], args, kwargs))
HTTPError

RabbitMQ logs:

=INFO REPORT==== 7-Oct-2011::15:35:43 ===
closing TCP connection <0.4294.17> from 10.254.122.225:59704

=WARNING REPORT==== 7-Oct-2011::15:35:43 ===
exception on TCP connection <0.4330.17> from 10.254.122.225:59715
connection_closed_abruptly

=INFO REPORT==== 7-Oct-2011::15:35:43 ===
closing TCP connection <0.4330.17> from 10.254.122.225:59715

=WARNING REPORT==== 7-Oct-2011::15:35:49 ===
exception on TCP connection <0.4313.17> from 10.254.122.225:59709
connection_closed_abruptly

=INFO REPORT==== 7-Oct-2011::15:35:49 ===
closing TCP connection <0.4313.17> from 10.254.122.225:59709

=WARNING REPORT==== 7-Oct-2011::15:35:49 ===
exception on TCP connection <0.4350.17> from 10.254.122.225:59720
connection_closed_abruptly

=INFO REPORT==== 7-Oct-2011::15:35:49 ===
closing TCP connection <0.4350.17> from 10.254.122.225:59720

=INFO REPORT==== 7-Oct-2011::15:36:22 ===
accepted TCP connection on [::]:5672 from 10.255.199.63:50526

=INFO REPORT==== 7-Oct-2011::15:36:22 ===
starting TCP connection <0.4501.17> from 10.255.199.63:50526

Any ideas why this might be happening?

Thanks!

erikcw
  • Looks like it's raising `HTTPError`, not `IOError`? – asksol Oct 08 '11 at 11:01
  • @asksol I'm having the same issue. It's an HTTPError, but the email notifications Celery sends say: "Task some_task with id 7e1d88a9-ef58-4b30-b523-c879d9e90402 raised exception: 'IOError()'" – armonge Nov 14 '11 at 21:24
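
One detail that may help when reading the mismatch raised in the comments: `HTTPError` is itself a subclass of `IOError` in the standard library, so the two names are not unrelated. The sketch below only checks that class relationship; whether Celery reports the parent class name for this reason is an assumption, not something confirmed in this thread. The helper `is_kind_of` is a hypothetical name used for illustration.

```python
try:
    from urllib.error import HTTPError, URLError  # Python 3
except ImportError:
    from urllib2 import HTTPError, URLError  # Python 2, as in the question


def is_kind_of(exc_class, parent_class):
    """Return True if exc_class would be caught by `except parent_class`."""
    return issubclass(exc_class, parent_class)


# HTTPError derives from URLError, which derives from IOError
# (IOError is an alias of OSError on Python 3).
print(is_kind_of(HTTPError, URLError))
print(is_kind_of(HTTPError, IOError))
```

Both checks print `True`, which means an `except IOError` clause, or any code that falls back to a parent class, would also match an `HTTPError`.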

2 Answers


Maybe save each task in the database and retry it if no result is received for some time? Or maybe the dispatcher has its own persistent storage? And what happens if a worker thread crashes while receiving the task or while executing it?

Retry Lost or Failed Tasks (Celery, Django and RabbitMQ)

Synxmax

max_retries in Celery defaults to 3, so if the same task fails 3 times in a row (which may be what happens about 20% of the time), retry will re-raise the original exception instead of scheduling another attempt.
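
The retry-budget behaviour described above can be sketched in plain Python. This is not Celery code, only an illustration of the idea: `TransientError`, `run_with_retries` and `make_flaky` are hypothetical names, and the sketch assumes one initial attempt plus `max_retries` retries. The exact count at which Celery gives up is not verified here.

```python
class TransientError(Exception):
    """Stand-in for urllib2.HTTPError in this sketch."""


def run_with_retries(func, max_retries=3):
    """Call func until it succeeds; re-raise its error once retries run out.

    Returns a (result, attempts) tuple on success.
    """
    attempts = 0
    while True:
        attempts += 1
        try:
            return func(), attempts
        except TransientError:
            # attempts - 1 retries have been used so far.
            if attempts - 1 >= max_retries:
                raise  # budget exhausted: the original exception surfaces


def make_flaky(failures):
    """Return a callable that fails `failures` times, then returns 'ok'."""
    state = {"left": failures}

    def flaky():
        if state["left"] > 0:
            state["left"] -= 1
            raise TransientError("simulated HTTP failure")
        return "ok"

    return flaky


# Three failures fit inside a budget of three retries.
print(run_with_retries(make_flaky(3)))

# A fourth consecutive failure exceeds the budget and is re-raised.
try:
    run_with_retries(make_flaky(4))
except TransientError as exc:
    print("gave up: %s" % exc)
```

In Celery itself the budget is the task's `max_retries` option, which `retry()` also accepts as a keyword argument, so raising it is one way to test whether the intermittent failures are simply exhausted retries.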

Christian Genne