Persistent Long Running Tasks in Celery

Question

I'm working on a Python based system, to enqueue long running tasks to workers.

The tasks originate from an outside service that generate a "token", but once they're created based on that token, they should run continuously, and stopped only when explicitly removed by code.
The task starts a WebSocket and loops on it. If the socket is closed, it reopens it. Basically, the task shouldn't reach conclusion.

My goals in architecting this solutions are:

When gracefully restarting a worker (for example to load new code), the task should be re-added to the queue, and picked up by some worker.
Same thing should happen when ungraceful shutdown happens.
2 workers shouldn't work on the same token.
Other processes may create more tasks that should be directed to the same worker that's handling a specific token. This will be resolved by sending those tasks to a queue named after the token, which the worker should start listening to after starting the token's task. I am listing this requirement as an explanation to why a task engine is even required here.
Independent servers, fast code reload, etc. - Minimal downtime per task.

All our server side is Python, and looks like Celery is the best platform for it. Are we using the right technology here? Any other architectural choices we should consider?

Thanks for your help!

score 1 · Accepted Answer · edited Jun 20 '20 at 09:12

1

According to the docs

When shutdown is initiated the worker will finish all currently executing tasks before it actually terminates, so if these tasks are important you should wait for it to finish before doing anything drastic (like sending the KILL signal).

If the worker won’t shutdown after considerate time, for example because of tasks stuck in an infinite-loop, you can use the KILL signal to force terminate the worker, but be aware that currently executing tasks will be lost (unless the tasks have the acks_late option set).

You may get something like what you want by using retry or acks_late

Overall I reckon you'll need to implement some extra application-side job control, plus, maybe, a lock service.

But, yes, overall you can do this with celery. Whether there are better technologies... that's out of the scope of this site.

edited Jun 20 '20 at 09:12

Community

1
1

answered Aug 24 '15 at 10:46

scytale

12,346
3
32
46

I'm using on the `acks_late=True, track_started=True` on the "process_token" task. And I'll have the safeguards in place in the application level to make sure there won't be 2 tasks on the same token. – yulkes Aug 24 '15 at 15:17
With the acks_late and track_started, I'm getting to a point where the process_token task reaches "failure" state, with reason `WorkerLostError('Worker exited prematurely: signal 15 (SIGTERM).',) ` That's not too bad, actually, because it's still in the queue. Not I just need another worker to pick it up. – yulkes Aug 24 '15 at 15:19

Persistent Long Running Tasks in Celery

1 Answers1