6

I've got a bunch of Celery tasks, all connected using a canvas chain.

from celery import shared_task
from celery.utils.log import get_task_logger

logger = get_task_logger(__name__)

@shared_task(bind=True)
def my_task_A(self):
    try:
        logger.debug('running task A')
        # do something
    except Exception:
        # run common cleanup function
        pass

@shared_task(bind=True)
def my_task_B(self):
    try:
        logger.debug('running task B')
        # do something else
    except Exception:
        # run common cleanup function
        pass

...

So far so good. The problem is that I'm looking for the best practice when it comes to using a common utility function like this:

def cleanup_and_notify_user(task_data):
    logger.debug('task failed')
    # send email
    # delete folders
    ...

What's the best way to do that without the tasks blocking? For example, can I just replace run common cleanup function with a call to cleanup_and_notify_user(task_data)? And what would happen if multiple tasks from multiple workers attempted to call that function at the same time?
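
In other words, something like this (just a sketch of what I mean; the task_data contents are made up):

@shared_task(bind=True)
def my_task_A(self):
    try:
        logger.debug('running task A')
        # do something
    except Exception:
        # call the common cleanup directly from the failing task
        cleanup_and_notify_user({'task_id': self.request.id})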

Does each worker get its own copy? I'm apparently a bit confused about a few of the concepts here. Any help is much appreciated.

Thank you all in advance.

stratis
  • Tasks run in their own processes, so they'll be isolated enough, Python-wise. As to how concurrently modifying external resources behaves... hard to say. You may want to add your OS to the question and think about external sync/exclusion mechanisms, like creating a working marker file or OS multiprocess signalling mechanisms. – JL Peyret Apr 25 '15 at 16:09
  • @JLPeyret I only care about some sort of mechanism to "coordinate" access to the fallback function once a task fails... Btw, I'm using an Ubuntu machine. – stratis Apr 25 '15 at 16:11
  • It doesn't matter at the function level; it does matter at the level of what the function is modifying. I.e., functions X and Y both deleting folders could be a problem, regardless of them being different functions. And since you are NOT on threads here, beware of Python advice given in a threading context. – JL Peyret Apr 25 '15 at 16:15
  • I am not that clever about locks and the like, so I don't want to induce you into errors. However, Python has an alternative concurrency mechanism called multiprocessing, which is not thread-based and therefore will be Celery-like for your context. This search, [python] lock multiprocess, might shed some light. See also http://stackoverflow.com/questions/28670524/how-to-make-two-tasks-mutually-exclusive-in-celery which is another approach - telling Celery NOT to run certain tasks (the same one, in your case) at the same time - I think that's probably the easiest way. – JL Peyret Apr 25 '15 at 16:29
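
A rough sketch of the mutual-exclusion idea from the comments, assuming the workers share a reachable Redis instance (the connection details and the cleanup-lock key name are placeholders, not anything prescribed by Celery):

import redis

# One client per worker process; host/port are assumptions for this sketch.
redis_client = redis.Redis(host='localhost', port=6379)

def cleanup_exclusively(task_data):
    # redis-py's lock is visible to every worker process, so only one
    # task at a time runs the shared cleanup; timeout lets the lock
    # expire if a worker dies while holding it, and blocking_timeout
    # bounds how long the other workers wait for their turn.
    with redis_client.lock('cleanup-lock', timeout=60, blocking_timeout=30):
        cleanup_and_notify_user(task_data)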

1 Answer

2

From inside a Celery task you are just writing Python, so each task runs in its own process and the function is simply called within that process, as in any basic OOP logic. Of course, if this cleanup function attempts to modify shared resources like system folders or database rows, you end up with concurrent access to external resources, which you need to solve in some other way; for example, in the case of the filesystem you could create a sandbox for each task. Hope this helps.
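
A minimal sketch of the per-task sandbox idea, assuming each task confines its file work to its own scratch directory (the directory naming and the task_data contents are illustrative):

import shutil
import tempfile

from celery import shared_task

@shared_task(bind=True)
def my_task_A(self):
    # Each task gets a private scratch directory, so concurrent cleanups
    # from different workers never delete each other's folders.
    sandbox = tempfile.mkdtemp(prefix='task-%s-' % self.request.id)
    try:
        logger.debug('running task A')
        # do something, writing only inside sandbox
    except Exception:
        cleanup_and_notify_user({'task_id': self.request.id,
                                 'sandbox': sandbox})
    finally:
        shutil.rmtree(sandbox, ignore_errors=True)

Since each failing task only ever hands its own sandbox to cleanup_and_notify_user, the filesystem part of the cleanup needs no cross-worker locking.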

Mauro Rocco