14

I need a framework which will allow me to do the following:

  • Dynamically define tasks (I'll read an external configuration file and create the tasks/jobs from it; a task would, for instance, spawn an external command)

  • Provide a way of specifying dependencies on existing tasks (e.g. task A will be run after task B is finished)

  • Be able to run tasks in parallel in multiple processes if the execution order allows it (i.e. no task interdependencies)

  • Allow a task to depend on some external event (don't know exactly how to describe this, but some tasks finish and they will produce results after a while, like a background running job; I need to specify some of the tasks to depend on this background-job-completed event)

  • Undo/Rollback support: if one task fails, try to undo everything that has been executed before (I don't expect this to be implemented in any framework, but I guess it's worth asking)

So, obviously, this looks more or less like a build system, but I can't seem to find one that lets me create tasks dynamically; most of the ones I've seen expect the tasks to be defined up front in the "Makefile".

Any ideas?

Unknown

3 Answers

8

I've been doing a little more research and I've stumbled upon doit which provides the core functionality I need, without being overkill (not saying that Celery wouldn't have solved the job, but this does it better for my use case).
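
For illustration, here is a minimal sketch of how tasks might be generated dynamically in a doit `dodo.py`. The `TASKS` dictionary is hypothetical and stands in for the parsed external configuration file. The task-creator function yields one sub-task dictionary per entry, using doit's `name`, `actions`, and `task_dep` keys.

```python
# dodo.py -- sketch only; TASKS is a hypothetical stand-in for a parsed
# configuration file (command strings and dependency names are made up).
TASKS = {
    "fetch": {"cmd": "echo fetching", "after": []},
    "build": {"cmd": "echo building", "after": ["fetch"]},
    "report": {"cmd": "echo reporting", "after": ["build"]},
}


def task_run():
    """Yield one doit sub-task per configured command."""
    for name, spec in TASKS.items():
        yield {
            "name": name,
            # A string action is executed as a shell command.
            "actions": [spec["cmd"]],
            # Sub-tasks are addressed as "<basename>:<name>".
            "task_dep": ["run:" + dep for dep in spec["after"]],
        }
```

Running `doit` in the directory containing this file should execute the commands in dependency order, and `doit -n 2` asks for two worker processes where the dependency graph allows it.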

j13r
Unknown
1

Another option is to use make.

  • Write a Makefile manually, or let a Python script generate it.
  • Use meaningful intermediate output files as the stages.
  • Run make, which then launches the processes. Each process would be a Python (build) script with parameters that tell it which files to work on and which task to do.
  • Parallel execution is supported with -j.
  • make can also delete the output files of failed tasks (GNU make does this when the Makefile declares the .DELETE_ON_ERROR special target; by default it only does so when interrupted by a signal).

This circumvents some of the Python parallelisation problems (GIL, serialisation). Obviously, it is only straightforward on *nix platforms.
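
As a rough sketch of the "let a Python script write it" step, the snippet below renders Makefile text from a table of rules. The file names and the `build.py` script are hypothetical placeholders, and `render_makefile` is a name chosen for this example only.

```python
# Sketch: render a Makefile from a rule table. The targets, inputs and
# the build.py script are hypothetical placeholders.
RULES = {
    "stage1.out": {
        "deps": ["input.txt"],
        "cmd": "python build.py stage1 input.txt stage1.out",
    },
    "stage2.out": {
        "deps": ["stage1.out"],
        "cmd": "python build.py stage2 stage1.out stage2.out",
    },
}


def render_makefile(rules):
    """Return Makefile text with one rule per target."""
    # .DELETE_ON_ERROR makes GNU make remove the target of a failed recipe.
    lines = [".DELETE_ON_ERROR:", "", "all: " + " ".join(rules), ""]
    for target, rule in rules.items():
        lines.append(f"{target}: {' '.join(rule['deps'])}".rstrip())
        lines.append("\t" + rule["cmd"])  # recipe lines must start with a tab
        lines.append("")
    return "\n".join(lines)


makefile_text = render_makefile(RULES)
```

Writing `makefile_text` to a file named `Makefile` and running `make -j4` would then let make schedule independent stages in parallel.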

j13r
  • A more modern alternative, which even comes with a Python API, might be [Ninja](https://pypi.org/project/ninja/) – Martin Cejp Apr 06 '22 at 18:01
0

AFAIK, there is no framework in Python which does exactly what you describe. So your options are either to build something of your own or to bend some of your requirements and model them with an existing tool, which smells like Celery.

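  • You may have a celery task which reads a configuration file containing the source code of some Python functions, then use exec (or eval for a single expression) to run them. Note that ast.literal_eval only parses literals and cannot execute code, and that exec/eval should only ever be used on configuration you trust.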

  • Celery provides a way to define subtasks (dependencies between tasks), so if you are aware of your dependencies, you can model them accordingly.

  • Provided that you know the execution order of your tasks you can route them to as many worker machines as you want.

  • You can periodically poll this background job's result and then start your tasks that are dependent on it.

  • Undo/Rollback: this might be tricky and depends on what you want to undo; results? state?
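
The polling idea in the list above can be sketched without any framework. This is only an illustration: `wait_until` and its parameters are names invented for this example, and the caller supplies the check that decides whether the background job has finished.

```python
import time


def wait_until(is_done, interval=0.5, timeout=60.0):
    """Poll is_done() until it returns True or `timeout` seconds elapse.

    Returns True if the condition was met, False on timeout.
    """
    deadline = time.monotonic() + timeout
    while True:
        if is_done():
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval)
```

A dependent task could then be started only after something like `wait_until(lambda: os.path.exists("job.done"))` returns True, where the marker file name is again just an assumption.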

hymloth
  • I was thinking about celery, but it's kinda overkill for my needs; I don't need to do distributed task processing, my tasks consist of running some commands on the local machine and getting their output. It's exactly like a build process where you compile files that have dependencies among each other. – Unknown Mar 12 '12 at 10:36
  • As for the undo part, I was thinking of something like a class Task with methods do() and undo(), and I will deal with the logic; the framework would only need to keep track of the operations it performed and call undo() on them – Unknown Mar 12 '12 at 10:41
  • It seems like you must implement something on your own. You can use the multiprocessing module, but it will be a pain in the ass to model your dependencies and implement rollbacks (probably storing intermediate results in a DB etc). I insist on celery. It's not overkill and it will ease many synchronization/dependency problems you might have. – hymloth Mar 12 '12 at 13:16
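
For reference, the do()/undo() idea from the comments can be sketched with the standard library alone (Python 3.9+ for `graphlib`). This is a simplified illustration, not a full framework: the `Task` class and `run_all` are hypothetical names, threads are used on the assumption that each task mostly waits on an external command, and rollback simply calls `undo()` in reverse completion order.

```python
from concurrent.futures import FIRST_COMPLETED, ThreadPoolExecutor, wait
from graphlib import TopologicalSorter  # Python 3.9+


class Task:
    """Hypothetical task interface: do() performs the work, undo() reverts it."""

    def __init__(self, name, log):
        self.name = name
        self.log = log  # shared list, used here only to record what happened

    def do(self):
        self.log.append("do " + self.name)

    def undo(self):
        self.log.append("undo " + self.name)


def run_all(tasks, deps, max_workers=4):
    """Run tasks in dependency order, in parallel where possible.

    tasks: dict of name -> Task; deps: dict of name -> set of prerequisites.
    Returns True on success; on failure, undoes completed tasks and returns False.
    """
    sorter = TopologicalSorter({name: deps.get(name, ()) for name in tasks})
    sorter.prepare()  # raises graphlib.CycleError on circular dependencies
    completed = []  # names in completion order, needed for rollback
    running = {}  # future -> task name
    failed = False
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        while sorter.is_active() and not failed:
            for name in sorter.get_ready():
                running[pool.submit(tasks[name].do)] = name
            finished, _ = wait(running, return_when=FIRST_COMPLETED)
            for future in finished:
                name = running.pop(future)
                if future.exception() is not None:
                    failed = True
                else:
                    completed.append(name)
                    sorter.done(name)
    # Tasks that were still in flight when a failure was detected.
    for future, name in running.items():
        if future.exception() is None:
            completed.append(name)
    if failed:
        for name in reversed(completed):
            tasks[name].undo()
        return False
    return True
```

A real `do()` would typically call `subprocess.run(...)` with the configured command; because a task only completes after its prerequisites, undoing in reverse completion order reverts dependents before the tasks they depend on.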