
I have a partially ordered set of tasks: for each task, all of the tasks that are strictly before it in the partial order must finish before it can start. I want to run tasks that are not related (neither before nor after one another) concurrently to try to minimise the total execution time, but without starting a task before its dependencies have completed.

The tasks will run as (non-Perl) child processes.

How should I approach solving a problem like this using Perl? What concurrency control facilities and data structures are available?

lexicalscope
A cheat: you can also write a Makefile to describe the dependencies and run e.g. `make -j 4` for a maximum of 4 concurrent workers. – Dallaylaen Nov 25 '11 at 14:52

2 Answers


I would use a hash of arrays. For each task, all of its prerequisites are listed in the corresponding array:

$prereq{task1} = [qw/task2 task3 task4/];

I would keep completed tasks in a different hash, and then just check that every prerequisite is in it before running a task:

my @prereq = @{ $prereq{$task} };
if (@prereq == grep exists $completed{$_}, @prereq) {
    run($task);
}
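
To show how that check could drive a whole run, here is a minimal sketch. The task names, the `ready_tasks` helper and the `%started` hash are hypothetical additions for illustration, and no processes are actually launched: each "wave" is just the set of tasks that could be started together.

```perl
use strict;
use warnings;

# Hypothetical dependency table: task => tasks that must finish first.
my %prereq = (
    task1 => [qw/task2 task3/],
    task2 => [qw/task4/],
    task3 => [qw/task4/],
    task4 => [],
);

# Return the tasks that have not been started yet and whose
# prerequisites are all present in %$completed (sorted by name).
sub ready_tasks {
    my ($prereq, $completed, $started) = @_;
    return grep {
        my $task = $_;
        my @p    = @{ $prereq->{$task} };
        !$started->{$task} && @p == grep { exists $completed->{$_} } @p;
    } sort keys %$prereq;
}

my (%completed, %started, @waves);
while (my @ready = ready_tasks(\%prereq, \%completed, \%started)) {
    $started{$_} = 1 for @ready;
    push @waves, [@ready];
    # A real scheduler would launch these as child processes here and
    # mark each one completed only when it exits; this sketch pretends
    # the whole wave finishes at once.
    $completed{$_} = 1 for @ready;
}

print join(' ', @$_), "\n" for @waves;
```

With the table above the tasks come out in three waves: task4 first, then task2 and task3 together, then task1. Rescanning every task on each pass is quadratic in the worst case, which is fine for small task sets.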
choroba

Looks like a full solution is NP-complete.

As for a partial solution, I would use some form of reference counting to determine which jobs are ready to run, Forks::Super::Job to run the background jobs and check their statuses, and POSIX::pause to sleep when the maximum number of jobs has been spawned.

No threads are involved since you're already dealing with separate processes.
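
A rough sketch of the reference-counting idea, assuming a Unix-like system. Note that it swaps in core `fork`/`exec`/`wait` instead of Forks::Super::Job and POSIX::pause, so it is not that module's API; the task names, the placeholder `sh -c 'exit 0'` commands and `$max_jobs` are all made up for illustration.

```perl
use strict;
use warnings;

# Hypothetical dependency table and placeholder commands (assumptions).
my %prereq = (
    task1 => [qw/task2 task3/],
    task2 => [qw/task4/],
    task3 => [qw/task4/],
    task4 => [],
);
my %command  = map { $_ => [ 'sh', '-c', 'exit 0' ] } keys %prereq;
my $max_jobs = 2;    # upper bound on concurrently running children

# $pending{$t}    = number of prerequisites of $t that are not finished yet
# $dependents{$p} = tasks that are waiting on $p
my (%pending, %dependents);
for my $task (keys %prereq) {
    $pending{$task} = @{ $prereq{$task} };
    push @{ $dependents{$_} }, $task for @{ $prereq{$task} };
}

my @ready = grep { $pending{$_} == 0 } sort keys %pending;
my (%running, @finished);    # %running maps pid => task name

while (@ready || %running) {
    # Start ready tasks until the job limit is reached.
    while (@ready && keys(%running) < $max_jobs) {
        my $task = shift @ready;
        my $pid  = fork();
        die "fork failed: $!" unless defined $pid;
        if ($pid == 0) {
            # Child: replace this process with the task's command.
            exec @{ $command{$task} } or die "exec failed: $!";
        }
        $running{$pid} = $task;
    }

    # Block until any child exits, then release its dependents.
    my $pid = wait();
    last if $pid == -1;
    my $task = delete $running{$pid};
    next unless defined $task;
    die "$task failed with status $?" if $? != 0;
    push @finished, $task;
    for my $dep (@{ $dependents{$task} || [] }) {
        push @ready, $dep if --$pending{$dep} == 0;
    }
}

die "dependency cycle: some tasks never became ready\n"
    if @finished != keys %prereq;
print "finished: @finished\n";
```

Each task's counter drops by one whenever one of its prerequisites exits, and the task is queued once the counter reaches zero, so nothing is rescanned. The `@ready` queue here is plain first-in, first-out; a priority heuristic would replace the `shift`.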

Read the first link for possible algorithms/heuristics to determine runnable jobs' priorities.

Dallaylaen