
I have to run a video encoding program with different quantization parameters (QP), where QP goes from 0 to 51. In my Perl script I iterate over this parameter and execute a command line. The command line is given as:

TAppEncoder encoder_intra_main_rext.cfg -i BSEQ.RAW -b BSEQ_1.bin -o /dev/null -qp 1 -wdt 7811 -hgt 7911 -fr 1 -fs 0 -f 2 --InputBitDepth=16 --OutputBitDepth=16 --InternalBitDepth=16 --InputChromaFormat=400 --ConformanceMode=1 --SEIDecodedPictureHash >> BSEQ_1.txt

In each iteration I change ONLY the QP. Now, when my Perl script executes the line above, it waits for it to finish and then proceeds to the next iteration of the loop (e.g. qp=2).

Also, the Perl script is called by a top-level shell script:

test.sh ---> test.pl ---> command1 with qp=1
                     ---> command2 with qp=2
                     ---> command3 with qp=3
                     ---> command4 with qp=4
                     ---> until the end of the for loop

I was wondering how to run two (or more) processes in parallel. For example, to run qp=1 and immediately after it qp=2, without waiting for qp=1 to finish. And then, when one of those two is done (no matter whether qp=1 or qp=2 finished first), to run qp=3, and so on.

So basically, I do not want to run the Perl script itself in parallel; I do not need multiple instances of the Perl script. I need the command within the script (which is part of the loop) to run in parallel. However, if there is another way to accomplish this, let me know.

The relevant part of the code is below; right now it runs one QP at a time. I want two running in parallel at all times: once one is done, move on to the next, so that two processes are always running.

I'm running the scripts on Linux Mint, on a single computer (I do not have a cluster). The idea is to run it on two cores.

Any idea how to accomplish that, or at least where to start? Thanks.

    $QP_end = $Configuration->{nb_QPs} - 1;
    foreach $QP_index (0 .. $QP_end)
    {
      $QP = $Configuration->{QP_list}[$QP_index];
      print($QP, " ");
      set_command_line(); # here I change the QP to build the new command line
      @RunCommand = ($command_line);
      `@RunCommand`;      # backticks block until the command finishes
    }
MilosR.
  • Why are you using Perl for this? The task would be trivial in shell script. Also, do you require the output to be collected in a particular order, or is it okay for the parallel tasks to write to the output file in whichever order they happen to finish? – tripleee Jan 27 '19 at 14:35
  • Possible duplicate of [How can I run a system command in Perl asynchronously?](https://stackoverflow.com/questions/1752357/how-can-i-run-a-system-command-in-perl-asynchronously) – Stefan Becker Jan 27 '19 at 14:37
  • The order does not matter. I use Perl because I inherited the code from the previous person working on this. The script itself has ~1300 lines (where we have to set and check many kinds of configurations), and rewriting it in shell would take me (a newbie in this) a lot of time. – MilosR. Jan 27 '19 at 14:39
    Possible duplicate of [How can I run a system command in Perl asynchronously?](https://stackoverflow.com/questions/1752357/how-can-i-run-a-system-command-in-perl-asynchronously) – tripleee Jan 27 '19 at 14:41
  • @tripleee But I run the Perl script using the top-level bash script (test.sh). From test.sh I run the Perl script (only once, no for loops) that runs the command I mentioned before. Does that mean something to you? Can that be parallelized at the shell-script level? – MilosR. Jan 27 '19 at 14:51
  • @tripleee could you explain the possible duplicate? I do not see how this is the same problem. – MilosR. Jan 27 '19 at 15:11
  • The proposed duplicate asks about timeouts, but several of the answers explain exactly how to run things in parallel (aka asynchronously). – tripleee Jan 27 '19 at 15:27

2 Answers


I have been using code like this for years:

#!/usr/bin/env perl

use strict;
use warnings 'FATAL' => 'all';
use Cwd 'getcwd';
use feature 'say';
my $TOP_DIRECTORY = getcwd();
use autodie qw(:all);

sub execute {
    my $command = shift;
    print "Executing Command: $command\n";
    if (system($command) != 0) {
        my $fail_filename = "$TOP_DIRECTORY/$0.fail";
        open my $fh, '>', $fail_filename;
        print $fh "$command failed.\n";
        close $fh;
        print "$command failed.\n";
        die;
    }
}

use Parallel::ForkManager;
sub run_parallel {
    my $command_array_reference = shift;
    unless ((ref $command_array_reference) =~ m/ARRAY/) {
        say "run_parallel requires an array reference as input.";
        die;
    }
    my $manager = Parallel::ForkManager->new(2); # at most 2 concurrent jobs
    foreach my $command (@{ $command_array_reference }) {
        $manager->start and next;
        execute( $command );
        $manager->finish;
    }
    $manager->wait_all_children; # necessary after each list of jobs
}

Call the subroutine run_parallel above with a reference to your series of commands @cmd, i.e. run_parallel(\@cmd).

You can install Parallel::ForkManager from CPAN, e.g. sudo cpanm Parallel::ForkManager, or in many other ways.
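For the question's use case, a minimal self-contained usage sketch could look like the following. It assumes Parallel::ForkManager is installed; the QP list and the echo placeholder commands are stand-ins for the real $Configuration->{QP_list} and the TAppEncoder invocation from the question:

```perl
#!/usr/bin/env perl
use strict;
use warnings;
use Parallel::ForkManager;

# Hypothetical QP list standing in for $Configuration->{QP_list}
my @QP_list = (0 .. 5);

# Build one command string per QP; these echo commands are placeholders
# for the real TAppEncoder command line with -qp varying
my @commands = map { "echo qp=$_ > qp_$_.out" } @QP_list;

my $manager = Parallel::ForkManager->new(2);    # at most 2 jobs at once
for my $cmd (@commands) {
    $manager->start and next;                   # parent: launch next job
    system($cmd) == 0 or die "$cmd failed";     # child: run the command
    $manager->finish;                           # child exits here
}
$manager->wait_all_children;                    # block until every job is done
```

With a pool size of 2, as soon as one encoder run finishes the next QP is started, so two processes are kept busy until the list is exhausted.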

con

What about a naive approach based on fork()?

# --- Prepare job queues ---
my @jobs = ( ['cmd01'..'cmd10'], ['cmd11'..'cmd20'] ) ;

# --- If fork returns a PID we're in the parent proc ---
# --- otherwise (0) we're in the child proc ---
my $pid = fork // die "fork failed: $!";
worker( $pid ? $jobs[0] : $jobs[1] );

# --- Parent reaps the child so both queues finish before exit ---
waitpid($pid, 0) if $pid;

# --- Worker --- 
sub worker {
    # --- Do jobs ---
    foreach my $cmd ( @{ $_[0] } ){
        # --- Run the command; die if it exits non-zero ---
        # --- (note: $! is only set when system() fails to launch, ---
        # --- so report the child's exit status $? instead) ---
        system($cmd) == 0 or die "$cmd exited with status $?";
    }
}

The main concept is to split your job queue into chunks, then fork() the process to get a parallelism of 2 (or whatever you want); each process then works through its own job queue.

This is a working but deliberately simplistic example with a parallelism of 2, as you requested. If you need more parallel processes, you have to split your job queue according to the desired parallelism and call fork() parallelism - 1 times.

Since each forked process runs in its own address space, the processes are unaware of each other. This means that, depending on your needs, you may have to implement an IPC mechanism to control execution flow and dependencies, but in your case I think that is not necessary.
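To generalize this to N workers using only core Perl (no CPAN modules), one possible sketch is the following. The round-robin queue split, the worker count of 3, and the echo placeholder jobs are my own assumptions, not part of the answer above:

```perl
#!/usr/bin/env perl
use strict;
use warnings;

my $parallelism = 3;
my @jobs = map { "echo job$_ > job$_.out" } 1 .. 9;  # placeholder commands

# --- Split the job list round-robin into one queue per worker ---
my @queues;
push @{ $queues[ $_ % $parallelism ] }, $jobs[$_] for 0 .. $#jobs;

# --- Fork parallelism - 1 children; the parent keeps queue 0 ---
my @pids;
for my $i (1 .. $parallelism - 1) {
    my $pid = fork // die "fork failed: $!";
    if ($pid == 0) {           # child: work through its queue, then exit
        worker($queues[$i]);
        exit 0;
    }
    push @pids, $pid;
}
worker($queues[0]);            # parent works through its own queue
waitpid($_, 0) for @pids;      # then reaps every child

sub worker {
    for my $cmd (@{ $_[0] }) {
        system($cmd) == 0 or die "$cmd exited with status $?";
    }
}
```

Note that a static split like this keeps N processes busy only while every queue still has work; if one queue drains early, that worker sits idle, which is why a shared pool such as Parallel::ForkManager is usually the better fit when job durations vary.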

Hannibal