
I have code that works as expected, but I am having difficulty storing the output of each executed command from $result = $ssh->capture($command_to_execute);. The script uses the Parallel::ForkManager module to run the commands on different hosts, using different files as input parameters.

Once the command execution is done, I want the resulting output stored in the $result variable. It should append each host's result to the same variable so that at the end I can process the accumulated values in $result. I am using .= to append the data to $result, but it doesn't seem to work.

Pasting my code here for reference:

.
.
.
my $result;
my $pm = Parallel::ForkManager->new(5);

DATA_LOOP:
foreach my $n (1..$num_buckets) {
        my $pid = $pm->start and next DATA_LOOP;

        $command_to_execute = $my_command." ".$Files{$n};
        my $ssh = SSH_Connection( $list_of_ips[$n-1], 'username', 'pass' );
        $result = $ssh->capture($command_to_execute);
        $result .= "Result from File:$Files{$n} and Host:$list_of_ips[$n-1] is $result\n"; 
        print "Result: INSIDE: $result";
        $pm->finish;
}
$pm->wait_all_children;
print "Result: OUTSIDE: $result";
print "Done\n";

sub SSH_Connection {
    my ( $host, $user, $passwd ) = @_;
    my $ssh = Net::OpenSSH->new($host,
                                user => $user,
                                password => $passwd,
                                master_opts => [-o => "StrictHostKeyChecking=no"]
    );
    $ssh->error and die "Couldn't establish SSH connection: ". $ssh->error;

    return $ssh;
}

The print "Result: INSIDE: $result"; statement prints the results one by one, but print "Result: OUTSIDE: $result"; prints nothing, even though it should contain the combined results collected inside the loop.

vkk05
  • The outside `$result` is referring to the parent process's `$result` not the child `$result`. I think you need to pass the result back to the parent using [`run_on_finish()`](https://metacpan.org/pod/Parallel::ForkManager#RETRIEVING-DATASTRUCTURES-from-child-processes) – Håkon Hægland Dec 27 '19 at 17:42
  • See [this post](https://stackoverflow.com/a/41891334/4653379) for an answer (for example) – zdim Dec 27 '19 at 17:58

2 Answers


As shown in the documentation of Parallel::ForkManager, to get a result from a child, you need to supply a reference to the result as another parameter to finish.

$pm->finish(0, [$Files{$n}, $list_of_ips[$n-1], $result]);

Use run_on_finish to gather the results:

my $result;
$pm->run_on_finish( sub {
    my ($pid, $exit_code, $ident, $exit_signal, $core_dump, $single_result) = @_;
    $result .= "Result from File: $single_result->[0] and Host: $single_result->[1]"
             . " is $single_result->[2]\n";
});
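
Putting those two pieces together with the code from the question, the whole pattern might look like the sketch below. The callback is registered before the loop so that every child's data is collected as it finishes (the SSH details are taken unchanged from the question):

```perl
my $result;
my $pm = Parallel::ForkManager->new(5);

# Register the collector first; the parent runs it once per exiting child.
$pm->run_on_finish( sub {
    my ($pid, $exit_code, $ident, $exit_signal, $core_dump, $single_result) = @_;
    $result .= "Result from File: $single_result->[0] and Host: $single_result->[1]"
             . " is $single_result->[2]\n";
});

DATA_LOOP:
foreach my $n (1..$num_buckets) {
    my $pid = $pm->start and next DATA_LOOP;
    my $command_to_execute = $my_command . " " . $Files{$n};
    my $ssh = SSH_Connection( $list_of_ips[$n-1], 'username', 'pass' );
    my $res = $ssh->capture($command_to_execute);
    # The array ref is serialized in the child and handed back to the parent.
    $pm->finish(0, [$Files{$n}, $list_of_ips[$n-1], $res]);
}
$pm->wait_all_children;
print "Result: OUTSIDE: $result";
```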
choroba

Each time you run $pm->start, you are forking a new process to run the code until $pm->finish. This forked process cannot affect the parent process in any way except by the mechanism Parallel::ForkManager provides to send data back to the parent. This mechanism is described at https://metacpan.org/pod/Parallel::ForkManager#RETRIEVING-DATASTRUCTURES-from-child-processes.

$pm->run_on_finish(sub {
  my ($pid, $exit_code, $ident, $exit_signal, $core_dump, $data) = @_;
  my $result = $$data;
  ...
});

DATA_LOOP:
foreach my $n (1..$num_buckets) {
        my $pid = $pm->start and next DATA_LOOP;
        ...
        $pm->finish(0, \$result);
}
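
As a self-contained illustration of the mechanism (no SSH involved; the child work here is just a placeholder string, and Parallel::ForkManager must be installed):

```perl
use strict;
use warnings;
use Parallel::ForkManager;

my $pm = Parallel::ForkManager->new(3);
my $total = '';

# The parent collects each child's payload as that child exits.
$pm->run_on_finish(sub {
    my ($pid, $exit_code, $ident, $exit_signal, $core_dump, $data) = @_;
    $total .= $$data if defined $data;   # $data is the \$result reference
});

for my $n (1 .. 5) {
    $pm->start and next;                 # parent continues the loop; child falls through
    my $result = "child $n done\n";      # stand-in for $ssh->capture(...)
    $pm->finish(0, \$result);            # serialize the reference back to the parent
}
$pm->wait_all_children;
print $total;                            # all five lines; order depends on scheduling
```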

Forking is in fact not needed for these operations if you are willing to restructure a bit. Net::OpenSSH can construct remote commands that an event loop such as IO::Async::Loop can run concurrently, so all Perl operations occur in the same process (though not necessarily in the order they appear). Since IO::Async::Loop->run_process returns a Future, Future::Utils provides a way to manage the concurrency of these commands.

use strict;
use warnings;
use Net::OpenSSH;
use IO::Async::Loop;
use Future::Utils 'fmap_concat';

my $loop = IO::Async::Loop->new;

my $future = fmap_concat {
  my $n = shift;
  ...
  my $remote_command = $ssh->make_remote_command($command_to_execute);
  return $loop->run_process(command => $remote_command, capture => ['stdout'])
    ->transform(done => sub { "Result from File:$Files{$n} and Host:$list_of_ips[$n-1] is $_[0]\n"; });
} foreach => [1..$num_buckets], concurrent => 5;

my @results = $future->get;

There is a lot of flexibility in how the individual and overall Futures (the latter returned by fmap_concat) are managed, but by default any failure to execute a process causes the whole Future to fail immediately (so get throws an exception), while a nonzero exit status is ignored.
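
If one failing host should not abort the whole batch, each per-command Future can be converted into a successful one before fmap_concat sees it. A hedged sketch of the chain inside the fmap_concat block (the else handler and its error string are illustrative additions, not part of the original answer; Future must be loaded):

```perl
# Replaces the plain run_process chain inside the fmap_concat block.
return $loop->run_process(command => $remote_command, capture => ['stdout'])
  ->transform(done => sub { "Result from File:$Files{$n} and Host:$list_of_ips[$n-1] is $_[0]\n" })
  ->else(sub {
    my ($failure) = @_;
    # Future->done turns the failure into a normal result, so the
    # overall fmap_concat Future keeps running the remaining hosts.
    return Future->done("Host $list_of_ips[$n-1] failed: $failure\n");
  });
```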

Grinnz