
I have a hash of AoAs:

$hash{$key} = [ 
               [0.0,1.0,2.0],
               10.0,
               [1.5,9.5,5.5],
              ];

that I need to crunch as follows:

$err += (($hash{$key}[0][$_]-$hash{$key}[2][$_])*$hash{$key}[1])**2 foreach (0 .. 2);

calculating the squared weighted difference between the two arrays. Since my hash is large, I was hoping PDL would help speed up the calculation, but it doesn't: the script below that uses PDL is ~10 times slower. I'm still new to PDL, so I'm probably messing something up.

Description: The following two scripts are my attempt to represent, simply, what is going on in my program. I read some reference values into the hash, and then I compare observations (pulled into the hash on the fly) against those values a bunch of times, with some weight. In the scripts, I set the reference array, weight, and observation array to arbitrary fixed values, but that won't be the case at run time.

Here are two simple scripts, without and with PDL:

without PDL

use strict;
use warnings;
use Time::HiRes qw(time);

my $t1 = time;
my %hash;
my $error = 0;

foreach (0 .. 10000){
  $hash{$_} = [
               [0.000, 1.000, 2.0000],
               10.0,
               [1.5,9.5,5.5],
              ];
  foreach my $i (0 .. 2){
    $error += (($hash{$_}[0][$i]-$hash{$_}[2][$i])*$hash{$_}[1])**2;
  }
}

my $t2 = time;

printf ( "total time: %10.4f error: %10.4f\n", $t2-$t1,$error);

with PDL

use strict;
use warnings;
use PDL;
use Time::HiRes qw(time);

my $t1 = time;
my %hash;
my $error = 0;

foreach (0 .. 10000){
  $hash{$_}[0] = pdl[0.000, 1.000, 2.0000];
  $hash{$_}[1] = pdl[10.0];
  $hash{$_}[2] = pdl[1.5,9.5,5.5];
  my $e = ($hash{$_}[0]-$hash{$_}[2])*$hash{$_}[1];
  $error += inner($e,$e);
}

my $t2 = time;

printf ( "total time: %10.4f error: %10.4f\n", $t2-$t1, $error);
– Demian

4 Answers


PDL is optimized to handle array computations. You are using a hash for your data, but since the keys are numbers, it can be reformulated in terms of PDL array objects for a big win in performance. The following all-PDL version of the example code runs about 36X faster than the original non-PDL code (and about 300X faster than the original PDL code).

all PDL

use strict;
use warnings;
use PDL;
use Time::HiRes qw(time);

my $t1 = time;
my %hash;
my $error = 0;

my $pdl0 = zeros(3,10001);           # create a [3,10001] pdl
$pdl0 .= pdl[0.000, 1.000, 2.0000];  # broadcast the reference row into all 10001 rows

my $pdl1 = zeros(1,10001);           # create a [1,10001] pdl
$pdl1 .= pdl[10.0];                  # broadcast the weight into all 10001 rows

my $pdl2 = zeros(3,10001);           # create a [3,10001] pdl
$pdl2 .= pdl[1.5,9.5,5.5];           # broadcast the observation row into all 10001 rows

my $e = ($pdl0 - $pdl2)*$pdl1;
$error = sum($e*$e);

my $t2 = time;

printf ( "total time: %10.4f error: %10.4f\n", $t2-$t1, $error);

See the PDL Book for an in-depth intro to using PDL for computation. The PDL homepage is also a good starting point for all things PDL.

– chm

First, PDL is not going to help much unless the arrays are large. So instead of using a hash indexed by 0 to 10000, where each entry holds (basically) seven scalar elements, can you instead create seven PDL vectors of 10001 elements each and operate on those using vector operations?
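For instance, here is a minimal, untested sketch of the layout I mean (the variable names are arbitrary, and the step that fills the vectors from your data is elided):

use PDL;

# one 10001-element piddle per scalar slot of the original hash entries
my ($ref0, $ref1, $ref2) = map { zeros(10001) } 0 .. 2;  # reference components
my $w                    = zeros(10001);                 # weights
my ($obs0, $obs1, $obs2) = map { zeros(10001) } 0 .. 2;  # observation components

# ... fill the seven vectors from your data, then crunch in one vectorized pass;
# per key, sum_i ((ref_i - obs_i) * w)**2 == w**2 * sum_i (ref_i - obs_i)**2
my $error = sum( ( ($ref0 - $obs0)**2
                 + ($ref1 - $obs1)**2
                 + ($ref2 - $obs2)**2 ) * $w**2 );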

Second, the expression $hash{$_} is evaluated every time you name it, so you should factor it out. In your standard Perl code, for instance, do this:

my $vec = $hash{$_};
foreach my $i (0 .. 2){
    $error += (($vec->[0][$i]-$vec->[2][$i])*$vec->[1])**2;
}
– Nemo
  • Thanks Nemo. I was able to make modest gains with your suggestions. I'll paste the script below. Factoring is something I should be doing more often! – Demian Jun 19 '11 at 17:18
3

I refactored your code several times over: first, I moved as much complexity outside of the loop as possible; second, I removed a layer or so of abstraction. This simplified the expression considerably and cut the runtime by about 60% on my system while producing the same result.

use Modern::Perl;
use Time::HiRes qw(time);

my $t1 = time;
my $error = 0;

my @foo = ( 0.000, 1.000, 2.0000 );
my $bar = 10.0;
my @baz = ( 1.5, 9.5, 5.5 );

foreach ( 0 .. 10000 ) {
    # the statement-modifier "for" aliases $_ to 0 .. 2 inside this statement,
    # shadowing the outer loop's $_
    $error += ( ( $foo[$_] - $baz[$_] ) * $bar )**2 for 0 .. 2;
}

my $t2 = time;

printf ( "total time: %10.4f error: %10.4f\n", $t2-$t1,$error);

This is just plain old Perl; no PDL. Hopefully this is helpful to your project.

By the way, when measuring how long a section of code takes to run, I prefer the Benchmark module, with its timethis(), timethese(), and cmpthese() functions; you get more information out of it.
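For example, here is a quick untested sketch comparing hash-of-AoA access against plain arrays (the sub names and bodies are just stand-ins for your real candidates):

use strict;
use warnings;
use Benchmark qw(cmpthese);

my %hash = ( 0 => [ [ 0.000, 1.000, 2.0000 ], 10.0, [ 1.5, 9.5, 5.5 ] ] );
my @foo  = ( 0.000, 1.000, 2.0000 );
my $bar  = 10.0;
my @baz  = ( 1.5, 9.5, 5.5 );

# run each candidate for at least 5 CPU seconds, then print a rate comparison
cmpthese( -5, {
    hash_of_aoa => sub {
        my $e = 0;
        $e += ( ( $hash{0}[0][$_] - $hash{0}[2][$_] ) * $hash{0}[1] )**2 for 0 .. 2;
    },
    plain_arrays => sub {
        my $e = 0;
        $e += ( ( $foo[$_] - $baz[$_] ) * $bar )**2 for 0 .. 2;
    },
} );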

DavidO
  • 13,812
  • 3
  • 38
  • 66
  • Except in his actual application, I doubt he is operating on 10001 copies of the same data... I mean, you could "refactor" his code down to one line (return a constant), but that is probably not what he meant :-) – Nemo Jun 19 '11 at 12:47
  • thanks DavidO. Nemo is right. I'll edit the original post to make it clearer. – Demian Jun 19 '11 at 17:20
  • Yeah. I saw the challenge, and then after embarking upon it and posting, I saw the light. ;) It was a fun exercise nevertheless. – DavidO Jun 19 '11 at 19:19
  • You've inspired me to dig into Chromatic's Modern Perl at bedtime! – Demian Jun 20 '11 at 02:05

Based on Nemo's suggestion, here is a PDL script that achieves modest gains in speed. I'm still green with PDL, so there is probably a better way. I also split the population of the hash into separate loops for references/weights and for observations, to make this more like what is happening in the bigger program (see the description above).

use strict;
use warnings;
use PDL;
use PDL::NiceSlice;
use Time::HiRes qw(time);

my $t1 = time;
my %hash;
my $nvals=10000;

#construct hash of references and weights
foreach (0 .. $nvals){
  $hash{$_} = [
                 [0.000, 1.000, 2.0000],
                 [10.0, 10.0, 10.0],
               ];
}

#record observations
foreach (0 .. $nvals){
  $hash{$_}[2] = [1.5,9.5,5.5]; 
}

my $tset = time;

my @ref;
my @obs;
my @w;

foreach (0 .. $nvals){
  my $mat = $hash{$_};
  push @ref, @{$mat->[0]};
  push @w,   @{$mat->[1]};
  push @obs, @{$mat->[2]};
}

my $ref = pdl[@ref];
my $obs = pdl[@obs];
my $w   = pdl[@w];

my $diff = (($ref-$obs)*$w)**2;
my $error = sum($diff);

my $t2 = time;

printf ( "$nvals time setup: %10.4f crunch: %10.4f total: %10.4f error: %10.4f\n", $tset-$t1,$t2-$tset, $t2-$t1,$error);
– Demian
  • I am lumping the transformation from the hash of AoAs into pdls in with the performance of PDL. To be fair to PDL, we should probably just compare the efficiency of the regular Perl error accumulation against the PDL crunch {my $diff = (($ref-$obs)*$w)**2; my $error = sum($diff)}, where there are huge gains in speed for PDL. So to make this faster, I would need a better way to transfer from the hash of AoAs to the pdl structures. I should also start using the Benchmark module as suggested by DavidO. – Demian Jun 19 '11 at 18:07