Perl PDL: Search if a vector is in an array or in a matrix

Question

I am trying to do a grep-like search on a PDL matrix or on a Perl array of PDL vectors:

my @toto;
push(@toto, pdl(1,2,3));
push(@toto, pdl(4,5,6));
my $titi=pdl(1,2,3);
print("OK") if (grep { $_ eq $titi} @toto);

I also tried:

my @toto;
push(@toto, pdl(1,2,3));
push(@toto, pdl(4,5,6));
my $titi=pdl(1,2,3);
print("OK") if (grep { $_ eq $titi} PDL::Matrix->pdl(\@toto));

Neither works.

Any help would be appreciated.

score 3 · Accepted Answer · answered Feb 27 '19 at 15:00

Disclaimer: I don't know anything about PDL. I've read the source to figure this one out.

There's a function PDL::all() that you can use in conjunction with the overloaded comparison operator ==.

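use strict;
use warnings;
use PDL;

my $foo = pdl(1,2,3);
my $bar = pdl(4,5,6);
my $qrr = pdl(1,2,3);

# == compares element-wise; PDL::all() is true only if every element matched
print "OK 1\n" if PDL::all( $foo == $bar );  # not printed: the vectors differ
print "OK 2\n" if PDL::all( $foo == $qrr );  # printed: the vectors are equal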

I'm still looking for the documentation.
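
To apply this check across a Perl array of ndarrays, as in the question, the same test can be placed inside grep. The following is a minimal sketch: the helper name vector_in_list is hypothetical, and it assumes every ndarray in the list has the same dimensions as the needle (with mismatched lengths, == may broadcast or raise an error rather than return false). The comparison is exact, so it is best suited to integer data.

```perl
use strict;
use warnings;
use PDL;

# Hypothetical helper: counts the ndarrays in @candidates equal to $needle.
# Assumes all ndarrays have the same dimensions as $needle.
sub vector_in_list {
    my ($needle, @candidates) = @_;
    # all() reduces the element-wise == mask to a single true/false value
    return scalar grep { all($_ == $needle) } @candidates;
}

my @toto = (pdl(1,2,3), pdl(4,5,6));
print "found\n"     if vector_in_list(pdl(4,5,6), @toto);
print "not found\n" unless vector_in_list(pdl(7,8,9), @toto);
```

The return value is the number of matches, so it can be used directly in a boolean test.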

Ed. · Answer 2 · 2022-03-12T03:58:55.533

The way to do this efficiently, in a way that scales, is to use PDL::VectorValued::Utils with two ndarrays (the "haystack" being an ndarray, not a Perl array of ndarrays). The small function vv_in below is shown as plain code rather than as a perldl CLI session, so that it is easier to copy and paste from this answer:

sub vv_in {
  require PDL::VectorValued::Utils;
  my ($needle, $haystack) = @_;
  die "needle must have 1 dim less than haystack"
    if $needle->ndims != $haystack->ndims - 1;
  my $ign = $needle->dummy(1)->zeroes;
  PDL::_vv_intersect_int($needle->dummy(1), $haystack, $ign, my $nc=PDL->null);
  $nc;
}

pdl> p $titi = pdl(1,2,3)
[1 2 3]
pdl> p $toto = pdl([1,2,3], [4,5,6])
[
 [1 2 3]
 [4 5 6]
]
pdl> p $notin = pdl(7,8,9)
[7 8 9]
pdl> p vv_in($titi, $toto)
[1]
pdl> p vv_in($notin, $toto)
[0]

Note that for efficiency, the $haystack is required to be sorted already (use qsortvec). The dummy "inflates" the $needle to be a vector-set with one vector, then vv_intersect returns two ndarrays:

- either the intersecting vector-set (which would always be a single vector here), or a set of zeroes (probably a shortcoming of the routine; it should instead return an empty ndarray with dims (vectorlength, 0))
- the quantity of vectors found (here, either 0 or 1)

The "internal" (_vv_intersect_int) version is used because as of PDL::VectorValued 1.0.15, it has some wrapping Perl code that does not permit broadcasting (an issue has been filed).

Note vv_in will "broadcast" (formerly known, confusingly, as "threading") over multiple sets of input-vectors and input-haystacks. This could be used to search for several vectors:

sub vv_in_multi {
  my ($needles, $haystack) = @_;
  die "needles must have same number of dims as haystack"
    if $needles->ndims != $haystack->ndims;
  vv_in($needles, $haystack->dummy(-1));
}

pdl> p vv_in_multi(pdl($titi,$notin), $toto)
[1 0]
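
For a small haystack, where sorting it and loading an extra module may not be worthwhile, a linear scan with core PDL operations is an alternative. This is a different technique from vv_in above, not a replacement for it. The sketch below relies only on broadcasting ==, andover and which; the helper name matching_rows is hypothetical, and the haystack does not need to be sorted.

```perl
use strict;
use warnings;
use PDL;

# Hypothetical helper: which rows of $haystack (dims M,N) equal $needle (dims M)?
sub matching_rows {
    my ($needle, $haystack) = @_;
    # == broadcasts the needle over every row; andover ANDs along each row
    my $row_matches = ($haystack == $needle)->andover;
    return which($row_matches);   # indices of matching rows, empty if none
}

my $toto = pdl([1,2,3], [4,5,6]);
print "found at row(s) ", matching_rows(pdl(4,5,6), $toto), "\n";
print "not found\n" if matching_rows(pdl(7,8,9), $toto)->isempty;
```

Because every row is compared, the cost grows linearly with the number of rows, which is the trade-off against the sorted approaches.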

score 1 · Answer 3 · answered Mar 13 '22 at 17:56

Thanks to Ed for the VectorValued shout-out above (and for the bug report too). On reflection, it occurs to me that if $toto is sorted (a la qsortvec(), as it is in your example), you can get away with using vsearchvec(), also from PDL::VectorValued::Utils and typically faster than vv_intersect (logarithmic vs. linear):

sub vv_in_vsearch {
  require PDL::VectorValued::Utils;
  my ($needle, $haystack) = @_;
  my $found = $needle->vsearchvec($haystack);
  return all($haystack->dice_axis(1,$found) == $needle);
}

pdl> $titi = pdl(1,2,3)
pdl> $tata = pdl(4,5,6)
pdl> $toto = pdl([1,2,3], [4,5,6])
pdl> $notin = pdl(7,8,9)
pdl> p vv_in_vsearch($titi, $toto)
1
pdl> p vv_in_vsearch($tata, $toto)
1
pdl> p vv_in_vsearch($notin, $toto)
0

(full disclosure: I wrote & maintain PDL::VectorValued)
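
Both vv_in and vv_in_vsearch assume the haystack is already sorted. A minimal sketch of that preparation step, using qsortvec from core PDL (the variable names are illustrative only):

```perl
use strict;
use warnings;
use PDL;

# An unsorted haystack: each row is one vector, dims (3,3)
my $unsorted = pdl([4,5,6], [7,8,9], [1,2,3]);

# qsortvec returns a copy with the row vectors in lexicographic order
my $sorted = $unsorted->qsortvec;
print $sorted;
```

If the haystack is searched many times, sorting it once up front keeps each later lookup cheap.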

Håkon Hægland · Answer 4 · score 0 · 2019-03-01T09:47:43.237

You can use eq_pdl from Test::PDL:

use PDL;
use Test::PDL qw( eq_pdl );
my @toto;
push(@toto, pdl(1,2,3));
push(@toto, pdl(4,5,6));
my $titi = pdl(4,5,6);
print("OK\n") if (grep { eq_pdl( $_, $titi) } @toto);

Output:

OK

edited Mar 01 '19 at 09:47

answered Feb 27 '19 at 14:57

Thank you for your answer. It works if $titi is in the first position of @toto. If $titi = pdl(4,5,6), for example, it does not work... – Alexglvr Mar 01 '19 at 08:39
@Alexglvr It works fine for me. I wonder why it does not work for you? I have updated my answer to show what I tried. – Håkon Hægland Mar 01 '19 at 09:47
Note: If you only work with integers (not floating-point values), I would recommend the simpler solution in the other answer by @simbabque, i.e.: `print("OK\n") if (grep { PDL::all( $_ == $titi) } @toto);` – Håkon Hægland Mar 01 '19 at 10:43
