11

Given two arrays, A and B, how do I remove from A those elements that can also be found in B? What is the most efficient way of doing this?

As a special case, if B is the array produced by a grep on A, how would I do this? Of course, in this case we can grep on the negated condition. But is there something in Perl like taking the complement of one array with respect to another?

Thank you.

Qiang Li
  • As a special case, if the two arrays are sorted, you can do a more efficient differencing operation. But it doesn't seem like that's what you're after. – Mike Sokolov Sep 23 '11 at 01:25
  • Related: http://stackoverflow.com/questions/3700037/how-can-i-represent-sets-in-perl – daxim Sep 23 '11 at 11:21

6 Answers

7

Any time you are thinking "found in", you are probably looking for a hash. In this case, you would create a hash keyed by the values of B, then grep A, checking the hash for each element.

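use feature 'say';   # say needs Perl 5.10+ and must be enabled

my @A = 1..9;
my @B = (2, 4, 6, 8);
my %B = map {$_ => 1} @B;   # lookup table: every value of @B becomes a key

say join ' ' => grep {not $B{$_}} @A; # 1 3 5 7 9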

As you can see, Perl does not maintain any sort of "found in" table by itself, so you have to provide one. The above code could easily be wrapped in a function, but for efficiency it is best done inline.
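If a reusable function is preferred despite the call overhead, a minimal sketch might look like the following. The name `array_minus` and the choice to pass array references are assumptions made for illustration, not part of the original answer. Because hash keys are strings, elements are compared by their string form.

```perl
use strict;
use warnings;

# Hypothetical helper: returns the elements of @$a_ref that do not occur
# in @$b_ref, preserving the order (and any duplicates) of @$a_ref.
sub array_minus {
    my ($a_ref, $b_ref) = @_;
    my %in_b = map { $_ => 1 } @$b_ref;   # lookup table keyed by B's values
    return grep { !$in_b{$_} } @$a_ref;   # keep what is not a key of %in_b
}

my @A = 1 .. 9;
my @B = (2, 4, 6, 8);
my @diff = array_minus(\@A, \@B);
print "@diff\n";   # 1 3 5 7 9
```

Passing references keeps the two lists distinct inside the function; passing both arrays directly would flatten them into one list.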

Eric Strom
  • 1
    mod +1. The `map` itself is pretty impressive, but the use of `grep` is amazing. It took me a while to realize what it was doing. I was wondering what you were _grepping_ before realizing that you were not actually using `grep` to match the line. If the statement `not $B{$_}` is true (and it will be for all keys not in `%B`), the value of `$_` is kept in the array that the `grep` command returns. – David W. Sep 23 '11 at 03:05
  • I like this, but what if I also want to main the order in `@A` after deleting all elements in `@B`? Is there anything in perl like `LinkedHashMap` in java? – Qiang Li Sep 23 '11 at 04:47
  • 3
    @Qiang Li, His code does maintain the order of the elements in `@A`. (Assuming main = maintain) – ikegami Sep 23 '11 at 05:55
  • 2
    @Qiang Li, Tie::IxHash is one way of creating an ordered associative array (like LinkedHashMap), but I don't see what that has to do with your question or this solution. – ikegami Sep 23 '11 at 05:57
  • 1
    @Zaid: There will not be any autovivification in this case. You should try it before you write your comments. You don't know Perl as much as you expect. – Hynek -Pichi- Vychodil Sep 23 '11 at 13:40
  • @Hynek-Pichi-Vychodil : You're right, I'm wrong. Comment removed. :) – Zaid Sep 23 '11 at 16:18
3

Have a look at the none, all, part and notall functions available via List::MoreUtils. You can perform pretty much any set operation using the functions in this module.

There's a good tutorial available at Perl Training Australia.
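As a rough sketch of how `none` could be applied to this question: the answer names List::MoreUtils, but the example below imports `none` from the core List::Util module instead (an assumption: List::Util 1.33 or later) so that it runs without installing anything from CPAN. Importing `none` from List::MoreUtils should work the same way here.

```perl
use strict;
use warnings;
use List::Util qw(none);   # assumption: List::Util 1.33 or later

my @A = 1 .. 9;
my @B = (2, 4, 6, 8);

# Keep $x only if no element of @B equals it. Every test scans @B,
# so this does A*B comparisons in the worst case, unlike a hash lookup.
my @diff = grep { my $x = $_; none { $_ == $x } @B } @A;
print "@diff\n";   # 1 3 5 7 9
```

The outer `$_` is copied into `$x` because `none` rebinds `$_` to each element of `@B` inside its block.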

RET
1

If you are asking for the most efficient way:

my @A = 1..9;
my @B = (2, 4, 6, 8);

my %x;
@x{@B} = ();
my @AminusB = grep !exists $x{$_}, @A;

But you will notice a difference between my solution and Eric Strom's only for bigger inputs.
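One way to check this on your own machine is the core Benchmark module. The sketch below is only an illustration: the input sizes, the iteration count and the sub names `with_map` and `with_slice` are assumptions, and the timings will vary between systems and Perl versions.

```perl
use strict;
use warnings;
use Benchmark qw(cmpthese);

# Assumed sizes, chosen only for illustration.
my @A = 1 .. 50_000;
my @B = grep { $_ % 2 == 0 } @A;

# Eric Strom's variant: build the hash with map, test for truth.
sub with_map {
    my %b = map { $_ => 1 } @B;
    return grep { !$b{$_} } @A;
}

# Hash-slice variant: keys only, test with exists.
sub with_slice {
    my %b;
    @b{@B} = ();
    return grep { !exists $b{$_} } @A;
}

# Run each variant 20 times and print a comparison table.
cmpthese(20, {
    map_hash   => sub { my @r = with_map() },
    slice_hash => sub { my @r = with_slice() },
});
```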

You may also find this functional approach handy:

sub complementer {
  my %x;
  @x{@_} = ();
  return sub { grep !exists $x{$_}, @_ };
}

my $c = complementer(2, 4, 6, 8);

print join(',', $c->(@$_)), "\n" for [1..9], [2..10], ...;

# you can use it directly of course
print join(' ', complementer(qw(a c e g))->('a'..'h')), "\n";
Hynek -Pichi- Vychodil
  • It's not any more *effective* than the other working solutions (by definition). Maybe you meant *efficient*? – ikegami Sep 23 '11 at 19:53
0

You're probably better off with the hash, but you could also use smart matching. Stealing Eric Strom's example,

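use feature 'say';

my @A = 1..9;
my @B = (2, 4, 6, 8);

# ~~ is the smart match operator; it has been flagged experimental since Perl 5.18
say join ' ' => grep {not $_ ~~ @B } @A; # 1 3 5 7 9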
oylenshpeegul
  • 3
    This doesn't scale nearly as well as Eric Strom's. His solution is worst case Θ(A+B), but yours is Θ(A*B) – ikegami Sep 23 '11 at 03:41
0

Again, you're probably better off with the hash, but you could also use Perl6::Junction. Again stealing Eric Strom's example,

use feature 'say';
use Perl6::Junction qw(none);

my @A = 1..9;
my @B = (2, 4, 6, 8);

say join ' ' => grep {none(@B) == $_} @A; # 1 3 5 7 9
oylenshpeegul
-1

As already mentioned by Eric Strom, whenever you need to search for something specific, it's always easier if you have a hash.

Eric has a nicer solution, but it can be difficult to understand. I hope mine is easier to follow.

# Create a B Hash

my %BHash;
foreach my $element (@B) {
   $BHash{$element} = 1;
}

# Go through @A element by element and delete duplicates

my $index = 0;
foreach my $element (@A) {
   if (exists $BHash{$element}) { 
      splice @A, $index, 1;    #Deletes $A[$index]
      $index = $index + 1;
   }
}

In the first loop, we simply create a hash that is keyed by the elements in @B.

In the second loop, we go through each element in @A, while keeping track of the index in @A.
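As posted, the second loop splices @A while foreach is still iterating over it, and $index only advances when an element is removed, so it does not track the current position. A minimal sketch of an in-place variant that avoids both problems, assuming the same sample data as the other answers, walks the indices from the end:

```perl
use strict;
use warnings;

my @A = 1 .. 9;
my @B = (2, 4, 6, 8);

# Build the lookup hash from @B.
my %BHash;
$BHash{$_} = 1 for @B;

# Walk the indices from the end so that a splice never shifts
# an element that has not been examined yet.
for my $index (reverse 0 .. $#A) {
    splice @A, $index, 1 if exists $BHash{ $A[$index] };
}

print "@A\n";   # 1 3 5 7 9
```

Each splice has to shift the tail of the array, so for large inputs a single grep over @A is likely to be both simpler and faster.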

David W.
  • 3
    It fails because it modifies the array over which it iterates. And all that extra complexity deters from its readability. – ikegami Sep 23 '11 at 03:45
  • @ikegami: Would it make it more readable to copy the array over to a new array, then rename it back to the original one? This is not the way I'd do it. I was trying for readability. – David W. Sep 23 '11 at 18:34
  • You should be worrying about making it work, first. `my @C; for my $e (@A) { push @C, $e if !$B{$e}; } @A = @C;` would make it work, but it's a really complicated way to do `@A = grep { !$B{$e} } @A;`. – ikegami Sep 23 '11 at 19:47