How to avoid duplicate values in an array of arrays?

Question

I know how to avoid duplicates in a one-dimensional array.

However, I have an array of arrays, and two rows of it may hold arrays with different references but the same values. I tried this:

sub unique {
     my %seen;
     grep !$seen{join('',$_)}++, @_ 
}

my @aa = (  ["1","2","3"],["1","2","3"],["1","2","4"] );
my @bb = unique(@aa);
print $_ for (@bb);

It should remove one of the two "123" arrays, but it doesn't. Probably because $_ holds a reference and not an array that can be joined? Of course, I could loop through the array referenced by $_, concatenate all its values, and use the result as the key to the %seen hash.

But I suspect there is a very elegant solution in Perl that I don't yet know of...

choroba · Accepted Answer · 2014-09-04T13:46:23.320

5

To fix your naive approach, you should dereference the array references in two places: when serializing and when printing:

# Assumes the elements don't contain the value of $; (0x1C by default)
sub unique {
     my %seen;
     grep ! $seen{ join $;, @$_ }++, @_
}

my @aa = (  ["1","2","3"],["1","2","3"],["1","2","4"] );
my @bb = unique(@aa);
print "@$_\n" for (@bb);

This can still give wrong output: [ "1\x{1C}2", 3 ] produces the same key as [ 1, 2, 3 ], so the two would be treated as duplicates. More complex stringification is needed if your data could contain such strings. Fortunately, Perl already has a way to serialize array references, Data::Dumper:

use Data::Dumper;

sub unique {
    my %seen;
    grep ! $seen{ Dumper $_ }++, @_
}
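As a rough illustration of the caveat above, the following sketch runs both key strategies on two rows that differ but join to the same string. The names `unique_join`, `unique_dumper` and `@rows` are made up for this example, and setting `$Data::Dumper::Indent` to 0 is only an optional choice to keep the hash keys compact.

```perl
use strict;
use warnings;
use Data::Dumper;

# Key built by joining the elements on $; (0x1C by default).
sub unique_join {
    my %seen;
    return grep { !$seen{ join $;, @$_ }++ } @_;
}

# Key built from the Data::Dumper serialization of each array reference.
sub unique_dumper {
    local $Data::Dumper::Indent = 0;    # optional: compact, single-line keys
    my %seen;
    return grep { !$seen{ Dumper($_) }++ } @_;
}

# Two different rows that both join to "1\x1C2\x1C3".
my @rows = ( [ "1\x{1C}2", "3" ], [ "1", "2", "3" ] );

my @by_join   = unique_join(@rows);
my @by_dumper = unique_dumper(@rows);

printf "join:   %d unique row(s)\n", scalar @by_join;      # 1 (collision)
printf "Dumper: %d unique row(s)\n", scalar @by_dumper;    # 2
```

If the data is known never to contain the separator, the simpler join-based key is enough.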

edited Sep 04 '14 at 13:46

answered Sep 04 '14 at 12:20

choroba


Works great. I'll go with the @$_ approach, since my real data structure will not allow such duplicates as in the example. May I ask what the @$_ actually does? Does @ in front of an array ref always return the array itself? – jackthehipster Sep 04 '14 at 12:33
@jackthehipster: Yes. When dereferencing a more complex structure, you might need curly braces: `@{ $hash{key} }`. – choroba Sep 04 '14 at 12:58
Joining by `$;` or even single space seems optimal. – mpapec Sep 04 '14 at 12:59
+1: A down vote with comment would have helped a passer-by to understand the caveats with this solution as well as an opportunity for the answerer to fix or correct the solution. – jaypal singh Sep 04 '14 at 13:26
@jackthehipster, No, it doesn't return the array itself. In this case, it returns the elements of the referenced array, just like `@a` would return the elements of `@a` in that situation. – ikegami Sep 04 '14 at 13:44
@jaypal: Are you saying your upvote is because of the previous downvote? – Borodin Sep 04 '14 at 14:11
Absolutely not @Borodin. I am **not** choroba's sock puppet, if that's what you were trying to get at (though I do upvote a lot of his answers, mainly because they are good, and I do the same for yours, mpapec's, ikegami's and TLP's too `:)`). My _upvote_ was due to the fact that I would have solved it the same way, and my _comment_ was for the downvoter to leave feedback on what could potentially go wrong with this approach so that **I** can learn from it. – jaypal singh Sep 04 '14 at 14:16
@jaypal: Ah OK. It's just that your comment to go with the +1 was about only the downvote. – Borodin Sep 04 '14 at 14:50
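
The dereferencing discussed in the comments can be sketched as follows. This is only an illustrative snippet; the variables `$ref` and `%hash` are invented for it. It shows that a reference stringifies to something like ARRAY(0x...), which is why joining `$_` itself never produced equal keys, and that `@$ref` or `@{ ... }` yields the elements of the referenced array.

```perl
use strict;
use warnings;

my $ref  = [ "1", "2", "3" ];
my %hash = ( key => [ "a", "b" ] );

# A reference used as a string looks like ARRAY(0x...), and the address
# differs for every distinct array, even when the contents are equal.
my $as_string = "$ref";

# @$ref yields the elements of the referenced array.
my $joined = join ",", @$ref;

# For anything more complex than a plain scalar variable, use curly braces.
my $count = scalar @{ $hash{key} };

print "$as_string\n";    # e.g. ARRAY(0x55d0c8a3b2a0); the address varies
print "$joined\n";       # 1,2,3
print "$count\n";        # 2
```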
