Restating the question, why do duplicate entries of hashes in an array reference the first entry in Perl? Please correct my terminology if I am mistaken but, when I push identical hash references into an array with the code below:

use strict;
use warnings;
use Data::Dumper qw(Dumper);

my @array;
my %hash = (foo => 'foo', bar => 'bar');

for (1..3) {
    push @array, \%hash;
}

print Dumper @array;

I get the following result:

$VAR1 = {
          'bar' => 'bar',
          'foo' => 'foo'
        };
$VAR2 = $VAR1;
$VAR3 = $VAR1;

I expected to see the following result:

$VAR1 = {
          'bar' => 'bar',
          'foo' => 'foo'
        };
$VAR2 = {
          'bar' => 'bar',
          'foo' => 'foo'
        };
$VAR3 = {
          'bar' => 'bar',
          'foo' => 'foo'
        };

Is this behavior because of a fundamental Perl concept or because of Data::Dumper?

1 Answer

Basically, you already said it yourself (emphasis mine).

Please correct my terminology if I am mistaken but, when I push identical hash references into an array with the code below

That's because you push in a reference to the hash. In each iteration of the loop, it's always the same hash, so each new reference goes to that same hash.

If you use Data::Printer's p, the output is as follows, which I find clearer than the Data::Dumper one.

[
    [0] {
        bar   "bar",
        foo   "foo"
    },
    [1] var[0],
    [2] var[0]
]

This makes it obvious that elements 1 and 2 point to the same thing as element 0. Now if you run this code:

use feature 'say';    # say is not enabled by default

for (@array) {
    say $_;    # a hash reference stringifies as HASH(0x...)
}

The output is the same address three times.

HASH(0x2755150)
HASH(0x2755150)
HASH(0x2755150)

The whole idea of a reference is to refer to the same thing. That's a very powerful tool, because it saves memory. If your %hash is not just a tiny hash but a large data structure (let's say a WWW::Mechanize object that can hold parsed HTML documents), copying it every time you pass it around would be very expensive.

But with the reference, every piece of code that gets it shares it. That's way more efficient.
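The sharing described above can be seen by changing the hash through one array element and reading it back through another. This is a minimal sketch that reuses the variable names from the question; the value 'changed' is only an illustration.

```perl
use strict;
use warnings;

my @array;
my %hash = (foo => 'foo', bar => 'bar');

# Push three references to the one and only %hash.
push @array, \%hash for 1 .. 3;

# Change a value through the first element...
$array[0]{foo} = 'changed';

# ...and the change is visible through every other reference
# and through the original hash itself.
print "$array[2]{foo}\n";    # prints "changed"
print "$hash{foo}\n";        # prints "changed"
```

All three array elements and %hash are the same data, so there is only one 'foo' value to change.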

For more information on this, you should read perlref and perlreftut.


If, however, you want a copy, you need to construct a new hash reference instead of referencing the hash you already have. That's what @toolic said in his comment.

my %hash = (foo => 'foo', bar => 'bar');

for (1..3) {
    push @array, { %hash };
}
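To check that the copies really are independent, you can change one of them and look at the others. This is a small sketch along the lines of the loop above; the value 'changed' is only an illustration.

```perl
use strict;
use warnings;

my @array;
my %hash = (foo => 'foo', bar => 'bar');

# { %hash } builds a new anonymous hash from the current contents of %hash.
push @array, { %hash } for 1 .. 3;

# Only the first copy is modified.
$array[0]{foo} = 'changed';

print "$array[0]{foo}\n";    # prints "changed"
print "$array[1]{foo}\n";    # prints "foo"
print "$hash{foo}\n";        # prints "foo"
```

With these copies, Data::Dumper no longer has a repeated reference to report, so each element is dumped in full.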

And if you have an existing reference, and you want a copy, dereference it first.

my $ref = { foo => 'bar' };
my $shallow_copy = { %{ $ref } };

If you need a deeper copy, look at this answer.
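As an illustration of why a deeper copy can matter, the sketch below compares the shallow copy from above with dclone from the core Storable module. dclone is one common option and not necessarily what the linked answer recommends; the keys name and tags are made up for the example.

```perl
use strict;
use warnings;
use Storable qw(dclone);

# A hash reference with a nested array reference inside it.
my $ref = { name => 'foo', tags => [ 'a', 'b' ] };

my $shallow = { %{ $ref } };    # copies only the top level
my $deep    = dclone($ref);     # recursively copies nested data too

# Modify the nested array through the original reference.
push @{ $ref->{tags} }, 'c';

print scalar(@{ $shallow->{tags} }), "\n";    # prints 3, the inner array is shared
print scalar(@{ $deep->{tags} }), "\n";       # prints 2, the inner array was copied
```

The shallow copy has its own top-level hash, but its tags entry still points to the same array as the original.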

simbabque
  • @t I'm not going to try to find the meta discussion about whether to use _their_ around here on my phone and instead just assume you meant to specify it. Or maybe I just didn't get the joke. ;-) – simbabque Jan 28 '16 at 16:47
  • I edited to make it more clear that it's different references to the same data structure. @vlad, perlref explains it. Look for _reference count_. I think _Object Oriented Perl_ by Damian Conway also has a chapter explaining this stuff. – simbabque Jan 28 '16 at 20:30
  • @vlad: I know. But I answered it anyway. This is a crowdsourcing platform, we share knowledge here ;) – simbabque Jan 28 '16 at 20:57