
I'm curious whether Perl's internals copy the referenced values to create the array when dereferencing. For example, the following outputs the last and first values of a delimited string:

say @{[ split( q{\|}, q{bar|is|foo} ) ]}[-1,0];     # STDOUT: foobar\n
  • Does the operation first generate a list via split and create an array ref, then copy the values of the array ref into a new array when dereferencing?
  • Does it morph the current arrayref in place?

Because dereferencing is so common, I'm sure it's optimized. I'm just curious how expensive it is compared with creating an array from the list initially, like:

my @parts = split q{\|}, q{bar|is|foo};
say @parts[-1,0];

Purpose: getting an idea of the underlying operations w/o getting too deep into the code
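For reference, the forms being compared can be sketched side by side: the anonymous-array dereference from the question, the list slice mentioned in the comments, and the named array. This is only a minimal sketch; the variable names ($str, @via_ref, @via_list, @via_array) are chosen here for illustration and do not come from the original post.

```perl
use strict;
use warnings;
use feature 'say';

my $str = 'bar|is|foo';

# 1. Anonymous array: split returns a list, [ ... ] copies it into a new
#    array, and @{ ... }[-1, 0] slices that array through the reference.
my @via_ref = @{ [ split /\|/, $str ] }[ -1, 0 ];

# 2. List slice: index the list returned by split directly.
my @via_list = ( split /\|/, $str )[ -1, 0 ];

# 3. Named array: store the list first, then slice the array.
my @parts     = split /\|/, $str;
my @via_array = @parts[ -1, 0 ];

say "@via_ref";      # foo bar
say "@via_list";     # foo bar
say "@via_array";    # foo bar
```

All three select the same two elements; they differ only in whether an intermediate array (anonymous or named) is built before slicing.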

tshepang
vol7ron
  • BTW, the usual way is `say +(split /\|/, 'bar|is|foo')[-1, 0]`. – choroba Sep 24 '13 at 14:30
  • @choroba: yes, I find that clearer, but I've found out people aren't familiar with that notation and don't know how to look it up, since `+` is so common – vol7ron Sep 24 '13 at 14:35
  • You're doing more than just dereferencing an array. You additionally create an array and a reference and assign the list to that array. – ikegami Sep 24 '13 at 15:29
  • @choroba: (looks like my other comment didn't save) but I'd be more likely to use parentheses vs unary plus; eg `say( (split /\|/, 'bar|is|foo')[-1, 0] );` – vol7ron Sep 24 '13 at 15:34
  • @ikegami yes; from the example, you are 100% unequivocally absolutely positively correct; except I'm dereferencing a ref, not an array ;) But the question still stands – vol7ron Sep 24 '13 at 15:55
  • "dereferencing an array" is short for "dereferencing a reference to an array" – ikegami Sep 25 '13 at 02:06
  • The question still stands? That's dumb. It's a useless question. It doesn't matter how long it takes to deref an array if you never do just that extra. – ikegami Sep 25 '13 at 02:08
  • @ikegami: dumb? useless? It seems you view *expensive* in units of time; whereas, I interpret it in multiple forms: time, processing cycles, bandwidth, energy consumption, the effect on other synchronous/asynchronous processes. You limit the question and then call it dumb - why? Not saying this happens, but if Perl doubles the memory to convert a ref into an array, or arrays have different methods that require more resources, I feel the question is relevant. Even if it doesn't, it's still a relevant question, just the answer doesn't impact current practice. – vol7ron Sep 25 '13 at 13:28
  • Re "dumb? useless?", Again, what's the point of measuring A if the alternatives are B and C? The measurements of A are **useless** since you have nothing to compare them against. Doesn't matter if you're measuring CPU, memory, power, cycles, bandwidth. It's **dumb** to say you don't care and that you want that information anyway. – ikegami Sep 25 '13 at 13:53

2 Answers


Here is a benchmark:

#!/usr/bin/perl 
use strict;
use warnings;
use 5.010;
use Benchmark qw(:all);

my @list = ('foo')x1_000_000;
my $str = join('|',@list);
my $count = -2;
cmpthese($count, {
    'deref' => sub {
        my $parts = [ split( q{\|}, $str ) ];
        my @res = @$parts[-1,0];
    },
    'array' => sub {
        my @parts = split q{\|}, $str;
        my @res =  @parts[-1,0];
    },
});

I just changed say to an assignment.
Windows 7, perl 5.14.2

        Rate deref array
deref 2.02/s    --  -38%
array 3.23/s   60%    --

Depending on the environment, I get:
Linux 64 bit, perl 5.14.2

        Rate deref array
deref 3.00/s    --  -35%
array 4.65/s   55%    --

and Linux 32 bit, perl 5.8.4

        Rate array deref
array 1.96/s    --  -35%
deref 3.00/s   53%    --
Toto
  • I'm getting only 1% difference. – mpapec Sep 24 '13 at 14:35
  • @mpapec: I got 3% in 5.10.1. The `my @res = (split /\|/, 'bar|is|foo')[-1, 0];` line got 63 and 58, respectively, though. – choroba Sep 24 '13 at 14:43
  • @mpapec: I get comparable numbers on perl 5.14.2, but only a 3% difference on perl 5.8.8. Apparently, the array version got optimized between those two. – frezik Sep 24 '13 at 14:43
  • Took out the assignment, since there's more that goes on with it than with output, but: `perl -MBenchmark=:all -E 'my $string = q{bar|is|foo};cmpthese(-2,{deref=>sub{@{[split q{\|},$string]}[-1,0];},array=>sub{@parts=split q{\|},$string;@parts[-1,0];}})'` ; `array:88%;deref:-47%` comment if it looks wrong – vol7ron Sep 24 '13 at 14:44
  • @mpapec: OK, I get approx the same differences. – Toto Sep 24 '13 at 14:51
  • benchmark with thousands of elements; that better tests the exact question being asked. – ysth Sep 24 '13 at 14:51
  • @ysth: yeah, that was kind of what I was getting at (some large delimited string) – vol7ron Sep 24 '13 at 14:53
  • @ysth: I changed the length of the array to 1 million elements; the differences are larger. – Toto Sep 24 '13 at 14:59
  • @mpapec: with the new version on perl 5.14.2, I get deref 449322/s, array 886485/s – frezik Sep 24 '13 at 15:17

vol7ron> How expensive is it to dereference an array ref in Perl?

ikegami> You're doing more than just dereferencing an array.

vol7ron> But the question still stands

Again, this is a useless question. The alternative is never between simply dereferencing an array and something else.

But since you insist, it's 37 ns (37 billionths of a second) for me.

use Benchmark qw( cmpthese );

my %tests = (
   deref => 'my @y = @$x;',
   none  => 'my @y = @x;',
);

$_ = 'use strict; use warnings; our $x; our @x; ' . $_
   for values %tests;

{
   local our @x = ();
   local our $x = \@x;
   cmpthese(-3, \%tests);
}

Result:

           Rate deref  none
deref 3187659/s    --  -12%
none  3616848/s   13%    --

Time taken by each deref = 1/3187659 s - 1/3616848 s = 37 ns

It's tiny! Dereferencing the array accounts for only 12% of the time taken to dereference an empty array and copy it into another!
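The arithmetic behind the 37 ns figure can be reproduced directly from the two rates in the table. The following is just a sketch of that calculation; the hash names %rate and %ns are made up for this illustration.

```perl
use strict;
use warnings;

# Rates copied from the benchmark table above (iterations per second).
my %rate = ( deref => 3187659, none => 3616848 );

# Time per iteration in nanoseconds is 1e9 / rate.
my %ns = map { $_ => 1e9 / $rate{$_} } keys %rate;

# The difference between the two is the cost attributed to the dereference.
my $overhead = $ns{deref} - $ns{none};

printf "deref: %.1f ns, none: %.1f ns\n", $ns{deref}, $ns{none};
printf "overhead: %.0f ns (%.0f%% of deref)\n",
    $overhead, 100 * $overhead / $ns{deref};
```

With these inputs the overhead comes out at roughly 37 ns, about 12% of the time for one deref iteration.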

Does the operation first generate a list via split (1) and create an array ref (2), then copy the values of the array ref into a new array when dereferencing (3)?

  1. Yes, split returns a list. Except in scalar context.

  2. [ ... ] doesn't just create a reference, it also creates an array and copies the values into it.

  3. No, dereferencing doesn't copy values.
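Points 2 and 3 can be checked with a small experiment. This is a minimal sketch, not taken from the original answer; the names @orig, $ref, @copy and $anon are invented for the illustration.

```perl
use strict;
use warnings;
use feature 'say';

my @orig = ( 'bar', 'is', 'foo' );
my $ref  = \@orig;

# Dereferencing yields the very same array, not a copy:
# taking a reference to @$ref gives back the original address.
my $same = ( \@$ref == \@orig ) ? 'same array' : 'different array';
say $same;       # same array

# Writing through the dereference changes @orig.
push @$ref, 'baz';
$ref->[0] = 'BAR';
say "@orig";     # BAR is foo baz

# Copies appear only where values are assigned or wrapped in [ ... ].
my @copy = @$ref;        # list assignment copies the elements
my $anon = [ @orig ];    # [ ... ] builds a new array from the list
$copy[0]   = 'changed';
$anon->[0] = 'changed';
say $orig[0];    # BAR (the copies are independent of @orig)
```

So any copying in the question's expression would come from `[ ... ]` or from an assignment, not from the `@{ ... }` dereference itself.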

Does it morph the current arrayref in place?

It would be really bad if a reference turned into something else. What do you actually mean?

Community
ikegami
  • Again, you're translating ***expense*** into only computational units per second. How much of a memory difference is it? If you have 200MB in memory and that doubles when converting a ref to an array, I'd call the doubling an expense. I could think of many things that factor into an expense, in addition to real/cpu time. I have respect for you, but the fact that you are calling it a useless question has me debating a downvote - *not that you care* :) – vol7ron Sep 25 '13 at 15:29
  • As for the examples, I agree that they might not be the best to illustrate the question I posed. Because it's an array ref and a reference is generally only a pointer, there may be no expense, it may have all the methods attached already and the difference is sticking a sigil in front of it - and that'd be an acceptable answer; however, this is Perl and there may be other things going on when going from an array ref to an array. – vol7ron Sep 25 '13 at 15:32
  • I never said it wasn't a suitable example. I will say there is no suitable example. The choice will never be between dereferencing and not dereferencing or between dereferencing and something else. – ikegami Sep 25 '13 at 16:16
  • So given `$x=[0..10]`, the following wouldn't be a valid choice? `$x->[2]` vs `@{$x}[2]` – vol7ron Sep 25 '13 at 18:47
  • In response to your answer, I think I was getting at your answer #3 at the bottom, regarding copying. Regarding the morphing question, I was curious whether any polymorphism occurred when dereferencing. If refs really are just *pointers* then I wouldn't expect that to be true, but since Perl has underlying *magic*, that's why I asked. – vol7ron Sep 25 '13 at 18:51
  • @vol7ron, huh, those are both dereferences. – ikegami Sep 25 '13 at 21:27
  • @vol7ron, That doesn't clarify anything. You're still saying you think Perl might change a reference into something else when you dereference it, but then `$x->[4]; $x->[4]` would fail. I presume that's not what you mean, so my request for clarification stands. – ikegami Sep 25 '13 at 21:41
  • there seems to be a lot of things standing :) – vol7ron Sep 25 '13 at 22:15