The function signature on PHP.net for array_replace() says that the arrays are passed in by reference. What would be the reason or benefit of doing it this way rather than passing by value, given that to get the intended result you must still assign the returned array to a variable? Just to be clear, I am able to reproduce the results in the manual, so this is not a question about how to use this function.

Here is the function signature and an example, both from php.net.

Source: http://ca3.php.net/manual/en/function.array-replace.php

Function signature:

array array_replace ( array &$array , array &$array1 [, array &$... ] )

Example code:

$base = array("orange", "banana", "apple", "raspberry");
$replacements = array(0 => "pineapple", 4 => "cherry");
$replacements2 = array(0 => "grape");

$basket = array_replace($base, $replacements, $replacements2);
print_r($basket);

The above example will output:

Array
(
    [0] => grape
    [1] => banana
    [2] => apple
    [3] => raspberry
    [4] => cherry
)
Jon Lyles
  • The reason is very simple: http://me.veekun.com/blog/2012/04/09/php-a-fractal-of-bad-design/ :) – biziclop Jun 22 '12 at 13:57
  • @biziclop That article is a helluva long whine. He's pretty up front about just not liking PHP, so of course the article is biased negatively against it. Doesn't mean he's 100% correct. – WWW Jun 22 '12 at 14:06
  • Btw, the documentation page for `array_replace` has recently been updated - and the error we've been talking about is gone now. So I guess we can make this world better after all. ) – raina77ow Jul 04 '12 at 11:44

4 Answers

This function calls php_array_merge_or_replace_wrapper, which calls zend_hash_merge, which in turn calls _zend_hash_merge, and so on. That chain leads to an underlying memcmp() call, which is probably why the arrays get passed into PHP's array_replace() by reference (because memcmp() requires them to be).

Arrays are one of the aspects of PHP that just seem to work and rarely get questioned, and I can kind of see why after doing a little digging.

WWW
  • You probably meant `into PHP's array_replace() by reference` - [array_merge() signature](http://php.net/manual/en/function.array-merge.php) is officially referenceless. ) – raina77ow Jun 22 '12 at 14:42
  • @Crontab appreciate the answer. I get it now. – Jon Lyles Jun 22 '12 at 14:51
  • are people upvoting this because they're familiar with php's inner workings, and so you agree with the whole memcmp thing? I don't buy it... – goat Jun 22 '12 at 14:51
  • @rambocoder Did you read through the code links? They go right to PHP's source code (and the memcmp() link goes to GNU's libc manual), I'm not sure what other references would be more authoritative... – WWW Jun 22 '12 at 14:54
  • Sorry, my comment wasn't clear. I feel your answer implies PHP had to implement it this way. For example, if there was a performance reason they did it this way, then I feel that would be more of an answer. But currently it just talks about the source code, which obviously works as is, but doesn't answer the why. – goat Jun 22 '12 at 15:03
  • @rambocoder There are scant few comments in the PHP source code to explain why they chose to do things the way they did. I don't think I deserve a downvote just because I didn't call up Rasmus and ask him to explain himself. – WWW Jun 22 '12 at 15:35

Well, the point is that the _zend_hash_merge function is used not only by array_merge but also by the + operator (when both of its operands are arrays).

And while there are some differences in processing, none of them can actually be attributed to a difference in requirements: as far as I know, no one writes + as &$arr + &$arr; it just makes no sense.

So I suppose it's just an error in the documentation.

But one can come to this conclusion without analyzing the abyss of PHP internal code. ) Remember, we use the &$array notation when we pass an array that can be (and most probably will be) changed - see, for example, the array_splice() signature. And (this can be checked very easily) array_replace doesn't change its arguments - at least, at present. )
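The "checked very easily" part can be sketched as follows. This is a minimal check reusing the arrays from the question; the $baseBefore and $replacementsBefore variables are illustrative names added here, not part of the manual's example.

```php
<?php
$base = array("orange", "banana", "apple", "raspberry");
$replacements = array(0 => "pineapple", 4 => "cherry");

// Keep copies of the inputs so they can be compared after the call.
// (Illustrative variable names, not from the manual.)
$baseBefore = $base;
$replacementsBefore = $replacements;

$basket = array_replace($base, $replacements);

// If array_replace() modified its arguments, these would print bool(false).
var_dump($base === $baseBefore);                 // bool(true)
var_dump($replacements === $replacementsBefore); // bool(true)

// Only the returned array carries the replacements.
print_r($basket);
```

Both inputs come back identical, and only $basket contains "pineapple" and "cherry", which is not what one would expect from a function that genuinely needed its arguments by reference.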

UPDATE: well, now I'm angry. If some PHP dev, God bless his soul, actually thinks that it's not a bug in the documentation, let him/her explain why this:

array_pop(array('a' => 1));

... triggers a fatal error (Only variables can be passed by reference), and this...

array_replace(array('a' => 1), array('b' => 2));

... just works, as if nothing happened.

Or do we have two types of references in PHP now?

raina77ow
  • I thought it was a bug also, so I submitted a bug, and I got a reply from someone that this wasn't a bug because the function works as described in the documentation. But it also says in the documentation that "You can pass a variable by reference to a function so the function can modify the variable". – Jon Lyles Jun 22 '12 at 14:55
  • Updated the answer. Amazing: each time I think the PHP support community won't surprise me anymore, it just manages to. – raina77ow Jun 22 '12 at 15:39
  • PHP doesn't always complain about that, e.g. `current(array(1));` but... I think [current](http://www.php.net/current)() is just misdocumented as needing a reference. – goat Jun 22 '12 at 19:39

hypothesis:

Since passing by value involves copying the array, I guess it is faster to pass them by reference.

test it:

<?php 

function ref(array &$array) {
    for($i = 0; $i < count($array); $i++) {
        $array[$i] == 'foo'; //just accessing
    }
}

function val(array $array) {
    for($i = 0; $i < count($array); $i++) {
        $array[$i] == 'foo'; //just accessing
    }
}


//create large array
$array = array();
for($i = 0; $i < 100; $i++) {
    $array[] = $i;
}


echo "Pass by reference\n";
$t1 = microtime(true);
for($i = 0; $i < 10000; $i++) {
    ref($array);
}
$t2 = microtime(true);
echo $t2 - $t1 . "s\n\n";

echo "Pass by value\n";
$t1 = microtime(true);
for($i = 0; $i < 10000; $i++) {
    val($array);
}
$t2 = microtime(true);
echo $t2 - $t1 . "s\n\n";

outputs:

Pass by reference
8.3282010555267s

Pass by value
1.4845979213715s

conclusion:

Obviously it's not for performance reasons.
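A side note on the hypothesis: as far as I know, PHP arrays are copy-on-write, so passing one by value does not duplicate it until the function writes to it. Here is a rough sketch of how one might observe that, assuming memory_get_usage() reflects the duplication; costOfRead() and costOfWrite() are hypothetical helper names made up for this illustration.

```php
<?php
// Hypothetical helpers for illustration; not part of the benchmark above.

// Takes the array by value, only reads from it, and returns the
// change in memory usage observed inside the function.
function costOfRead(array $data) {
    $before = memory_get_usage();
    $first = $data[0];   // reading should not force a copy
    return memory_get_usage() - $before;
}

// Takes the array by value, writes to it once, and returns the
// change in memory usage observed inside the function.
function costOfWrite(array $data) {
    $before = memory_get_usage();
    $data[0] = -1;       // the first write should force PHP to duplicate the array
    return memory_get_usage() - $before;
}

$big = range(1, 100000);

$readCost  = costOfRead($big);
$writeCost = costOfWrite($big);

echo "read-only cost: {$readCost} bytes\n";
echo "write cost:     {$writeCost} bytes\n";
```

If the assumption holds, the write cost should be far larger than the read-only cost, and the caller's $big stays unchanged either way.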

Roman
  • Try calling `ref(array('a' => 1))` and `val(array('a' => 1))` instead. See the difference that small `&` symbol makes? ) – raina77ow Jun 22 '12 at 15:42
  • @raina77ow - actually, passing by reference seems to be slower (look at the test results), so the whole `it's for performance` idea is wrong anyway. – Roman Jun 22 '12 at 15:44

It was a documentation bug, and it has now been fixed.

https://bugs.php.net/bug.php?id=62383

goat