
First of all, I should mention that I dug through the manual and the PHP docs and didn't find an answer. Here's the code I use:

class chomik {

    public $state = 'normal';
    public $name = 'no name';

    public function __construct($name) {
        $this->name = $name;
    }

    public function __toString() {
        return $this->name . " - " . $this->state;
    }
}

function compare($a, $b) {
    echo("$a : $b<br/>");
    if($a != $b) {
        return 0;
    }
    else return 1;
}

$chomik = new chomik('a');
$a = array(5, $chomik, $chomik, $chomik);
$b = array($chomik, 'b', 'c', 'd');
array_diff_uassoc($a, $b, 'compare');

What I thought was that array_diff_uassoc would compare all values of these two arrays and, where the values match, run the key comparison. The output of this code is:

1 : 0
3 : 1
2 : 1
3 : 2
1 : 0
3 : 1
2 : 1
3 : 2
3 : 3
3 : 2
2 : 3
1 : 3
0 : 3

So first of all, why are some pairs (1 : 0 or 3 : 1) duplicated? Does it mean the function forgot that it had already compared these items? I thought it would compare all equal-by-value pairs, but I don't see that in the output. Am I missing something?

So the question is: what is the exact behavior of this function in terms of the order of comparisons, and why do I see these duplicates? (My PHP version, if it helps, is PHP Version 5.3.6-13ubuntu3.6.)

I'm really confused and waiting for a good explanation of it...

PeeHaa
Karol
  • You should probably use strict comparison !== not == in the compare function. – Mārtiņš Briedis Feb 27 '12 at 00:27
  • The comparison itself is not a big deal in this case, to be honest. I'm wondering why `echo` prints such results while comparing. And `echo` is triggered before the comparison, so I think it does not matter whether it's strict or not. – Karol Feb 27 '12 at 00:33
  • What I wanted to achieve writing this code is: I want only these elements which are not in second array ($a[0]), and if they are in the second array, I want these elements which have the same key (index)... So of course function should return only $a[0] – Karol Mar 06 '12 at 01:18
  • I am also confused about whether it is comparing the array indexes or something else. I don't think it actually compares the array indexes. Even with a 3-array comparison using array_diff_uassoc, only 2 parameters are passed to the callback function. – Deepak Lamichhane Feb 16 '13 at 15:56
  • I have two arrays: `$array1 = array("a" => "a", "b" => "b", "c" => "c", "d" => "d");` `$array2 = array("x" => "x","y" => "y","z" => "z");` and it gives me these pairs: `b - a` `b - c` `d - b` `c - b` `d - c` `y - x` `z - y` `a - x` `a - y` `a - z` `b - x` `b - y` `b - z` `c - x` `c - y` `c - z` `d - x` `d - y` `d - z` I have no idea why this function compares elements from the same array. – user1838937 Feb 02 '14 at 20:47
  • @user1838937 this is even more bizarre lol – Karol Feb 02 '14 at 23:06
  • I know it's old, but if this still remotely interests you, I roamed through the source code and came up with [a partial but convincing explanation](http://stackoverflow.com/a/29441375/576767) – Félix Adriyel Gagnon-Grenier Apr 04 '15 at 00:21

3 Answers


From the OP's comment:

I want only these elements which are not in second array ($a[0])

Can't you use array_diff($a, $b)? It returns:

array(1) {
  [0]=>
  int(5)
}
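A minimal, self-contained sketch of that suggestion, reusing the class and arrays from the question. It relies on array_diff() ignoring keys and comparing values by their string form, which is why `__toString()` matters here:

```php
<?php
// Copy of the asker's class; __toString() supplies the string form used for comparison.
class chomik {
    public $state = 'normal';
    public $name = 'no name';

    public function __construct($name) {
        $this->name = $name;
    }

    public function __toString() {
        return $this->name . " - " . $this->state;
    }
}

$chomik = new chomik('a');
$a = array(5, $chomik, $chomik, $chomik);
$b = array($chomik, 'b', 'c', 'd');

// Keys are ignored; every $chomik entry in $a also occurs in $b, so only 5 survives.
$result = array_diff($a, $b);
var_dump($result);
```

The surviving element keeps its original key, so the result is indexed by 0.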

Otherwise, the documentation states that:

The comparison function must return an integer less than, equal to, or greater than zero if the first argument is considered to be respectively less than, equal to, or greater than the second.

As I understand it, that means that the compare() function should be more like this:

function compare($a, $b) {
    echo("$a : $b<br/>");
    if($a === $b) return 0;
    else if ($a > $b) return 1;
    else return -1;
}

However, even with this correction, the comparisons it performs look very strange:

1 : 0
1 : 2
3 : 1
2 : 1
3 : 2
1 : 0
1 : 2
3 : 1
2 : 1
3 : 2
0 : 0
1 : 0
1 : 1
2 : 0
2 : 1
2 : 2
3 : 0
3 : 1
3 : 2
3 : 3

I asked another question about this as it was getting out of the scope of an answer.


I think you missed the return value section.

Returns an array containing all the entries from array1 that are not present in any of the other arrays.

the array keys are used in the comparison.

What the text leaves out is that the comparison is done associatively only. This means that any automatically assigned or user-defined numeric keys are treated as strings, not integers.

So with

$one = array('a', 'b', 'c', 'hot' => 'd'); // 'hot' => 'd' has no key/value match in $two, so it is returned
$two = array('a', 'b', 'c', 'd', 'e', 'f');

Because 'hot' => 'd' in $one does not match 3 => 'd' in $two on an associative level, 'hot' => 'd' is returned.

Because of PHP's quirks when comparing strings and integers, a user-defined function can be used to tighten the comparison with stricter operators such as ===.

This helps in situations where the type is ambiguous: '0' => d and 0 => d might look similar, but they are not the same until you say so in your code.
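One thing worth checking before relying on that distinction: PHP normalizes integer-like string keys when an array is built, so the two spellings may already be the same key by the time any callback sees them. A small sketch, with the variable names `$stringKey` and `$intKey` chosen only for illustration:

```php
<?php
// Both literals are written differently, but PHP casts the key '0' to the integer 0.
$stringKey = array('0' => 'd');
$intKey    = array(0 => 'd');

// Strict array comparison checks keys, values and types.
var_dump($stringKey === $intKey);

// Inspect the actual key type that ends up in the array.
var_dump(array_keys($stringKey));
```

A key such as '00' or 'hot' is not integer-like and stays a string, so the strict operator only makes a difference for keys of that kind.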

Luckily type hinting is coming to PHP7 to rid us of this type of weird construct and unclear documentation.

I am adding this from my comment because it pertains to your understanding of which php constructs are best used in your case. My comment:

I am not so sure about that, since if($a != $b) { in their code is a problem: they are mistakenly using the equality operator when they should be using the identity operator !==. They are also using numerical keys in a construct designed for associative keys. They are probably also unaware of array_udiff, which is a better match for the data involved.
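For reference, a minimal sketch of the array_udiff route on the question's data. The helper name `compareValues` is hypothetical, and ordering values by their string form is an assumption made for this example, not something the question prescribes:

```php
<?php
// Copy of the asker's class, needed so the objects can be cast to strings.
class chomik {
    public $state = 'normal';
    public $name = 'no name';

    public function __construct($name) {
        $this->name = $name;
    }

    public function __toString() {
        return $this->name . " - " . $this->state;
    }
}

// Hypothetical value comparator: a consistent three-way ordering on string forms.
function compareValues($x, $y) {
    return strcmp((string) $x, (string) $y);
}

$chomik = new chomik('a');
$a = array(5, $chomik, $chomik, $chomik);
$b = array($chomik, 'b', 'c', 'd');

// array_udiff() compares values only (keys are ignored) through the callback.
$result = array_udiff($a, $b, 'compareValues');
var_dump($result);
```

The callback must return a negative, zero, or positive integer consistently. A function that only answers "equal or not", like the one in the question, does not give the engine a usable ordering.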

Carl McDade
  • it is pretty clear op is very aware of what you're saying here. – Félix Adriyel Gagnon-Grenier Apr 03 '15 at 14:12
  • I am not so sure about that since if($a != $b) { in their code is a problem. Since they are mistakenly using equality when they should be using identical operators. And they are using numerical keys in a construct designed for associative keys. they are probably also unaware of array_udiff which a better match for the data involved – Carl McDade Apr 03 '15 at 16:28

This is somewhat intriguing indeed. I looked up the latest PHP source on GitHub (which is written in C, as you probably know) and tried to make sense of it. (https://github.com/php/php-src/blob/master/ext/standard/array.c)

A quick search showed me that the function in question is declared on line 4308

PHP_FUNCTION(array_diff_uassoc)
{
    php_array_diff(INTERNAL_FUNCTION_PARAM_PASSTHRU, DIFF_ASSOC, DIFF_COMP_DATA_INTERNAL, DIFF_COMP_KEY_USER);
}

So that shows that the actual work is done by the php_array_diff function, that can be found in that same file on line 3938. It's a bit long to paste it here, 265 lines to be exact, but you can look it up if you want.

That is the point where I gave up. I have no experience in C whatsoever, and it is too late and I'm too tired to try to make sense of it. I suppose the key comparison is done first, as it is probably more performant than comparing the values, but that is just a guess. Anyway, there is probably a good reason why they do it the way they do.

All that is just a long introduction to ask: why would you want to put an echo inside your compare function in the first place? What matters about array_diff_uassoc is the array it returns. You should not rely on how the engine gets there. If they decide tomorrow to change the internal workings of that C function, e.g. to do the value comparison first, you'll get an entirely different sequence of calls.

Perhaps you could use this replacement function that is written in php: http://pear.php.net/reference/PHP_Compat-1.6.0a2/__filesource/fsource_PHP_Compat__PHP_Compat-1.6.0a2CompatFunctionarray_diff_uassoc.php.html

That way you can rely on the behaviour to not change, and you have full control of the internal workings...
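To make the black-box point concrete, here is a minimal sketch that records the key pairs instead of echoing them and then looks only at the returned array. The closure `$compareKeys` and the `$calls` log are illustrative names, not part of the question. The current C source appears to sort each input by key with the callback before walking the arrays side by side, which would explain pairs drawn from the same array, but treat that as an implementation detail that may change:

```php
<?php
// Copy of the asker's class; values are compared internally by their string form.
class chomik {
    public $state = 'normal';
    public $name = 'no name';

    public function __construct($name) {
        $this->name = $name;
    }

    public function __toString() {
        return $this->name . " - " . $this->state;
    }
}

$calls = array(); // every pair of keys the callback was asked to compare

// Hypothetical key comparator: a proper three-way ordering that also logs its arguments.
$compareKeys = function ($x, $y) use (&$calls) {
    $calls[] = array($x, $y);
    if ($x == $y) {
        return 0;
    }
    return ($x > $y) ? 1 : -1;
};

$chomik = new chomik('a');
$a = array(5, $chomik, $chomik, $chomik);
$b = array($chomik, 'b', 'c', 'd');

// An entry of $a is dropped only if $b holds the same key AND the same value.
$result = array_diff_uassoc($a, $b, $compareKeys);

var_dump(array_keys($result)); // the part of the behavior that is safe to depend on
var_dump(count($calls));       // how often the callback ran; varies with the implementation
```

No key/value pair of `$a` also occurs in `$b`, so every entry of `$a` is returned. The number and order of logged pairs are the part that should not be depended on.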

Pevara
  • this is line 3631 of your github link: `} else if (behavior & INTERSECT_ASSOC && key_compare_type == INTERSECT_COMP_KEY_USER) {`. line 3321 is an empty line. It is pretty clear the echo is not made for functionality but rather for testing, and it does show strange results. – Félix Adriyel Gagnon-Grenier Apr 03 '15 at 14:05
  • @FélixGagnon-Grenier sorry, I messed up the line numbers there (I actually downloaded the source from php.net and only later found out the source was on GitHub as well. Thought it would be easier to link there but forgot to check the line numbers. Ctrl/Cmd + F would have brought you there though). Updated now, but since the link points to the current master it is likely to change again in the future. – Pevara Apr 03 '15 at 17:17
  • The point of my answer is that you should consider that function a black box and not rely on its inner workings. It is all about the API and the output, not *how* the function gets to that result, since that may change in the future. – Pevara Apr 03 '15 at 17:21