1

Is it possible to compare a string against a string one level deeper (inside an array) using array_uintersect()? Or does some sorting take place, so that the parameters may be swapped behind the scenes before they are passed to the value-compare function (callback)?

The purpose is to remove duplicates from $urls.

$urls: Array
(
    [0] => Array
        (
            [url] => https://www.example.com/
            [parent_url] => https://www.example.com/bleh/bleh.aspx
        )

    [1] => Array
        (
            [url] => https://www.example.com/
            [parent_url] => https://www.example.com/bla/bla.aspx
        )
)

$urls_uniq: Array
(
    [1] => https://www.example.com/
    [2] => https://www.example.com/go/173.aspx
)

function compareDeepValue($val1, $val2)
{
    if (is_array($val1) && empty($val1)) {
        return 0;
    }

    // here I assume val1 is always an array (elements
    // from $urls) and val2 is always a string (elements from urls_uniq)
    return strcmp($val1['url'], $val2);
}

$intersect = array_uintersect($urls, $urls_uniq, 'compareDeepValue');

It gives me this error at the callback function (swapping vars does not help):

strcmp(): Argument #1 ($string1) must be of type string, array given

mickmackusa
Mat90

2 Answers

0

The strcmp function requires string parameters for comparison.

Do all instances of a url in $urls from $urls_uniq need to be removed, or should there just be 1 instance regardless of the parent_url?

If array_uintersect is absolutely necessary, I'd try === comparisons instead of strcmp.

  • I know strcmp requires strings, therefore I used the ['url'] construction (a string within the array). Unique urls in $urls should be kept (regardless of parent_url). – Mat90 Dec 23 '21 at 20:05
0

As an optimization, array_uintersect() does some sorting under the hood, which is why it expects a three-way comparison result (a negative integer, zero, or a positive integer). This means you cannot assume that the first parameter passed to your callback always comes from the first array and the second parameter from the second array. It also means that you should not simply return 0 on qualifying outcomes and 1 on disqualifying outcomes.
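
To see the mixing for yourself, a small sketch along these lines records the type of each pair handed to the callback. The $seen log and the sample data are my own illustration, and the exact sequence of calls is an implementation detail that may differ between PHP versions.

```php
$urls = [
    ['url' => 'https://www.example.com/', 'parent_url' => 'https://www.example.com/bleh/bleh.aspx'],
    ['url' => 'https://www.example.com/', 'parent_url' => 'https://www.example.com/bla/bla.aspx'],
];
$urls_uniq = ['https://www.example.com/', 'https://www.example.com/go/173.aspx'];

$seen = []; // hypothetical log: "typeA vs typeB" => number of calls

$result = array_uintersect(
    $urls,
    $urls_uniq,
    function ($a, $b) use (&$seen) {
        // Record which kinds of values were paired in this call.
        $pair = gettype($a) . ' vs ' . gettype($b);
        $seen[$pair] = ($seen[$pair] ?? 0) + 1;

        // Normalise both sides so the comparison is safe
        // whichever array each argument came from.
        return strcmp($a['url'] ?? $a, $b['url'] ?? $b);
    }
);

print_r($seen);
```

If the log contains pairs where both arguments are arrays, or both are strings, the callback is being used to compare elements of the same input array, which is exactly the case the original compareDeepValue() does not handle.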

In my snippet below, I use the null coalescing operator when accessing the ['url'] element of each argument. If an argument is not an array or has no url key, the argument itself is used, which is appropriate for your sample data. For stability, use a technique that provides a three-way return value.

Code: (Demo)

$urls = [
    ['url' => 'https://www.example.com/', 'parent_url' => 'https://www.example.com/bleh/bleh.aspx'],
    ['url' => 'https://www.example2.com/', 'parent_url' => 'https://www.example2.com/blar.aspx'],
    'not an array',
    ['url' => 'https://www.example.com/', 'parent_url' => 'https://www.example.com/bla/bla.aspx'],
];

$urls_uniq = ['https://www.example.com/', 'https://www.example.com/go/173.aspx'];

var_export(
    array_uintersect(
        $urls,
        $urls_uniq,
        function($a, $b) {
            // echo json_encode($a) . ' VS ' . json_encode($b) . "\n"; // uncomment to see what's happening
            // return ($a['url'] ?? null) === $b ? 0 : 1;  // this doesn't work.
            // return ($a['url'] ?? $a) === ($b['url'] ?? $b) ? 0 : 1;  // this doesn't work
            return strcmp($a['url'] ?? $a, $b['url'] ?? $b);
            // or return ($a['url'] ?? $a) <=> ($b['url'] ?? $b);
        }
    )
);

Output:

array (
  0 => 
  array (
    'url' => 'https://www.example.com/',
    'parent_url' => 'https://www.example.com/bleh/bleh.aspx',
  ),
  3 => 
  array (
    'url' => 'https://www.example.com/',
    'parent_url' => 'https://www.example.com/bla/bla.aspx',
  ),
)
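
If the underlying goal is only to drop repeated url values from $urls, array_uintersect() may not be needed at all. Here is a minimal sketch, assuming the first occurrence of each url should be kept and every row has a url key. The names $seenUrls and $deduped are made up for this illustration.

```php
$urls = [
    ['url' => 'https://www.example.com/', 'parent_url' => 'https://www.example.com/bleh/bleh.aspx'],
    ['url' => 'https://www.example2.com/', 'parent_url' => 'https://www.example2.com/blar.aspx'],
    ['url' => 'https://www.example.com/', 'parent_url' => 'https://www.example.com/bla/bla.aspx'],
];

$seenUrls = []; // url => true, used as a lookup set
$deduped = [];  // rows kept, with their original keys

foreach ($urls as $key => $row) {
    if (isset($seenUrls[$row['url']])) {
        continue; // this url was already kept, skip the repeat
    }
    $seenUrls[$row['url']] = true;
    $deduped[$key] = $row;
}

var_export(array_keys($deduped));
```

With this sample data the rows at keys 0 and 1 are kept and the row at key 2 is dropped, because its url repeats the one at key 0.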
mickmackusa