
I have an array of the form:

class anim {
    public $qs;
    public $dp;
    public $cg;
    public $timestamp;
}
$animArray = array();

$myAnim = new anim();
$myAnim->qs = "fred";
$myAnim->dp = "shorts";
$myAnim->cg = "dino";
$myAnim->timestamp = 1590157029399;
$animArray[] = $myAnim;

$myAnim = new anim();
$myAnim->qs = "barney";
$myAnim->dp = "tshirt";
$myAnim->cg = "bird";
$myAnim->timestamp = 1590133656330;
$animArray[] = $myAnim;

$myAnim = new anim();
$myAnim->qs = "fred";
$myAnim->dp = "tshirt";
$myAnim->cg = "bird";
$myAnim->timestamp = 1590117032286;
$animArray[] = $myAnim;

How do I create a new array containing only the non-duplicates (and the latest entry where duplicates are found) of $animArray, where a duplicate is defined as:

one where $myAnim->dp has the same value as that of another array element's $myAnim->dp AND the $myAnim->cg from the first and the $myAnim->cg from the second have the same value as each other.

In the example above, only the first element is unique by that definition.

I'm hoping there's an elegant solution. I've been through all the array functions in the PHP manual but can't see how it could be achieved.

I could loop through each array element checking if $myAnim->dp has the same value as that of another array element's $myAnim->dp, saving the matches into a new array and then looping through that new array, checking for its $myAnim->cg matching the $myAnim->cg of any other element in that new array.

A more elegant solution would allow me to change which combination of key-value pairs determines whether there's a duplicate, without having to rewrite much code.

Does such a solution exist?

Thanks for helping this novice :)

3 Answers

While nothing built-in can be used directly out of the box, not much code is needed to handle an arbitrary number of properties to consider for uniqueness. By keeping track of each unique property value in a nested lookup array, we can build an array where the leaf nodes (i.e. the ones that aren't arrays themselves) are the objects.

We do this by keeping a reference (&) to the current level in the array, then continue building our lookup array for each property.

function find_uniques($list, $properties) {
    $lookup = [];
    $unique = [];
    $last_idx = count($properties) - 1;

    // Build our lookup array - the leaf nodes will be the items themselves,
    // located on a level that matches the number of properties to look at
    // to consider a duplicate
    foreach ($list as $item) {
        $current = &$lookup;

        foreach ($properties as $idx => $property) {
            // last level, keep object for future reference
            if ($idx == $last_idx) {
                $current[$item->$property] = $item;
                break;
            } else if (!isset($current[$item->$property])) {
                // otherwise, if not already set, create empty array
                $current[$item->$property] = [];
            }

            // next iteration starts on this level as its current level
            $current = &$current[$item->$property];
        }
    }

    // array_walk_recursive only calls the callback for leaf nodes - i.e. our items.
    array_walk_recursive($lookup, function ($item) use (&$unique) {
        $unique[] = $item;
    });

    return $unique;
}

Called with your data above, and with the requirement that uniques and the last element of each set of duplicates are returned, we get the following result:

var_dump(find_uniques($animArray, ['dp', 'cg']));

array(2) {
  [0] =>
  class anim#1 (4) {
    public $qs =>
    string(4) "fred"
    public $dp =>
    string(6) "shorts"
    public $cg =>
    string(4) "dino"
    public $timestamp =>
    int(1590157029399)
  }
  [1] =>
  class anim#3 (4) {
    public $qs =>
    string(4) "fred"
    public $dp =>
    string(6) "tshirt"
    public $cg =>
    string(4) "bird"
    public $timestamp =>
    int(1590117032286)
  }
}

This maps to element [0] and element [2] in your example. If you instead want to keep the first object for duplicates, add an isset check on the last level so that an object already stored for that combination is not overwritten:

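foreach ($properties as $idx => $property) {
    if ($idx == $last_idx) {
        // last level: only store the object if this combination is new
        if (!isset($current[$item->$property])) {
            $current[$item->$property] = $item;
        }

        break;
    } else if (!isset($current[$item->$property])) {
        // create the next level only once, so earlier entries are not wiped
        $current[$item->$property] = [];
    }

    // next iteration starts on this level as its current level
    $current = &$current[$item->$property];
}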

It's important to note that this has been written with the assumption that the items you want to check for uniqueness are not arrays themselves (since we're looking up properties with -> and since we're using array_walk_recursive to find anything that isn't an array).
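
The question asks for the latest entry among duplicates, while the function above keeps whichever duplicate comes last in the array. If "latest" is meant by timestamp, a single pass that compares timestamps may be closer to the intent. The following is only a minimal sketch and not part of the original answer: the name find_latest_uniques and the $time_property parameter are hypothetical, stdClass objects stand in for the anim instances, and the property values are assumed to be scalars.

```php
<?php
// Hypothetical helper (not from the answer above): for each combination of
// $properties, keep the item with the greatest $time_property value.
function find_latest_uniques(array $list, array $properties, string $time_property = 'timestamp'): array
{
    $latest = [];

    foreach ($list as $item) {
        // Collect the values that define a duplicate, in a fixed order.
        $values = [];
        foreach ($properties as $property) {
            $values[] = $item->$property;
        }

        // serialize() keeps the boundaries between values unambiguous,
        // unlike plain string concatenation.
        $key = serialize($values);

        if (!isset($latest[$key]) || $item->$time_property > $latest[$key]->$time_property) {
            $latest[$key] = $item;
        }
    }

    // Drop the internal keys; order follows the first appearance of each combination.
    return array_values($latest);
}

// Stand-ins for the anim objects from the question.
$animArray = [
    (object) ['qs' => 'fred',   'dp' => 'shorts', 'cg' => 'dino', 'timestamp' => 1590157029399],
    (object) ['qs' => 'barney', 'dp' => 'tshirt', 'cg' => 'bird', 'timestamp' => 1590133656330],
    (object) ['qs' => 'fred',   'dp' => 'tshirt', 'cg' => 'bird', 'timestamp' => 1590117032286],
];

$result = find_latest_uniques($animArray, ['dp', 'cg']);

foreach ($result as $item) {
    echo $item->qs, ' ', $item->dp, ' ', $item->cg, PHP_EOL;
}
```

With the sample data this keeps the "barney" entry for the tshirt/bird combination, because its timestamp is greater than that of the third element, even though the third element comes later in the array.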

MatsLindh
  • Thanks for your answer. Unfortunately, it doesn't correctly identify uniques when I extend the array data with the following: $myAnim = new anim(); $myAnim->qs = "wilma"; $myAnim->dp = "shorts"; $myAnim->cg = "bird"; $myAnim->timestamp = 1590117035383; $animArray[] = $myAnim; $myAnim = new anim(); $myAnim->qs = "pebbles"; $myAnim->dp = "tshirt"; $myAnim->cg = "bird"; $myAnim->timestamp = 1590117038461; $animArray[] = $myAnim; – Mark Highton Ridley May 24 '20 at 12:36
  • What's wrong about the answer? It'd be helpful if you could at least explain :-) – MatsLindh May 24 '20 at 15:12
  • I think I see what you're thinking of. Fixed. – MatsLindh May 24 '20 at 16:40
  • Thanks @MatsLindh - I'll try it again :) – Mark Highton Ridley May 24 '20 at 19:52
  • I made yours the accepted answer, Mats, even though @AbraCadaver's solution worked perfectly as well. I did so because yours was more readable / understandable to a novice like me. – Mark Highton Ridley May 24 '20 at 21:31

This was fun:

array_multisort(array_column($animArray, 'timestamp'), SORT_DESC, $animArray);

$result = array_intersect_key($animArray,
          array_unique(array_map(function($v) { return $v->dp.'-'.$v->cg; }, $animArray)));
  • First, extract the timestamp and sort that array descending, thereby sorting the original array.
  • Then, map to create a new array using the dp and cg combinations.
  • Next, make the combination array unique which will keep the first duplicate encountered (that's why we sorted descending).
  • Finally, get the intersection of keys of the original array and the unique one.
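
The four steps above can also be written out with named intermediate variables, which makes it easier to inspect what each stage produces. This is only an illustrative sketch: the variable names $timestamps, $combos and $uniqueCombos are made up here, and stdClass objects stand in for the anim instances from the question.

```php
<?php
// Stand-ins for the anim objects from the question.
$animArray = [
    (object) ['qs' => 'fred',   'dp' => 'shorts', 'cg' => 'dino', 'timestamp' => 1590157029399],
    (object) ['qs' => 'barney', 'dp' => 'tshirt', 'cg' => 'bird', 'timestamp' => 1590133656330],
    (object) ['qs' => 'fred',   'dp' => 'tshirt', 'cg' => 'bird', 'timestamp' => 1590117032286],
];

// Step 1: sort the timestamps descending; $animArray is reordered to match.
$timestamps = array_column($animArray, 'timestamp');
array_multisort($timestamps, SORT_DESC, $animArray);

// Step 2: one string per object describing its dp/cg combination.
$combos = array_map(function ($v) {
    return $v->dp . '-' . $v->cg;
}, $animArray);

// Step 3: array_unique keeps the first key of each repeated value,
// which after the descending sort belongs to the newest object.
$uniqueCombos = array_unique($combos);

// Step 4: keep only the objects whose keys survived step 3.
$result = array_intersect_key($animArray, $uniqueCombos);

print_r(array_keys($result));
```

For the sample data the surviving keys are 0 and 1, i.e. the shorts/dino entry and the newer of the two tshirt/bird entries.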

In a function with dynamic properties:

function array_unique_custom($array, $props) {

    array_multisort(array_column($array, 'timestamp'), SORT_DESC, $array);

    $result = array_intersect_key($array,
              array_unique(array_map(function($v) use ($props) {
                  return implode('-', array_map(function($p) use($v) { return $v->$p; }, $props));
              },
              $array)));

    return $result;
}
$result = array_unique_custom($animArray, ['dp', 'cg']);

Another option would be to sort it ascending and then build an array with a dp and cg combination as the key, which will keep the last duplicate:

array_multisort(array_column($animArray, 'timestamp'), SORT_ASC, $animArray);

$result = [];
foreach($animArray as $v) {
    $result[$v->dp.'-'.$v->cg] = $v;
}

In a function with dynamic properties:

function array_unique_custom($array, $props) {

    array_multisort(array_column($array, 'timestamp'), SORT_ASC, $array);

    $result = [];
    foreach($array as $v) {
        // join with a delimiter so adjacent values don't run together
        $key = implode('-', array_map(function($p) use($v) { return $v->$p; }, $props));
        $result[$key] = $v;
    }
    return $result;
}
$result = array_unique_custom($animArray, ['dp', 'cg']);
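
One caveat with keys built by joining values: a delimiter only helps as long as it never occurs inside a value. The short sketch below shows such a collision and one possible way around it. The helper name composite_key is hypothetical, and using serialize() for the key is a suggestion made here, not something taken from the code above. Note that serialize() also distinguishes types, so the integer 1 and the string "1" would produce different keys.

```php
<?php
// Two different pairs of values...
$first  = ['a-b', 'c'];
$second = ['a', 'b-c'];

// ...produce the same key when joined with '-'.
var_dump(implode('-', $first) === implode('-', $second)); // bool(true)

// Hypothetical helper: serialize() records each value's type and length,
// so the two lists get distinct keys.
function composite_key(array $values): string
{
    return serialize($values);
}

var_dump(composite_key($first) === composite_key($second)); // bool(false)
```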
AbraCadaver
  • Be aware that the usage of implode can create false duplicates; i.e. if one value's postfix matches the prefix of another value - `implode(['foo', 'bar'])` will give the same key as `implode(['foob', 'ar'])`. It'll be slightly better with a separation character, but again you might hit the same issue if that character is part of the value. – MatsLindh May 22 '20 at 19:21
  • @MatsLindh Good catch, added a delimiter. – AbraCadaver May 23 '20 at 18:31
  • @AbraCadaver: Your option works well: "Another option would be to sort it ascending and then build an array with a dp and cg combination as the key, which will keep the last duplicate" – Mark Highton Ridley May 24 '20 at 19:53
  • I already accepted another answer. I would like to accept both but if I accept yours, the other becomes unaccepted. – Mark Highton Ridley May 26 '20 at 22:08
//Create an array with dp and cg values only
$new_arr = [];
foreach($animArray as $key=>$item) {
    $new_arr[] = $item->dp.','.$item->cg;
}
$cvs = array_count_values($new_arr);
$final_array = [];
foreach($cvs as $cvs_key=>$occurences) {
    if ($occurences == 1) {
        $filter_key = array_keys($new_arr, $cvs_key)[0];         
        $final_array[$filter_key] = $animArray[$filter_key];    
    }
}

The final result would be (from your example) in $final_array:

[0] => anim Object
    (
        [qs] => fred
        [dp] => shorts
        [cg] => dino
        [timestamp] => 1590157029399
    )

Some explanation:

//Create a new array based on your array of objects with the attributes dp and cg
//with a comma  between them
$new_arr = [];
foreach($animArray as $key=>$item) {
    $new_arr[] = $item->dp.','.$item->cg;
}
/*
$new_arr now contains:

    [0] => shorts,dino
    [1] => tshirt,bird
    [2] => tshirt,bird
*/

//Use the built-in function array_count_values to get the number of
//occurrences of each value in an array
$cvs = array_count_values($new_arr);

/*
$cvs would contain:

(
    [shorts,dino] => 1
    [tshirt,bird] => 2
)
*/

//Iterate through the $cvs array.
//Where there is only one occurrence (no duplicates),
//add the item to the final array $final_array
$final_array = [];
foreach($cvs as $cvs_key=>$occurences) {
    if ($occurences == 1) {

        /*
        array_keys with a second argument searches $new_arr for
        entries whose value equals $cvs_key and returns their keys,

        so basically search for:
        shorts,dino and retrieve the key 0 (first element)
        */
        $filter_key = array_keys($new_arr, $cvs_key)[0];         

        /*
        Add a new item to the $final_array based on the key in
        the original array $animArray
        if you don't want the original key in the new array
        you could just do $final_array[] instead of 
        $final_array[$filter_key]
        */
        $final_array[$filter_key] = $animArray[$filter_key];    
    }
}

You said you would like to be able to test different attributes. I believe that would just be a matter of wrapping this in a function/method that takes the attribute names as arguments, e.g. $attr1 ('dp') and $attr2 ('cg') or similar.
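
As a rough sketch of that idea, the attribute names could be passed as a list rather than as two fixed arguments. The function name filter_non_duplicates and its parameters are hypothetical, stdClass objects stand in for the anim instances, and the comma-joined key assumes that the attribute values themselves never contain a comma.

```php
<?php
// Hypothetical generalisation: $attrs lists the properties that define a duplicate.
function filter_non_duplicates(array $items, array $attrs): array
{
    // One string per item, e.g. "shorts,dino"; array_map keeps the original keys.
    $combos = array_map(function ($item) use ($attrs) {
        return implode(',', array_map(function ($attr) use ($item) {
            return $item->$attr;
        }, $attrs));
    }, $items);

    // How often does each combination occur?
    $counts = array_count_values($combos);

    // Keep the items whose combination occurs exactly once, preserving keys.
    return array_filter($items, function ($key) use ($combos, $counts) {
        return $counts[$combos[$key]] === 1;
    }, ARRAY_FILTER_USE_KEY);
}

// Stand-ins for the anim objects from the question.
$animArray = [
    (object) ['qs' => 'fred',   'dp' => 'shorts', 'cg' => 'dino', 'timestamp' => 1590157029399],
    (object) ['qs' => 'barney', 'dp' => 'tshirt', 'cg' => 'bird', 'timestamp' => 1590133656330],
    (object) ['qs' => 'fred',   'dp' => 'tshirt', 'cg' => 'bird', 'timestamp' => 1590117032286],
];

$byDpAndCg = filter_non_duplicates($animArray, ['dp', 'cg']); // only key 0 remains
$byQs      = filter_non_duplicates($animArray, ['qs']);       // only key 1 remains
```

Changing which attributes define a duplicate then only means changing the second argument.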


UPDATE

I had not grasped that you wanted the last value as well. This actually seemed like an easier task. Maybe I am missing something, but it was fun to come up with a different approach than the other answers :-)

//Create an array with dp and cg values only
$new_arr = [];
foreach($animArray as $key=>$item) {
    $new_arr[] = $item->dp.','.$item->cg;
}

//Sort keys descending order
krsort($new_arr); 

//Because the keys are now in descending order, array_unique keeps the
//last item of each set of duplicates
$new_arr2 = array_unique($new_arr); 

//Switch order of keys back to normal (ascending)
ksort($new_arr2); 

//Create a new array based on the keys set in $new_arr2
$final_arr = [];
foreach($new_arr2 as $key=>$item) {
    $final_arr[] = $animArray[$key];
}

The contents of $final_arr would be (in your example):

Array
(
    [0] => anim Object
        (
            [qs] => fred
            [dp] => shorts
            [cg] => dino
            [timestamp] => 1590157029399
        )

    [1] => anim Object
        (
            [qs] => fred
            [dp] => tshirt
            [cg] => bird
            [timestamp] => 1590117032286
        )

)
bestprogrammerintheworld