5

I need to remove rows from my input array where duplicate values occur in a specific column.

Sample array:

$array = [
    ['user_id' => 82, 'ac_type' => 1],
    ['user_id' => 80, 'ac_type' => 5],
    ['user_id' => 76, 'ac_type' => 1],
    ['user_id' => 82, 'ac_type' => 1],
    ['user_id' => 80, 'ac_type' => 5]
];

I'd like to filter by user_id to ensure uniqueness and achieve this result:

[
    ['user_id' => 82, 'ac_type' => 1],
    ['user_id' => 80, 'ac_type' => 5],
    ['user_id' => 76, 'ac_type' => 1]
]

I've already tried with:

$result = array_unique($array, SORT_REGULAR);

and

$result = array_map("unserialize", array_unique(array_map("serialize", $array)));

and

$results = array();
foreach ($array as $k => $v) {
    $results[implode($v)] = $v;
}
$results = array_values($results);
print_r($results);

but duplicate rows still exist.

mickmackusa
Roxx
  • Well, `array_unique()` only matches full duplicates, and you want to match on the user ID alone, so that isn't going to work. I faced a similar problem a while ago; I'm trying to find it so I can remember how I solved it. – icecub Aug 10 '17 at 03:02
  • @icecub The duplicates are almost identical. Elements 0 and 3 are identical, but array_unique is not working in this case either. – Roxx Aug 10 '17 at 03:05
  • Why don't you get rid of the duplicates before you add the data to your array? If you're pulling data from a database, you can match the results, remove the similar values, and then insert the rest into the array. – S4NDM4N Aug 10 '17 at 03:06
  • @Sand The data comes from two separate queries whose results are added to the array. I am not sure where you want me to remove the duplicates. – Roxx Aug 10 '17 at 03:09
  • I found a good solution for the same problem here: https://vijayasankarn.wordpress.com/2017/02/20/array_unique-for-multidimensional-array – Mahfuzul Hasan Mar 09 '23 at 21:45

4 Answers

8

For a clearer "minimal, complete, verifiable example", I'll use the following input array in my demos:

$array = [
    ['user_id' => 82, 'ac_type' => 1],
    ['user_id' => 80, 'ac_type' => 5],
    ['user_id' => 76, 'ac_type' => 1],
    ['user_id' => 82, 'ac_type' => 2],
    ['user_id' => 80, 'ac_type' => 5]
];
// elements [0] and [3] have the same user_id, but different ac_type
// elements [1] and [4] have identical row data
  1. Unconditionally push rows into a result array and assign associative first-level keys, then re-index with array_values(). This approach overwrites earlier duplicate rows with later occurring ones.

    array_column demo:

    var_export(array_values(array_column($array, null, 'user_id')));
    

    foreach demo:

    $result = [];
    foreach ($array as $row) {
        $result[$row['user_id']] = $row;
    }
    var_export(array_values($result));
    

    Output:

    [
        ['user_id' => 82, 'ac_type' => 2], // was input row [3]
        ['user_id' => 80, 'ac_type' => 5], // was input row [4]
        ['user_id' => 76, 'ac_type' => 1]  // was input row [2]
    ]
    
  2. Use a condition or the null coalescing assignment operator to preserve the first occurring row while removing duplicates.

    foreach null coalescing assignment demo:

    foreach ($array as $a) {
        $result[$a['user_id']] ??= $a; // only store if first occurrence of user_id
    }
    var_export(array_values($result)); // re-index and print
    

    foreach isset demo:

    foreach ($array as $a) {
        if (!isset($result[$a['user_id']])) {
            $result[$a['user_id']] = $a; // only store if first occurrence of user_id
        }
    }
    var_export(array_values($result)); // re-index and print
    

    Output:

    [
        ['user_id' => 82, 'ac_type' => 1], // was input row [0]
        ['user_id' => 80, 'ac_type' => 5], // was input row [1]
        ['user_id' => 76, 'ac_type' => 1]  // was input row [2]
    ]
    
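  3. It is also possible to preserve the first occurring rows without a condition, either by reversing the input before unconditionally overwriting or by merging each row beneath the accumulated result. Be aware that the row order of the output may differ from the input (if it matters to you).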

    array_reverse, array_column demo:

    var_export(array_values(array_column(array_reverse($array), null, 'user_id')));
    

    array_reduce demo:

    var_export(
        array_values(
            array_reduce(
                $array,
                fn($res, $row) => array_replace([$row['user_id'] => $row], $res),
                []
            )
        )
    );
    

    foreach array_reverse demo:

    $result = [];
    foreach (array_reverse($array) as $row) {
        $result[$row['user_id']] = $row;
    }
    var_export(array_values($result));
    

    Output:

    [
        ['user_id' => 80, 'ac_type' => 5], // was input row [1]
        ['user_id' => 82, 'ac_type' => 1], // was input row [0]
        ['user_id' => 76, 'ac_type' => 1]  // was input row [2]
    ]
    

A warning about a fringe case not expressed in this example: if you are using row values as identifiers that may be corrupted upon being used as keys, the above techniques will give unreliable results. For instance, PHP does not keep float values as keys: they are truncated to integers, and since PHP 8.1 a deprecation notice is emitted when the fractional part is lost. Only in these fringe cases might you consider using inefficient, iterated calls of in_array() to evaluate uniqueness.
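
To make the fringe case concrete, here is a minimal sketch of the in_array() fallback. The sample rows and the column names `score` and `label` are hypothetical and not from the question; they are chosen so that both 1.5 and 1.2 would collapse to the key 1 if used as array keys.

```php
<?php
// Hypothetical sample data: float identifiers that would be truncated as array keys.
$rows = [
    ['score' => 1.5, 'label' => 'a'],
    ['score' => 1.2, 'label' => 'b'],
    ['score' => 1.5, 'label' => 'c'],
];

$seen = [];   // identifiers already encountered, stored as values (not keys)
$result = [];
foreach ($rows as $row) {
    // third argument enables strict comparison, avoiding loose type juggling
    if (!in_array($row['score'], $seen, true)) {
        $seen[] = $row['score'];
        $result[] = $row; // keep only the first row for each identifier
    }
}
var_export($result);
```

This keeps the rows labelled 'a' and 'b'. The trade-off is that every iteration scans `$seen`, so the cost grows quadratically with the number of distinct identifiers; the key-based techniques above remain preferable whenever the identifiers are safe to use as keys.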


Using array_unique(..., SORT_REGULAR) is only suitable when determining uniqueness by ENTIRE rows of data.

array_unique demo:

var_export(array_unique($array, SORT_REGULAR));

Output:

[
    ['user_id' => 82, 'ac_type' => 1], // was input row [0]
    ['user_id' => 80, 'ac_type' => 5], // was input row [1]
    ['user_id' => 76, 'ac_type' => 1], // was input row [2]
    ['user_id' => 82, 'ac_type' => 2]  // was input row [3]
]

As a slight extension of requirements, if uniqueness must be determined based on more than one column, but not all columns, then use a "composite key" composed of the meaningful column values. The following uses the null coalescing assignment operator, but the other techniques from #2 and #3 can also be implemented.

Code:

foreach ($array as $row) {
    $compositeKey = $row['user_id'] . '_' . $row['ac_type'];
    $result[$compositeKey] ??= $row;      // only store if first occurrence of compositeKey
}
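
The composite-key idea can also be wrapped in a reusable function. The sketch below is only one possible shape: `unique_by_columns()` is a hypothetical helper name, not a PHP built-in. It builds the key with json_encode() instead of the underscore concatenation shown above. That swap is my own assumption, made so that values which themselves contain the delimiter cannot collide.

```php
<?php
/**
 * Hypothetical helper (not a built-in): keep the first row for each
 * distinct combination of values in the given columns.
 */
function unique_by_columns(array $rows, array $columns): array
{
    $result = [];
    foreach ($rows as $row) {
        $parts = [];
        foreach ($columns as $column) {
            $parts[] = $row[$column];
        }
        // json_encode keeps the individual values distinguishable,
        // so ['1_2', '3'] and ['1', '2_3'] produce different keys
        $compositeKey = json_encode($parts);
        $result[$compositeKey] ??= $row; // ??= requires PHP 7.4 or later
    }
    return array_values($result); // re-index
}

$array = [
    ['user_id' => 82, 'ac_type' => 1],
    ['user_id' => 80, 'ac_type' => 5],
    ['user_id' => 76, 'ac_type' => 1],
    ['user_id' => 82, 'ac_type' => 2],
    ['user_id' => 80, 'ac_type' => 5],
];

var_export(unique_by_columns($array, ['user_id', 'ac_type']));
```

With both columns, only input row [4] is dropped because it repeats row [1]. Passing `['user_id']` alone reduces this to the single-column behaviour from #2.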

Though I have never used it, the Ouzo Goodies library seems to have a uniqueBy() method that is relevant to this topic. See the unexplained snippet here.

mickmackusa
  • I'm not sure if this is what he wants. He wants to remove duplicates based on the user_id only. This will preserve the _last_ duplicate and remove all previous ones. I think he wants to preserve the first match and remove all duplicates after? – icecub Aug 10 '17 at 04:12
  • You are correct. If this makes a difference to the actual application, I'll need to rework it to retain the first occurrences. – mickmackusa Aug 10 '17 at 04:13
  • It's a great explanation. Thanks, [mickmackusa](https://stackoverflow.com/users/2943403/mickmackusa). – heySushil Jan 12 '23 at 05:43
3
$array = [
    ['user_id'=>82,'ac_type'=>1],
    ['user_id'=>80,'ac_type'=>5],
    ['user_id'=>76,'ac_type'=>1],
    ['user_id'=>82,'ac_type'=>2],
    ['user_id'=>80,'ac_type'=>6]
];

$array = array_reverse($array);

$v = array_reverse( 
    array_values( 
        array_combine( 
            array_column($array, 'user_id'),
            $array
        )
    )
);


echo '<pre>';
var_dump($v);

Result:

array(3) {
  [0]=>
  array(2) {
    ["user_id"]=>
    int(76)
    ["ac_type"]=>
    int(1)
  }
  [1]=>
  array(2) {
    ["user_id"]=>
    int(82)
    ["ac_type"]=>
    int(1)
  }
  [2]=>
  array(2) {
    ["user_id"]=>
    int(80)
    ["ac_type"]=>
    int(5)
  }
}
Thanh Nguyen
1

Took me a while, but this should work (explanation in comments):

<?php

/* Example array */
$result = array(
    0 => array(
        "user_id" => 82,
        "ac_type" => 1
        ),
    1 => array(
        "user_id" => 80,
        "ac_type" => 5
        ),
    2 => array(
        "user_id" => 76,
        "ac_type" => 1
        ),
    3 => array(
        "user_id" => 82,
        "ac_type" => 2
        ),
    4 => array(
        "user_id" => 80,
        "ac_type" => 2
        )
);

/* Function to get the keys of duplicate values */
function get_keys_for_duplicate_values($my_arr, $clean = false) {
    if ($clean) {
        return array_unique($my_arr);
    }

    $dups = $new_arr = array();
    foreach ($my_arr as $key => $val) {
      if (!isset($new_arr[$val])) {
         // First time this value is seen: remember the key that holds it
         $new_arr[$val] = $key;
      } else {
         // Value was seen before: record this key as a duplicate.
         // (Checking isset($dups[$val]) here would mix value-indexed and
         // sequential keys and can fail when a value equals an existing index.)
         $dups[] = $key;
      }
    }
    return $dups;
}

/* Create a new array with only the user_id values in it */
$userids = array_combine(array_keys($result), array_column($result, "user_id"));

/* Search for duplicate values in the newly created array and return their keys */
$dubs = get_keys_for_duplicate_values($userids);

/* Unset all the duplicate keys from the original array */
foreach($dubs as $key){
    unset($result[$key]);
}

/* Re-arrange the original array keys */
$result = array_values($result);

echo '<pre>';
print_r($result);
echo '</pre>';

?>

The function was taken from the answer to this question: Get the keys for duplicate values in an array

Output:

Array
(
    [0] => Array
        (
            [user_id] => 82
            [ac_type] => 1
        )

    [1] => Array
        (
            [user_id] => 80
            [ac_type] => 5
        )

    [2] => Array
        (
            [user_id] => 76
            [ac_type] => 1
        )

)
icecub
1

Tested and working example.

<?php 

$details = array('0'=> array('user_id'=>'82', 'ac_type'=>'1'), '1'=> array('user_id'=>'80', 'ac_type'=>'5'), '2'=>array('user_id'=>'76', 'ac_type'=>'1'), '3'=>array('user_id'=>'82', 'ac_type'=>'1'), '4'=>array('user_id'=>'80', 'ac_type'=>'5'));

function unique_multidim_array($array, $key) {
    $temp_array = array();
    $i = 0;
    $key_array = array();

    foreach ($array as $val) {
        if (!in_array($val[$key], $key_array)) {
            $key_array[$i] = $val[$key];
            $temp_array[$i] = $val;
        }
        $i++;
    }
    return $temp_array;
}
?> 

<?php 
$details = unique_multidim_array($details,'user_id'); 
?> 

 <pre>

 <?php print_r($details); ?>

</pre> 

Will output:

Array
(
[0] => Array
    (
        [user_id] => 82
        [ac_type] => 1
    )

[1] => Array
    (
        [user_id] => 80
        [ac_type] => 5
    )

[2] => Array
    (
        [user_id] => 76
        [ac_type] => 1
    )
)

Taken from the user-contributed notes at http://php.net/manual/en/function.array-unique.php.

Michael GEDION