I am trying to remove duplicates based on 'duplicate_check' from the following array. Neither array_unique nor the super_unique function works. I also tried comparing two identical copies of the array with a nested loop, but it runs out of time because the array has tens of thousands of entries. Any help?

[1] => Array
    (
        [a] => abc
        [b] => 202
        [c] => 001
        [d] => 
        [e] => Graphic Commun
        [duplicate_check] => abc202001
    )

[2] => Array
    (
        [a] => abc
        [b] => 211
        [c] => 001
        [d] => Bard
        [e] => CAD Fundamentals
        [duplicate_check] => abc211001
    )

[3] => Array
    (
        [a] => abc
        [b] => 211
        [c] => 001
        [d] => 
        [e] => 
        [duplicate_check] => abc211001
    )
Rui Xia

1 Answer

I don't know what approach you tried (you should add that to your question), but it seems you can just use a loop to filter the entries:

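$found = array();   // keys seen so far
foreach ($array as $i => $row) {

    // Rebuild the key from the fields that define a duplicate
    $check = "$row[a],$row[b],$row[c]";
    //$check = $row["duplicate_check"];

    if (isset($found[$check])) {
        unset($array[$i]);   // key already seen: drop this row
    } else {
        $found[$check] = true;
    }
}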

A lazy solution (though probably not suited to your task, since it compares entire rows) could also be:

$array = array_map("unserialize", array_unique(array_map("serialize", $array)));
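
The difference between comparing whole rows and comparing a single key can be sketched on the three sample rows from the question. This is only an illustration under some assumptions: `dedupe_by_field` is a hypothetical helper name, the type hints assume PHP 7 or later, and the first row seen for each key is the one kept.

```php
<?php
// Hypothetical helper (not part of the original answer): keeps the first row
// seen for each distinct value of $field and preserves the original keys.
function dedupe_by_field(array $rows, string $field): array
{
    $seen = [];
    foreach ($rows as $i => $row) {
        $key = $row[$field];
        if (isset($seen[$key])) {
            unset($rows[$i]);      // a row with this key was already kept
        } else {
            $seen[$key] = true;    // first time this key appears
        }
    }
    return $rows;
}

// The three sample rows from the question.
$array = [
    1 => ['a' => 'abc', 'b' => '202', 'c' => '001', 'd' => '',
          'e' => 'Graphic Commun', 'duplicate_check' => 'abc202001'],
    2 => ['a' => 'abc', 'b' => '211', 'c' => '001', 'd' => 'Bard',
          'e' => 'CAD Fundamentals', 'duplicate_check' => 'abc211001'],
    3 => ['a' => 'abc', 'b' => '211', 'c' => '001', 'd' => '',
          'e' => '', 'duplicate_check' => 'abc211001'],
];

// Whole-row comparison: rows 2 and 3 differ in 'd' and 'e', so both survive.
$byWholeRow = array_map('unserialize', array_unique(array_map('serialize', $array)));

// Key-based comparison: row 3 shares its key with row 2 and is dropped.
$byField = dedupe_by_field($array, 'duplicate_check');

echo count($byWholeRow), "\n";                  // prints 3
echo implode(',', array_keys($byField)), "\n";  // prints 1,2
```

Because each row costs a single hash lookup, this makes one pass over the array instead of the nested-loop comparison, which should matter with tens of thousands of rows.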
mario
  • @mario I tried the first one but no luck. I did use the second solution with the super_unique function from php.net/manual/en/function.array-unique.php. It removes some duplicates, but not the ones I indicated above. – Rui Xia Mar 20 '11 at 05:47
  • You need to explain the structure of your source array in more detail. For example, it is unclear whether `"duplicate_check"` is already part of the array or is added by super_unique (showing that function would clear things up too). Show an example for which the above approach failed; otherwise explain the filter conditions. – mario Mar 20 '11 at 05:50
  • @mario 'duplicate_check' is already part of the array. Then I call $arr = super_unique($arr). It didn't remove the duplicates I indicated above. I am not sure whether that's because the rows are not completely identical: if you look at [2] and [3], the value of 'duplicate_check' is the same, but [2] has values in 'd' and 'e'. – Rui Xia Mar 20 '11 at 05:58
  • @mario In the indicated case, removing either [2] or [3] is what I want. – Rui Xia Mar 20 '11 at 05:59
  • Then your problem is that `"duplicate_check"` does not cover all fields. So the uniqueness of `a+b+c+d+e` concatenated together is what you want to filter on? – mario Mar 20 '11 at 06:06
  • @mario "duplicate_check" only covers [a], [b], [c], and I want to filter on a+b+c concatenated. – Rui Xia Mar 20 '11 at 17:23
  • Almost thought so. I've changed the code above to recalculate the `duplicate_check`. Maybe that's working for you. (I suspect the precalculated entries are invalid.) – mario Mar 20 '11 at 17:26