
The following script removes duplicates from an array based on a single key. I found it in this reference: remove duplicates from array (array unic by key)

The problem is that the $initial_data array may contain items with the same [Post_Date] value but different [Item_Title] values.

Is there a way to modify the code so that it only removes an item as a duplicate when both the [Post_Date] and [Item_Title] values are identical?

    // Remove duplicates based on 'Post_Date'
    $_data = array();
    foreach ($initial_data as $v) {
        if (isset($_data[$v['Post_Date']])) {
            continue;
        }
        $_data[$v['Post_Date']] = $v;
    }
    // If you need a zero-based array; otherwise work with $_data
    $unique_results = array_values($_data);

Below is a simplified output of the arrays showing 4 fields. The original arrays contain 16 fields.

$initial_data: Original Data Array. The [Post_Date] values are the same but the [Item_Title] values are different.

Array
(
    [0] => Array
        (
            [id] => 22000
            [Category] => vehicles
            [Post_Date] => 1356373690
            [Item_Title] => Car Painting
        )

    [1] => Array
        (
            [id] => 22102
            [Category] => vehicles
            [Post_Date] => 1356373690
            [Item_Title] => Car Repair

        )
...
)

$_data: The $_data array from within the script

Array
(
    [1356373690] => Array
        (
            [id] => 22000
            [Category] => vehicles
            [Post_Date] => 1356373690
            [Item_Title] => Car Painting
        )

    [1356373690] => Array
        (
            [id] => 22102
            [Category] => vehicles
            [Post_Date] => 1356373690
            [Item_Title] => Car Repair

        )
...
)

$unique_results: The final unique results array. As you can see, the script removed the second item as a duplicate based on [Post_Date] alone. I need it to also compare the [Item_Title] values, so that items with the same [Post_Date] but different [Item_Title] are not treated as duplicates.

Array
(
    [0] => Array
        (
            [id] => 22000
            [Category] => vehicles
            [Post_Date] => 1356373690
            [Item_Title] => Car Painting
        )
...
)
Sammy
  • @Mike Brant I have tried nothing else. The code above works well for 1 key but I need it modified for 2 keys. I also tried the two suggestions below and both did not work. – Sammy Dec 25 '12 at 00:16

2 Answers

The easiest way, I suppose, is to concatenate these two properties and use the result as the key of the $_data hash:

$key = $v['Post_Date'] . $v['Item_Title'];
if (isset($_data[$key])) {
  continue;
} 
$_data[$key] = $v;

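Without a separator, this breaks if Post_Date and Item_Title can 'overlap' (for example, date 12 with title '3 Cars' and date 123 with title ' Cars' both produce the key '123 Cars'), although that does not seem possible given the sample data. To rule it out, you can insert a separator character into $key, like this: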

$key = $v['Post_Date'] . ':' . $v['Item_Title'];

... since a colon will never appear in a timestamp string.
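Putting the pieces above together, here is a minimal sketch of the whole de-duplication step. The function name `uniqueByDateAndTitle` and the third sample row (id 22103) are made up for illustration and are not part of the question's data; `array_column` assumes PHP 5.5 or later.

```php
<?php
// Hypothetical helper: keeps the first item seen for each
// (Post_Date, Item_Title) pair and drops later repeats.
function uniqueByDateAndTitle(array $items)
{
    $seen = array();
    foreach ($items as $item) {
        // The ':' separator keeps the date part and the title part apart.
        $key = $item['Post_Date'] . ':' . $item['Item_Title'];
        if (!isset($seen[$key])) {
            $seen[$key] = $item;
        }
    }
    // Re-index from zero, as the original script does.
    return array_values($seen);
}

$initial_data = array(
    array('id' => 22000, 'Category' => 'vehicles', 'Post_Date' => 1356373690, 'Item_Title' => 'Car Painting'),
    array('id' => 22102, 'Category' => 'vehicles', 'Post_Date' => 1356373690, 'Item_Title' => 'Car Repair'),
    // Hypothetical extra row: same date and title as id 22102, so it is dropped.
    array('id' => 22103, 'Category' => 'vehicles', 'Post_Date' => 1356373690, 'Item_Title' => 'Car Repair'),
);

$unique_results = uniqueByDateAndTitle($initial_data);
print_r(array_column($unique_results, 'id'));
```

With this sample, the two rows that differ only in Item_Title are both kept, and only the row repeating both values is removed.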

raina77ow

You could solve this with a nested loop:

$uniqueData = array();
foreach ($initialData as $item) {
    $exists = false;

    // check whether the same item was already added to the $uniqueData array
    foreach ($uniqueData as $uniqueItem) {
        if ($item['postDate'] == $uniqueItem['postDate'] && $item['itemTitle'] == $uniqueItem['itemTitle']) {
            $exists = true;
        }
    }

    // no matching item in $uniqueData yet, so add this one
    if (!$exists) {
        $uniqueData[] = $item;
    }
}

print_r($uniqueData);

As a side note, in most cases it's best to avoid the continue statement, as it can make your code harder to read.
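If the nested-loop approach is used on a larger array, one possible refinement is to stop the inner loop at the first match, so the rest of the unique items are not scanned needlessly. The sketch below assumes the same `postDate` and `itemTitle` keys as the code above; the function name `uniqueByNestedLoop` and the sample rows are hypothetical.

```php
<?php
// Hypothetical helper: nested-loop de-duplication with an early exit.
function uniqueByNestedLoop(array $initialData)
{
    $uniqueData = array();
    foreach ($initialData as $item) {
        $exists = false;
        foreach ($uniqueData as $uniqueItem) {
            if ($item['postDate'] == $uniqueItem['postDate']
                && $item['itemTitle'] == $uniqueItem['itemTitle']) {
                $exists = true;
                break; // a match was found, so stop scanning
            }
        }
        if (!$exists) {
            $uniqueData[] = $item;
        }
    }
    return $uniqueData;
}

// Made-up sample rows, for illustration only.
$sample = array(
    array('postDate' => 1356373690, 'itemTitle' => 'Car Painting'),
    array('postDate' => 1356373690, 'itemTitle' => 'Car Repair'),
    array('postDate' => 1356373690, 'itemTitle' => 'Car Painting'),
);
print_r(uniqueByNestedLoop($sample));
```

This still compares every new item against the items kept so far, so it remains quadratic in the worst case; it only trims the work done once a duplicate is found.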

Žan Kusterle
  • This is inefficient, to say the least. First, you skip the possibility of using hash function for the fast lookup of existing items. Second, you didn't break the loop when the item is found (so each search will go through the whole `$uniqueData` array again and again). Finally, your statement on `continue` is... weird, to say the least: there's nothing wrong or 'unreadable' with this op by default, it all depends on how it's used. – raina77ow Dec 25 '12 at 00:11
  • @kustrle I tried it, and it took about 30 seconds to process, ending up with an empty array Array(). In your code I substituted my array $results for your variable $initialData, my Post_Date for your postDate, and my Item_Title for your itemTitle. – Sammy Dec 25 '12 at 00:13
  • I ran it with 4 items and it works fine and without delay. @raina77ow He didn't ask for an efficient solution. Premature optimization is the root of all evil. If he doesn't have lots of items, the code will run just fine. As for the continue statement, I can use your answer as an example: writing if (!isset($_data[$key])) $_data[$key] = $v; looks much cleaner than the version with continue. – Žan Kusterle Dec 25 '12 at 00:25
  • Here's a demo http://sandbox.onlinephpfunctions.com/code/18efa3706be8efce25ac95ecc6ea455a18ffec52 – Žan Kusterle Dec 25 '12 at 00:33