5

I am trying to find the best way on how to merge date ranges into one database record (array element).

This is the data I have:

  Array
(
    [0] => Array
        (
            [id] => 18298
            [start_date] => 2011-07-09
            [end_date] => 2011-10-01
        )

    [1] => Array
        (
            [id] => 18297
            [start_date] => 2011-06-01
            [end_date] => 2011-06-30
        )

    [2] => Array
        (
            [id] => 17113
            [start_date] => 2011-03-31
            [end_date] => 2011-05-31
        )

    [3] => Array
        (
            [id] => 20555
            [start_date] => 2011-01-03
            [end_date] => 2011-03-31
        )
)

And after we combine them, array (or database) should look like this:

Array
(
    [0] => Array
        (
            [merged_ids] => 18298
            [start_date] => 2011-07-09
            [end_date] => 2011-10-01
        )

    [1] => Array
        (
            [merged_ids] => 18297, 17113, 20555
            [start_date] => 2011-01-03
            [end_date] => 2011-06-30
        )
)

Is there any algorithm to go through all elements/ranges and combine them? Which way is better/easier to do - through database (MYSQL) or coding (PHP)?

Any advise is highly appreciated.

Thanks!

UPDATE: Sorry, I didn't provide enough info: we should merge any continuous and overlapping date ranges.

Kelvin
  • 8,813
  • 11
  • 38
  • 36
  • 1
    How do you determine which dates should be merged? Is it if they are continuous? – Argote Feb 11 '11 at 18:19
  • is it guaranteed that the original date ranges won't overlap? – Alnitak Feb 11 '11 at 18:20
  • Aren't they overlapping in the example data he gave? `20555` and `17113` are overlapping. Also, does the `merged_ids` property have any sort order for its elements? – Argote Feb 11 '11 at 18:22
  • @Alnitak & @Argote, we should merge any continuous and overlapping date ranges – Kelvin Feb 11 '11 at 18:29

4 Answers4

14

Sort by start date.

Then iterate through and check for if the next item's start date is before or directly after the current one's end date. If it is, then merge the next one into the current one. Then continue.

Amber
  • 507,862
  • 82
  • 626
  • 550
  • Hello @Amber, Thanks for your answer. I forgot to mention that date ranges may also overlap. These should be merged too. – Kelvin Feb 11 '11 at 18:30
  • @Kelvin - that's trivial enough - just also check and see if the *end* date for the next date range is before the end date of the current one. – Amber Feb 11 '11 at 18:42
  • @Kelvin - actually, you don't even need to do that - since if the end date is before the current one's end date, the start date is also definitely before the current one's end date, so the original check I described will catch it as well. – Amber Feb 11 '11 at 18:43
  • here is the PHP code for that logic: http://mdb-blog.blogspot.com/2020/12/php-merge-dateranges-algorithm.html – mr.baby123 Dec 03 '20 at 11:37
  • PHP code for this logic of merge - found here: http://mdb-blog.blogspot.com/2020/12/php-merge-dateranges-algorithm.html – mr.baby123 Dec 03 '20 at 11:39
1

I've written function which combines/merges list of ranges. It is written in Python, but it should be easy to rewrite it in PHP. Here's full code: https://gist.github.com/barszczmm/8447665 and here's simplified algorithm (still in Python):

list_of_ranges.sort() # sort input list
new_list_of_ranges = [] # output list

new_range_item_start = None
new_range_item_end = None

length = len(list_of_ranges)
for i, range_item in enumerate(list_of_ranges):
    if new_range_item_start is None:
        new_range_item_start = range_item[0]
        new_range_item_end = range_item[1]
    elif new_range_item_end >= range_item[0]:
        new_range_item_end = max(range_item[1], new_range_item_end)
    else:
        new_list_of_ranges.append((new_range_item_start, new_range_item_end))
        new_range_item_start = range_item[0]
        new_range_item_end = range_item[1]
    # save if this is last item
    if i + 1 == length:
        new_list_of_ranges.append((new_range_item_start, new_range_item_end))
barszczmm
  • 341
  • 3
  • 5
0

The implementation is like:

function mergeDateTimeRanges($ranges)
{
    $retVal = [];
    //sort date ranges by begin time
    usort($ranges, function ($a, $b) {
        return strcmp($a['begin_at'], $b['begin_at']);
    });

    $currentRange = [];
    foreach ($ranges as $range) {
        // bypass invalid value
        if ($range['begin_at'] >= $range['end_at']) {
            continue;
        }
        //fill in the first element
        if (empty($currentRange)) {
            $currentRange = $range;
            continue;
        }

        if ($currentRange['end_at'] < $range['begin_at']) {
            $retVal[] = $currentRange;
            $currentRange = $range;
        } elseif ($currentRange['end_at'] < $range['end_at']) {
            $currentRange ['end_at'] = $range['end_at'];
        }
    }

    if ($currentRange) {
        $retVal[] = $currentRange;
    }

    return $retVal;
}
Bo Lu
  • 9
  • 2
0

My method will generate a merged array where by overlapping or consecutive dates are grouped together and the group's ids are stored as a string of comma-space separated values.

I am using the modern "spaceship operator" (<=>) for usort()s comparison. If your code is running on a version below php7, you can use:

usort($array,function($a,$b){ return strcmp($a['start_date'],$b['start_date']); });

See inline comments for step-by-step explanation.

Code: (Demo)

function mergeRanges($array){
    usort($array,function($a,$b){ return $a['start_date']<=>$b['start_date']; });  // order by start_date ASC

    foreach($array as $i=>$row){
         if($i && $row['start_date']<=date('Y-m-d',strtotime("{$result[$x]['end_date']} +1 day"))){  // not the first iteration and dates are within current group's range
            if($row['end_date']>$result[$x]['end_date']){  // only if current end_date is greater than existing end_date
                $result[$x]['end_date']=$row['end_date'];  // overwrite end_date with new end_date in group
            }
            $result[$x]['merged_ids'][]=$row['id'];  // append id to merged_ids subarray
        }else{  // first iteration or out of range; start new group
            if($i){  // if not first iteration
                $result[$x]['merged_ids']=implode(', ',$result[$x]['merged_ids']);  // convert previous group's id elements to csv string
            }else{  // first iteration
                $x=-1;  // declare $x as -1 so that it becomes 0 when incremented with ++$x
            }
            $result[++$x]=['merged_ids'=>[$row['id']],'start_date'=>$row['start_date'],'end_date'=>$row['end_date']]; // declare new group
        }
    }
    $result[$x]['merged_ids']=implode(', ',$result[$x]['merged_ids']);  // convert final merged_ids subarray to csv string
    return $result;
}

$array=[
    ['id'=>18298,'start_date'=>'2011-07-09','end_date'=>'2011-10-01'],
    ['id'=>18297,'start_date'=>'2011-06-01','end_date'=>'2011-06-30'],
    ['id'=>17113,'start_date'=>'2011-03-31','end_date'=>'2011-05-31'],  // tests that 17113 and 18297 belong in same group
    ['id'=>20556,'start_date'=>'2011-02-03','end_date'=>'2011-02-13'],  // tests that "fully overlapped" date range is included
    ['id'=>20555,'start_date'=>'2011-01-03','end_date'=>'2011-03-31']
];

print_r(mergeRanges($array));

Output:

Array
(
    [0] => Array
        (
            [merged_ids] => 20555, 20556, 17113, 18297
            [start_date] => 2011-01-03
            [end_date] => 2011-06-30
        )

    [1] => Array
        (
            [merged_ids] => 18298
            [start_date] => 2011-07-09
            [end_date] => 2011-10-01
        )

)
mickmackusa
  • 43,625
  • 12
  • 83
  • 136