I have a set of records that consists of the start and stop dates, as the following:
ID | started | stop |
---|---|---|
1 | 2017-08-14 | 2017-10-22 |
2 | 2017-10-23 | 2017-12-12 |
3 | 2019-01-28 | 2019-02-21-> |
4 | Some-Date | NULL |
5 | 2020-09-08 | 2020-09-14 |
6 | 2020-09-15 | 2020-10-14 |
7 | ->2019-02-22 | 2019-03-18 |
I need to merge those sequence of dates that come in a chronological order as heighlighted which follow the rule (stop = start + one day).
The result should look like this:
ID | started | stop |
---|---|---|
1 | 2017-08-14 | 2019-03-18 |
2 | Some-Date | NULL |
3 | 2020-09-08 | 2020-10-14 |
My approach: creates 3 new colums and check each entry x with its previous entry (x - 1), and with its next one (x + 1), and if any of those is in between -1 , +1 then group them:
update date_table as t
set group_num_before = (select date_difference
from (SELECT g1.id, DATEDIFF(g1.start_dt, g2.stop_dt) AS date_difference
FROM date_table g1
INNER JOIN
date_table g2 ON g2.id = g1.id - 1) as groupNum
where t.id = groupNum.id),
group_num_after = (select date_difference
from (SELECT g1.id, DATEDIFF(g1.stop_dt, g2.start_dt) AS date_difference
FROM date_table g1
INNER JOIN
date_table g2 ON g2.id = g1.id + 1) as groupNum
where t.id = groupNum.id)
where true;
update date_table as g1
INNER JOIN
date_table g2 ON g2.id = g1.id
set g1.group_num = IF(g1.group_num_before in (-1, 1) OR
g1.group_num_after in (-1, 1), -1, g1.group_num_before)
where g1.A = g2.B;
This result in:
ID | started | stop | groupNum |
---|---|---|---|
1 | 2017-08-14 | 2017-10-22 | -1 |
2 | 2017-10-23 | 2017-12-12 | -1 |
3 | 2017-12-13 | 2019-02-21-> | -1 |
4 | Some-Date | NULL | null |
5 | 2020-09-08 | 2020-09-14 | -1 |
6 | 2020-09-15 | 2020-10-14 | -1 |
7 | ->2019-02-22 | 2019-03-18 | -1 |
However, doing this put all the sequenced records in the same group since they also come in a chronological order, and this lead to incorrect result when grouping them together.
Any idea will be appreciated Thanks in advance.