1

I have a task to merge overlapping time ranges in Redshift.

Here is a good solution: Merge overlapping time intervals, how?

Unfortunately, Redshift doesn't support Recursive CTE. How to transform this code to perform it without recursive?

WITH RECURSIVE cte( id, date_start, date_end ) AS
(
  SELECT id, date_start, date_end
  FROM evento
  UNION 
  SELECT e.id,
         least( c.date_start, e.date_start ),
         greatest( c.date_end, e.date_end )
  FROM cte c
  JOIN evento e
  ON e.date_start between c.date_start and c.date_end
     OR 
     e.date_end between c.date_start and c.date_end
)
SELECT distinct date_start, date_end
FROM (
  SELECT id, 
         min( date_start) date_start, 
         max( date_end ) date_end
  FROM cte
  GROUP BY id
) xx
ORDER BY date_start;

http://www.sqlfiddle.com/#!12/bdf7e/9

Denis
  • 11
  • 3

2 Answers2

1

You can compare dates with previous line dates and build segment ID using cumulative sum like this:

WITH
prev_dates as (
    select id, date_start, date_end,
    lag(date_end) over (order by date_start) as prev_date_end
    from evento
),
sequences as (
    select *,
    sum(case when date_start>prev_date_end then 1 else 0 end) over (order by date_start) as sequence_id
    from prev_dates   
)
select 
sequence_id,
min(date_start) as date_stat,
max(date_end) as date_end
from sequences
group by 1
AlexYes
  • 4,088
  • 2
  • 15
  • 23
  • Thanks for your answer, Alex. Sorry, I missed in my question that I have Type column with 'Online' or 'Busy' status. So, there are can be 1 or more Busy sessions inside one Online session, and we should compare date_start with prev Online-date-end. How to consider this in window function ? – Denis Jan 04 '18 at 21:05
  • are "busy" sessions exclusively inside "online" sessions and you need to determine sequences of the "online" sessions? – AlexYes Jan 04 '18 at 21:25
  • some "busy" sessions are outside the "online". It's a mistake in data, and I want to fill the gap of "online" time, where "busy" occurs. – Denis Jan 04 '18 at 22:15
  • @Dmitros if "busy" sessions are chained you can build sequences of "busy" sessions only by filtering records by status, then union these sequences to "online" sessions and do the same operation of sequencing... makes sense? – AlexYes Jan 05 '18 at 15:04