0

Good morning,

Every 1st month I collect data for my data warehouse. Now I want to determine what the consecutive period over the months is with oracle sql.

If there is an interruption in the month, I want to start a new period.

The example I have is similar to the one below:

ID  LOAD_DATE  
100 20190101  
100 20190201  
100 20190401  
100 20190501  
100 20190601  
100 20190701  
100 20191001  
100 20191101  
100 20191201  
100 20200101  
200 20190701  
200 20190901  
200 20191101  
200 20191201  
200 20200101  
200 20200201  

The desired outcome:

ID  From     To  
100 20190101 20190201  
100 20190401 20190701  
100 20191001 20200101  
200 20190701 20190701  
200 20190901 20190901  
200 20191101 20200201  

I can manage the continuous period based on one year. Due to the changes in the year, I am unable to do this by month.

Please help. Im using oracle sql developer

Junjie
  • 491
  • 1
  • 6
  • 13
Marcel
  • 3
  • 2

4 Answers4

0

For actual supported versions (ie Oracle 12+):

select * 
from t
match_recognize(
  partition by id
  order by load_date
  measures
     first(LOAD_DATE) as ld,
     LAST(LOAD_DATE) AS lst,
     count(*) as cnt
  pattern (strt next*)
  define
     next as next.load_date=add_months(prev(load_date),1)
)
order by 1,2;

Full example with sample data:

alter session set nls_date_format='yyyymmdd';
with t (ID,LOAD_DATE) as (
   select 100, to_date('20190101') from dual union all  
   select 100, to_date('20190201') from dual union all  
   select 100, to_date('20190401') from dual union all  
   select 100, to_date('20190501') from dual union all  
   select 100, to_date('20190601') from dual union all  
   select 100, to_date('20190701') from dual union all  
   select 100, to_date('20191001') from dual union all  
   select 100, to_date('20191101') from dual union all  
   select 100, to_date('20191201') from dual union all  
   select 100, to_date('20200101') from dual union all  
   select 200, to_date('20190701') from dual union all  
   select 200, to_date('20190901') from dual union all  
   select 200, to_date('20191101') from dual union all  
   select 200, to_date('20191201') from dual union all  
   select 200, to_date('20200101') from dual union all  
   select 200, to_date('20200201') from dual
)
select * 
from t
match_recognize(
  partition by id
  order by load_date
  measures
     first(LOAD_DATE) as ld,
     LAST(LOAD_DATE) AS lst,
     count(*) as cnt
  pattern (strt next*)
  define
     next as next.load_date=add_months(prev(load_date),1)
)
order by 1,2;
Sayan Malakshinov
  • 8,492
  • 1
  • 19
  • 27
0

I got it partially, working on remaining part.

Demo

SELECT X.ID, X.LOAD_DATE, X.NEXT_MONTH FROM (
SELECT CASE 
WHEN TRUNC(ADD_MONTHS(TO_DATE(LOAD_DATE, 'YYYY-MM-DD'), 1)) = TRUNC(TO_DATE(NEXT_MONTH, 'YYYY-MM-DD')) 
THEN 1 ELSE 0 END AS CHECK_VALUE, A.* FROM
(
SELECT A.*, LEAD(LOAD_DATE) OVER(PARTITION BY ID ORDER BY LOAD_DATE) NEXT_MONTH
FROM TABLE1 A ) A) X WHERE CHECK_VALUE = 1;
Atif
  • 2,011
  • 9
  • 23
0

One solution makes use of window functions, with the actual dates stored as numbers, like so:

select id, min(load_date) "To", max(load_date) "From"
from (
  select id, load_date,
    sum(period_start) over (partition by id order by load_date_converted)
      period
  from (
    select id, load_date,
      to_date(load_date,'YYYYMMDD') load_date_converted,
      case when add_months(to_date(load_date,'YYYYMMDD'),-1)<>
        lag (to_date(load_date,'YYYYMMDD'),1,sysdate)
          over (partition by id order by load_date)
            then 1 else 0
      end period_start
    from table_name
  )
)
group by id, period
order by id, period

Oracle Version: 12c

Junjie
  • 491
  • 1
  • 6
  • 13
  • Wow, finaly it works. I've been messing around with this for a long time. Thanks. I can finally finish my script. – Marcel Aug 19 '20 at 07:30
0

The simplest approach is to make a simple observation: If you have a sequence of numbers enumerating each row, then the difference -- in months -- between this sequence and load date is constant when the months are consecutive. You can then aggregate by this difference:

select id, min(load_date), max(load_date)
from (select t1.*, row_number() over (partition by id order by load_date) as seqnum
      from table1 t1
     ) t1
group by id, add_months(to_date(load_date, 'YYYYMMDD'), - seqnum)
order by id, min(load_date);

Here is a db<>fiddle.

Gordon Linoff
  • 1,242,037
  • 58
  • 646
  • 786