0

We have a feature table:

cust_no, cust_loc, feature, toggle, last_udpated_dt
123      ABC       DELIVER  Y       2019-01-01
123      ABC       DELIVER  Y       2019-01-03
123      ABC       DELIVER  N       2019-05-01
123      ABC       KEYS     N       2019-01-01
123      DEF       KEYS     N       2019-01-01
123      GHI       KEYS     N       2019-01-01
123      GHI       KEYS     Y       2019-02-01
123      GHI       KEYS     Y       2019-02-04
123      GHI       KEYS     N       2019-04-01
123      GHI       KEYS     Y       2019-06-01

There is a back end system that sets the toggle for different features for different customer/locations Y/N. The valid value is the value with the greatest last_updated_dt;

I should add, we often get toggles 'reset' to their current value, so if someone comes in and turns all features/locations off for a customer, the flags will be set for all feature locations, even those that had none set, and those that were off;

Now I am being asked to evaluate performance over time, and I need to know when features turned on and off.

I think this is best performed by creating a table like this:

cust_no, cust_loc, feature, effective_dt, expiration_dt
123      ABC       DELIVER  2019-01-01    2019-05-01
123      GHI       KEYS     2019-02-01    2019-04-01
123      GHI       KEYS     2019-06-01    9999-12-31

We can ignore the KEYS flag from ABC and DEF, because they were never turned on.

that would have a seed row with the min last_updated_dt:

create table tmp_01 as
select cust_no, cust_loc, feature, min(last_updated_dt) as eff_dt
from feature
where toggle = 'Y'
group by cust_no, cust_loc, feature;

I then create a copy of the feature table that does not include any of those rows (or any N values before the first Y)

create table tmp_02 as
select cust_no, cust_loc, feature, toggle, last_update_dt
from feature as f
, tmp_01 as a
where a.cust_no = f.cust_no 
and a.cust_loc = f.cust_loc 
and a.feature = f.feature 
and f.last_updated_dt <= eff_dt;

I can then create expiration for each row at the min last_udated_dt where toggle = N, create a new table without those rows

create table tmp_03 as
with f as (select cust_no, cust_loc, feature, min(last_updated_dt) as exp_dt
   from tmp_02
   where toggle = 'N'
   group by cust_no, cust_loc, feature
  )
select a.cust_no, a.cust_loc, a.feature, a.eff_dt
, min(coalesce(f.last_updated_dt,'9999-12-31')) as exp_dt
from tmp_01 as a
left outer join f
on (a.cust_no = f.cust_no 
   and a.cust_loc = f.cust_loc 
   and a.feature = f.feature) 
 ;

Create a copy of the feature (tmp_02) table removing the rows already used:

create table tmp_04 as
select cust_no, cust_loc, feature, toggle, last_update_dt
from tmp_02 as f
, tmp_03 as a
where a.cust_no = f.cust_no 
and a.cust_loc = f.cust_loc 
and a.feature = f.feature 
and f.last_updated_dt <= exp_dt; 

Now tmp_03 has the data I want, but only the first on/off cycle; I am going to have to repeat this for every on/off cycle, until there are no more rows left.

It seems there should be a way to do this in one cycle? I just don't see it.

markwusinich
  • 108
  • 7
  • Possible duplicate of [Group consecutive rows of same value using time spans](https://stackoverflow.com/questions/24244659/group-consecutive-rows-of-same-value-using-time-spans) – Serg Jul 10 '19 at 14:14

2 Answers2

0

I think lead() does what you want:

select t.*,
       last_updated_dt as eff_dt,
       lead(last_updated_dt) over (partition by cust_no, cust_loc, feature order by last_updated_dt) as end_dt
from t;

This should give you the tiling that you want.

Gordon Linoff
  • 1,242,037
  • 58
  • 646
  • 786
0

Just like so many problems, thinking about it, then sleeping on it helped.

Here is what I came up with. It takes five steps, but will work no matter how many time a flag is toggled on or off.

In English first:

1, We need to remove the redundant rows, where a flag is reset to the value it already has.

2, We then remove any row if the first action is setting the flag to N

3, We add row numbers, add a group by field, so that each pair of Y/N can be converted to eff_date, exp_date.

Now the code:

create table mww_001 as
select cust_no, cust_loc, feature, toggle, last_updated_date
, lag(toggle_value) over (partition by cust_no, cust_loc, feature 
                          order by last_updated_date) as lag_toggle
from feature_table;

create table mww_002 as 
select cust_no, cust_loc, feature, toggle, last_updated_date, lag_toggle
, row_number() over (partition by cust_no, cust_loc, feature
                     order by last_updated_date) as rownbr_01
from mww_001
where lag_toggle is null or lag_toggle <> toggle;

create table mww_003 as
select cust_no, cust_loc, feature, toggle, last_updated_date
, row_number() over (partition by cust_no, cust_loc, feature
                     order by last_updated_date) as rownbr_02
from mww_002
where not(rownbr_01 = 1 and toggle = 'N') ;

create table mww_003 as
select cust_no, cust_loc, feature, toggle, last_updated_date
, ceiling(rownbr_02/2) as group_01
from mww_002 ;

create table feature_new as
select cust_no, cust_loc, feature, group_01
, max(case when toggle = 'Y' then last_updated_date else null end) as eff_dt
, coalesce(max(case when toggle = 'N' 
                    then last_updated_date 
                    else null end)
           , '9999-12-31') as exp_dt
from mww_003 
group by cust_no, cust_loc, feature, group_01 ;
markwusinich
  • 108
  • 7