1

I am trying to replace some bad input records within a specific date range with correct records. However, I'm not sure if there is an efficient way to do so. Therefore my question is how to transform a (static) range of records to the values of the record after that range in SQL? Below you will find an example to clarify what I try to achieve.

enter image description here

In this example you can see that customer number 1 belongs to group number 0 (None) in the period from 25-06-2020 to 29-06-2020. From 30-06-2020 to 05-07-2020 this group number changes from 0 to 11 for customer number 1. This static period contains the wrong records, and should be changed to the values that are valid on 06-07-2020 (group number == 10). Is there a way to do this?

Yoorizz
  • 217
  • 2
  • 12

3 Answers3

1

I think that window function first_value() does what you want:

select 
    date,
    customer_number,
    first_value(group_number) over(partition by customer_number order by date) group_number,
    first_value(role)         over(partition by customer_number order by date) role
from mytable
GMB
  • 216,147
  • 25
  • 84
  • 135
  • This only works when the first value is valid for every record. However, this is not the case since values can change over time. Therefore the bad records should be replaced with the first valid record after this wrong set of records. To be specific, in this case the records between 29-06-2020 and 05-07-2020 should be replaced by the record of 06-07-2020. I adjusted the example with some more details. – Yoorizz Aug 04 '20 at 16:00
1

If I understand correctly, you can use window functions to get the data on that particular date and case logic to assign it to the specific date range:

select t.*,
       (case when date >= '2020-07-01' and date <= '2020-07-05'
             then max(case when date = '2020-07-06' then group_number end) over (partition by customer_number)
             else group_number
        end) as imputed_group_number,
       (case when date >= '2020-07-01' and date <= '2020-07-05'
             then max(case when date = '2020-07-06' then role end) over (partition by customer_number)
             else role
        end) as imputed_role
from t;

If you want to update the values, you can use JOIN:

update t
    set group_number = tt.group_number,
        role = tt.role
    from tt
    where tt.customer_number = t.customer_number and tt.date = '2020-07-06'
Gordon Linoff
  • 1,242,037
  • 58
  • 646
  • 786
1

You can do the following as an example. Here i have choosen the criteria that if role='Leader' its a bad record and therefore you would be applying the next available group_number --> in column group_number1, and role1.

I have used a smaller subset of the rows you have in your excel example.

  select date1
        ,customer_number
        ,group_number
        ,case when role='Leader' then 
                  (select t1.group_number
                     from t t1
                    where t1.date1>t.date1
                      and t1.role<>'Leader'
                   order by t1.date1 asc
                   limit 1
                  ) 
             else group_number 
         end as group_number1
        ,role
       ,case when role='Leader' then 
                  (select t1.role
                     from t t1
                    where t1.date1>t.date1
                      and t1.role<>'Leader'
                   order by t1.date1 asc
                   limit 1
                  ) 
             else role 
         end as role1
   from t
order by 1   

+------------+-----------------+--------------+---------------+--------+--------+
|   DATE1    | CUSTOMER_NUMBER | GROUP_NUMBER | GROUP_NUMBER1 |  ROLE  | ROLE1  |
+------------+-----------------+--------------+---------------+--------+--------+
| 2020-06-25 |               1 |            0 |             0 | None   | None   |
| 2020-06-26 |               1 |            0 |             0 | None   | None   |
| 2020-06-27 |               1 |            0 |             0 | None   | None   |
| 2020-06-28 |               1 |            0 |             0 | None   | None   |
| 2020-06-29 |               1 |            0 |             0 | None   | None   |
| 2020-06-30 |               1 |           11 |            10 | Leader | Member |
| 2020-07-01 |               1 |           11 |            10 | Leader | Member |
| 2020-07-06 |               1 |           10 |            10 | Member | Member |
+------------+-----------------+--------------+---------------+--------+--------+

db fiddle link https://dbfiddle.uk/?rdbms=db2_11.1&fiddle=c95d12ced067c1df94947848b5a94c14

George Joseph
  • 5,842
  • 10
  • 24