1
CODE1   CODE2   CODE3   RATE    VALUE   MONTH
A       B       C       1       1       202001
A       B       C       1       1       202002
A       B       C       1       1       202003
A       B       C       2       1       202004
A       B       C       2       1       202005
A       B       C       1       1       202006
A       B       C       1       1       202007
A       B       C       1       1       202008
A       B       C       1       1       202009

I am working on migrating data from old system to new system. As part of old system data maintained per month and will update the same row if data updated and table contains one row for a month I am migrating to news system and it contains start date and end date to make the active record . So on update new data needs to inserted and updated the old row end date

My expected data

CODE1   CODE2   CODE3   RATE    VALUE   START_DT    END_DT
A       B       C       1       1       20200101    20200331
A       B       C       2       1       20200401    20200531
A       B       C       1       1       20200601    99991230

If the data active and we will update the date as infinity so 999912

But I am getting only two records and my query is below

CODE1   CODE2   CODE3   RATE    VALUE   START_DT    END_DT
A       B       C       2       1       20200401    20200531
A       B       C       1       1       20200601    99991230


SELECT CODE1, CODE2, CODE3 RATE, VALUE,
 TO_DATE(MIN(bus_month), 'yyyymm') AS START_DT,
 last_day(TO_DATE(replace(MAX(bus_month), $CURRENTMONTG, '999912'), 'yyyymm')) AS end_date
FROM TEST_TABLE
GROUP BY CODE1, CODE2, CODE3, RATE, VALUE

SINCE I am grouping based on CODE1, CODE2, CODE3, RATE, VALUE and getting latest data based on grouping and I am not able to get the old data

Please help me to get the expected table structure. Thanks in advance

Please comment if any more details needed

GMB
  • 216,147
  • 25
  • 84
  • 135
Abdul
  • 942
  • 2
  • 14
  • 32

3 Answers3

1

This is a gaps-and-islands problem, where you want to group together "adjacent" rows that have the same rate and value.

One approach uses the difference between the row numbers to build the groups. Assuming that the three codes define the base group, and that you want to break into a new row whenever the rate or the value changes:

select code1, code2, code3, rate, value, min(month) start_dt, 
    case when row_number() over(partition by code1, code2, code3 order by max(month) desc) = 1 then 999912 else max(month) end end_dt
from (
    select t.*,
        row_number() over(partition by code1, code2, code3 order by month) rn1,
        row_number() over(partition by code1, code2, code3, rate, value order by month) rn2
    from mytable t
) t
group by code1, code2, code3, rate, value, rn1 - rn2
order by start_dt

The conditional expression in the outer query sets the end date of the "last" period to 999912.

Demo on DB Fiddle:

CODE1 | CODE2 | CODE3 | RATE | VALUE | START_DT | END_DT
:---- | :---- | :---- | ---: | ----: | -------: | -----:
A     | B     | C     |    1 |     1 |   202001 | 202003
A     | B     | C     |    2 |     1 |   202004 | 202005
A     | B     | C     |    1 |     1 |   202006 | 999912
GMB
  • 216,147
  • 25
  • 84
  • 135
1

You can use MATCH_RECOGNIZE to perform a row-by-row comparison of the data:

SELECT code1,
       code2,
       code3,
       rate,
       value,
       start_dt,
       CASE end_dt
       WHEN TO_NUMBER( TO_CHAR( SYSDATE, 'YYYYMM' ) )
       THEN 999912
       ELSE end_dt
       END AS end_dt
FROM   table_name
MATCH_RECOGNIZE (
   PARTITION BY code1, code2, code3
   ORDER BY     month
   MEASURES     FIRST( rate ) AS rate,
                FIRST( value ) AS value,
                FIRST( month ) AS start_dt,
                LAST( month ) AS end_dt
   ONE ROW PER MATCH
   PATTERN      (FIRST_ROW EQUAL_ROWS*)
   DEFINE       EQUAL_ROWS AS (
                      EQUAL_ROWS.rate  = PREV(EQUAL_ROWS.rate)
                  AND EQUAL_ROWS.value = PREV(EQUAL_ROWS.value)
                  AND TO_DATE( EQUAL_ROWS.month, 'YYYYMM' )
                        = ADD_MONTHS( TO_DATE( PREV(EQUAL_ROWS.month), 'YYYYMM' ), 1 )
                )
)

So, for your sample data:

CREATE TABLE table_name ( CODE1, CODE2, CODE3, RATE, VALUE, MONTH ) AS
SELECT 'A', 'B', 'C', 1, 1, 201912 FROM DUAL UNION ALL
SELECT 'A', 'B', 'C', 1, 1, 202001 FROM DUAL UNION ALL
SELECT 'A', 'B', 'C', 1, 1, 202002 FROM DUAL UNION ALL
SELECT 'A', 'B', 'C', 1, 1, 202003 FROM DUAL UNION ALL
SELECT 'A', 'B', 'C', 2, 1, 202004 FROM DUAL UNION ALL
SELECT 'A', 'B', 'C', 2, 1, 202005 FROM DUAL UNION ALL
SELECT 'A', 'B', 'C', 1, 1, 202006 FROM DUAL UNION ALL
SELECT 'A', 'B', 'C', 1, 1, 202007 FROM DUAL UNION ALL
SELECT 'A', 'B', 'C', 1, 1, 202008 FROM DUAL UNION ALL
SELECT 'A', 'B', 'C', 1, 1, 202009 FROM DUAL;

This outputs:

CODE1 | CODE2 | CODE3 | RATE | VALUE | START_DT | END_DT
:---- | :---- | :---- | ---: | ----: | -------: | -----:
A     | B     | C     |    1 |     1 |   201912 | 202003
A     | B     | C     |    2 |     1 |   202004 | 202005
A     | B     | C     |    1 |     1 |   202006 | 999912

db<>fiddle here

MT0
  • 143,790
  • 11
  • 59
  • 117
0

Oracle SQL:

SELECT
     code1,code2,code3,rate,value,min(MONTH) start_dt,
     CASE
          WHEN ROW_NUMBER() OVER(PARTITION BY code1, code2, code3 ORDER BYmax(MONTH) DESC) = 1 THEN 99991230
          ELSE max(MONTH)
     END end_dt
FROM
     (
     SELECT
          t.*,
          ROW_NUMBER() OVER(PARTITION BY code1, code2, code3 ORDER BY MONTH) rn1,
          ROW_NUMBER() OVER(PARTITION BY code1, code2, code3, rate, value ORDERBY MONTH) rn2
     FROM
          TBLTEST t
) t
GROUP BY
     code1,code2,code3,rate,value,rn1 - rn2
ORDER BY
     start_dt

It is rather simple to perform the task in the natural way of thinking. We compare neighboring values between rows on the first five columns, and put the current one and the previous row in the same group when values are same, or create a new group if they are different until the last record is compared. As SQL set is unordered, we need to first invent two columns of indexes manually in an extremely complicated way and then perform grouping according to the relationship between the two columns of indexes. You need to be really smart to come up with the solution.

Yet it is easy to write the code using the open-source esProc SPL:

  A
1 =connect("oracle")
2 =A1.query@x("SELECT * FROM TBLTEST ORDER BY MONTH")
3 =A2.groups@o(CODE1,CODE2,CODE3,RATE,VALUE;min(MONTH)/"01":STARTDT,string(date((max(MONTH)+1)/"01","yyyyMMdd")-1,"yyyyMMdd"):ENDDT)
4 >A3.m(-1).modify("99991230":ENDDT)

SPL supports ordered sets directly, making it easy to perform grouping when a neighboring value is different.

Li David
  • 1
  • 1