0

I thought using CTE and TD_NORMALIZE_MEET would help me out. I have multiple rows, say

**FIELD1    |   FIELD2  |    FIELD3 |    FIELD4 |    START_DT   |    END_DT**
abcd1   |    efgh1  |    hijk1  |     lmno1 |    2019/09/23 |    2019/09/30
abcd1   |    efgh1  |    hijk1  |     lmno1 |    2019/10/01 |    2019/10/25
abcd1   |    efgh1  |    hijk1  |     lmno1 |    2019/11/01 |    2019/11/30
abcd1   |    efgh1  |    hijk1  |     lmno1 |    2019/12/01 |    2019/12/25
abcd1   |    efgh1  |    hijk1  |     lmno1 |    2019/12/26 |    2020/01/10
abcd1   |    efgh1  |    hijk1  |     lmno1 |    2020/01/15 |    2020/01/30
abcd1   |    efgh1  |    hijk1  |     lmno1 |    2020/01/31 |    2020/03/20
abcd1   |    efgh1  |    hijk1  |     lmno1 |    2020/03/28 |    2020/05/25
abcd1   |    efgh1  |    hijk1  |     lmno1 |    2020/06/01 |    2020/09/01

These should come up as

**FIELD1    |   FIELD2  |   FIELD3  |   FIELD4  |   START_DT    |   END_DT**
abcd1   |   efgh1   |   hijk1   |   lmno1   |   2019/09/23  |   2019/10/25
abcd1   |   efgh1   |   hijk1   |   lmno1   |   2019/11/01  |   2020/01/10
abcd1   |   efgh1   |   hijk1   |   lmno1   |   2020/01/15  |   2020/03/20
abcd1   |   efgh1   |   hijk1   |   lmno1   |   2020/03/28  |   2020/05/25
abcd1   |   efgh1   |   hijk1   |   lmno1   |   2020/06/01  |   2020/09/01

However, TD_NORMALIZE_MEET does not help out. Tried TD_NORMALIZE_OVERLAP_MEET and all other tricks. Any help is greatly appreciated. I even tried using the TD_NORMALIZE_OVERLAP_MEET by adding a day to the end date but that messed up the data differently. So I have to adhere to the dates but merge the dates in a way they are combined to have the Min start date of the lowest consecutive period and the max end date of the highest consecutive period.

Fred
  • 1,916
  • 1
  • 8
  • 16
AK2016
  • 1
  • 1
  • Can you show what you are doing that isn't working for you? `TD_NORMALIZE_MEET` or just `SELECT NORMALIZE` should be able to handle this, but you do need to create a `PERIOD` data type and normalize in a subquery, then extract `BEGIN` and `END` values in an outer query. – Fred Sep 23 '20 at 16:00
  • Fred, thanks for replying. Yes I noticed that even for NORMALIZE_MEET or even SELECT NORMALIZE, this did not work. I had to "extend" the end date by a day when creating the PERIOD field. Also, I somehow had to play around with the options to see what stuck. Showing my answer below. Still validating but seems to work so far. – AK2016 Sep 25 '20 at 02:36

2 Answers2

0

As you mention, you will have to manipulate the END_DT since the NORMALIZE options require a PERIOD data type, and PERIOD semantics are "up to, but not including" the ending datetime value.

select field1, field2, field3, field4, begin(pd) as Start_DT, prior(end(pd)) as End_DT
from (
  select normalize field1, field2, field3, field4, period(Start_DT,next(End_DT)) as pd
  from My_Table) Norm_Tbl
Fred
  • 1,916
  • 1
  • 8
  • 16
  • I tried Normalize and it did not work either. As I stated, the levels of overlaps wasn't fixed. – AK2016 Sep 25 '20 at 03:03
  • I don't understand what you mean by "levels of overlaps". This query works for the data given; perhaps the sample data is not truly representative of the real data. Maybe you need `NORMALIZE ON MEETS OR OVERLAPS` instead of just `NORMALIZE` (implying only ON MEETS), or maybe it's something else entirely. – Fred Sep 25 '20 at 17:00
  • Fred, actually, I didn't follow your entire query. I did try Normalize but it did not give the answer as I wanted. I will try with a smaller subset and see if this helps. I tried the option given below and it worked fine. – AK2016 Sep 28 '20 at 03:57
0

So in continuation of my problem and thanks to some wonderful tips by folks who are also here, I had to be creative. One was the use of CTE (Common Table Expression) and the other was creating a distinct ID field for the group of 4 fields I have since somehow passing anything non integer and more than one field was not working for TD_NORMALIZE_OVERLAP_MEET that I used.

Am still looking for a solution to use multiple group columns, non INT types with the TD_NORMALIZE... functions. Please assist.

As Fred suggested above, I did have to extend the end date to overlap the next start date if it was adjacent. I used DENSE_RANK so that the numbers are continuous from one group set to another.

so

CREATE VOLATILE TABLE TBL1 AS
(
    SELECT 
        FIELD1, 
        FIELD2, 
        FIELD3, 
        FIELD4, 
        DENSE_RANK() OVER(PARTITION BY FIELD1, FIELD2, FIELD3, FIELD4 
                            ORDER BY START_DT ASC) AS GRP_RANK, 
        START_DT, 
        END_DT, 
        PERIOD(START_DT, END_DT + 1) AS COVERED_PERIOD
    FROM MYTABLE
) WITH DATA
PRIMARY INDEX (FIELD1, FIELD2, FIELD3, FIELD4, SAME_DAY_FLAG)
INDEX (GRP_RANK) ON COMMIT PRESERVE ROWS;

Used this to create Merged table.

CREATE VOLATILE TABLE TBL_MERGED AS
( 
    WITH CTE(GRP_RANK, COVERED_PERIOD) AS
    (
        SELECT 
            GRP_RANK, 
            COVERED_PERIOD 
        FROM TBL1 
    )
    SELECT * FROM TABLE
    (   TD_SYSFNLIB.TD_NORMALIZE_OVERLAP_MEET(NEW VARIANT_TYPE(CTE.GRP_RANK), CTE.COVERED_PERIOD)
        RETURNS(GRP_RANK INT, COVERED_PERIOD PERIOD(DATE))
        HASH BY GRP_RANK
        LOCAL ORDER BY GRP_RANK, COVERED_PERIOD
    ) AS TBL_ROWS
)WITH DATA ON COMMIT PRESERVE ROWS;

Then I joined it with my tbl1

SELECT 
        A.FIELD1, 
        A.FIELD2, 
        A.FIELD3, 
        A.FIELD4, 
        A.GRP_RANK, 
        A.COVERED_PERIOD, 
        MIN(A.START_DT) AS START_DT, 
        MAX(A.END_DT) AS END_DT
FROM    TBL1 AS A
JOIN    TBL_MERGED AS B
ON      A.GRP_RANK = B.GRP_RANK
AND     A.COVERED_PERIOD OVERLAPS B.COVERED_PERIOD
GROUP BY 1,2,3,4,5,6
ORDER BY 1,2,3,4,7 
AK2016
  • 1
  • 1
  • Wouldn't you get the same answer just applying the same GROUP BY to TBL1? Your final query doesn't use any of the data from TBL_MERGED. – Fred Sep 25 '20 at 16:55
  • Fred, check the final query. TBL_MERHED used as alias B joining to pre CTE table by group rank, getting the min and max dates where the dates overlap the period – AK2016 Sep 27 '20 at 05:21