0

i have a table with billions of records which is holding Merchandise information as follows. ID is bigint with auto increment. Article, UPC and MRP is actual data. DataDate is having the information like from which date this MRP is applicable.

ID        Article            EANUPC         MRP        DataDate        
8546417   20171554001        1220636        599        20/11/2015        
18589213  20171554001        1220636        599        15/12/2017        
18655485  20171554001        1220636        390        26/12/2017        
18784953  20171554001        1220636        390        11/1/2018        
18833697  20171554001        1220636        290        16/1/2018        
18954190  20171554001        1220636        290        24/1/2018        
19060047  20171554001        1220636        190        30/1/2018        
19116702  20171554001        1220636        90        6/2/2018        
20107113  20171554001        1220636        90        13/6/2018        
20143100  20171554001        1220636        599        13/6/2018

i want to merge the records in such a way that if MRP is same for subsequent days. then it should show valid from and Valid To date for that MRP.

i require following output.

Article            EANUPC        MRP        FromDate        ToDate
20171554001        1220636        599        20/11/2015     25/12/2017
20171554001        1220636        390        26/12/2017     15/01/2018
20171554001        1220636        290        16/1/2018      29/01/2018
20171554001        1220636        190        30/1/2018      05/02/2018
20171554001        1220636        90         6/2/2018       12/06/2018
20171554001        1220636        599        13/6/2018      24/08/2018

Please help me to achieve this.

  • 1
    Hi Vineet. When you paste in code or tables like this and need to preserve whitespace and line feeds, highlight the pasted chunk and hit the `{}` button (or ctrl+k). We should have an edit approved here in a moment where someone has done that already. – JNevill Aug 24 '18 at 13:08

2 Answers2

0

You are looking for GROUP BY

SELECT
    Article
    , EANUPC
    , MRP
    , MIN(DataDate) as FromDate
    , MAX(DataDate) as ToDate
FROM
    [Your_Table]
GROUP BY
    Article, EANUPC, MRP

You are going to need create a new table and insert these records into it but it should work. Also, if it has billions of records you will probable want to work through it in chunks so that it does not bog down your server.

Dillon_Su
  • 91
  • 7
0

You can treat this as a gaps and islands problem. Using:

SELECT Article, EANUPC, MRP, DataDate,
       ROW_NUMBER() OVER (PARTITION BY Article ORDER BY DataDate) -
       ROW_NUMBER() OVER (PARTITION BY Article, MRP ORDER BY DataDate) AS grp
FROM mytable
ORDER BY DataDate

you get:

Article     EANUPC  MRP DataDate    grp
---------------------------------------
20171554001 1220636 599 2015-11-20  0
20171554001 1220636 599 2017-12-15  0
20171554001 1220636 390 2017-12-26  2
20171554001 1220636 390 2018-01-11  2
20171554001 1220636 290 2018-01-16  4
20171554001 1220636 290 2018-01-24  4
20171554001 1220636 190 2018-01-30  6
20171554001 1220636 90  2018-02-06  7
20171554001 1220636 90  2018-06-13  7
20171554001 1220636 599 2018-06-13  7

You can now wrap the above query in a CTE and perform grouping using grp field:

;WITH CTE AS (
    SELECT Article, EANUPC, MRP, DataDate,
           ROW_NUMBER() OVER (PARTITION BY Article ORDER BY DataDate) -
           ROW_NUMBER() OVER (PARTITION BY Article, MRP ORDER BY DataDate) AS grp
    FROM mytable
)
SELECT Article, EANUPC, MRP, MIN(DataDate) AS FromDate
FROM CTE 
GROUP BY Article, EANUPC, MRP, grp 

Output:

Article     EANUPC  MRP FromDate
----------------------------------
20171554001 1220636 599 2015-11-20
20171554001 1220636 390 2017-12-26
20171554001 1220636 290 2018-01-16
20171554001 1220636 190 2018-01-30
20171554001 1220636 90  2018-02-06
20171554001 1220636 599 2018-06-13

You can get the final result using LEAD:

;WITH CTE AS (
    SELECT Article, EANUPC, MRP, DataDate,
           ROW_NUMBER() OVER (PARTITION BY Article ORDER BY DataDate) -
           ROW_NUMBER() OVER (PARTITION BY Article, MRP ORDER BY DataDate) AS grp
    FROM mytable
), CTE2 AS (
    SELECT Article, EANUPC, MRP, MIN(DataDate) AS FromDate
    FROM CTE 
    GROUP BY Article, EANUPC, MRP,grp  
)
SELECT Article, EANUPC, MRP, FromDate, 
       COALESCE(DATEADD(day, -1, LEAD(FromDate) OVER (ORDER BY FromDate)), FromDate) AS ToDate
FROM CTE2

Output:

Article     EANUPC  MRP FromDate    ToDate
----------------------------------------------
20171554001 1220636 599 2015-11-20  2017-12-25
20171554001 1220636 390 2017-12-26  2018-01-15
20171554001 1220636 290 2018-01-16  2018-01-29
20171554001 1220636 190 2018-01-30  2018-02-05
20171554001 1220636 90  2018-02-06  2018-06-12
20171554001 1220636 599 2018-06-13  2018-06-13
Giorgos Betsos
  • 71,379
  • 9
  • 63
  • 98