5

I have a table with data like this

+-------------+--------------+------------+----------------+
| CustomerSID | StartDateSID | EndDateSID | MarketingOptIn |
+-------------+--------------+------------+----------------+
|       12345 |     20101019 |   20131016 | Y              |
|       12345 |     20131017 |   20140413 | Y              |
|       12345 |     20140414 |   20140817 | N              |
|       12345 |     20140818 |   20141228 | N              |
|       12345 |     20141229 |   20150125 | Y              |
|       12345 |     20150126 |          0 | Y              |
+-------------+--------------+------------+----------------+

I need to create a view on top of this table to have data formatted in the below format for the Flag, basically the duration for which the Flag was Y or N. (EndDateSID - 0 is currently active, so today's date)

+-------------+--------------+------------+----------------+
| CustomerSID | StartDateSID | EndDateSID | MarketingOptIn |
+-------------+--------------+------------+----------------+
|       12345 |     20101019 |   20140413 | Y              |
|       12345 |     20140414 |   20141228 | N              |
|       12345 |     20141229 |   20150825 | Y              |
+-------------+--------------+------------+----------------+

Most customers only have a change in their Flag once, hence below query works:

SELECT 
CH1.CustomerSID
,MIN(CH1.StartDateSID) StartDate
,MAX(ISNULL(NULLIF(CH1.EndDateSID,0),CONVERT(INT, CONVERT(VARCHAR, GETDATE(), 112)))) EndDate
,CH1.MarketingOptIn 
FROM DWH.DimCustomerHistory CH1
GROUP BY CH1.CustomerSID, CH1.MarketingOptIn
ORDER BY CH1.CustomerSID, CH1.MarketingOptIn

How can I achieve the intended output for customers like the one above, having changes in the flag more than once?

Rohit P
  • 613
  • 5
  • 13
  • Hello Rohit welcome to StackOverflow, next time try to provide a [**SqlFiddle**](http://sqlfiddle.com/#!15/5368b/6) so we can understand the problem better and give you an answer much faster – Also please read [**How to ask**](http://stackoverflow.com/help/how-to-ask) Also read [**How to create a Minimal, Complete, and Verifiable example.**](http://stackoverflow.com/help/mcve) – Juan Carlos Oropeza Aug 25 '15 at 16:44
  • possible duplicate of [Group data by the change of grouping column value in order](http://stackoverflow.com/questions/10110026/group-data-by-the-change-of-grouping-column-value-in-order) – Bulat Aug 25 '15 at 16:49
  • Hi @JuanCarlosOropeza, thank you for the suggestions, I'll follow them next time onwards. – Rohit P Aug 26 '15 at 09:53

2 Answers2

6

You can use the following query:

SELECT CustomerSID,
       MIN(StartDateSID) AS StartDate,
       MAX(ISNULL(NULLIF(EndDateSID,0),
           CONVERT(INT, CONVERT(VARCHAR, GETDATE(), 112)))) AS EndDate,
       MarketingOptIn
FROM (       
  SELECT CustomerSID, StartDateSID, EndDateSID, MarketingOptIn,
         ROW_NUMBER() OVER (ORDER BY StartDateSID) - 
         ROW_NUMBER() OVER (PARTITION BY CustomerSID, MarketingOptIn 
                            ORDER BY StartDateSID) AS grp     
  FROM DimCustomerHistory ) AS t
GROUP BY CustomerSID, MarketingOptIn, grp
ORDER BY StartDate

Calculated field grp serves to identify consecutive records having the same MarketingOptIn value.

Utilizing this field in an outer query, you can easily GROUP BY and apply MIN and MAX aggregate functions in a way similar to your original query.

Demo here

Giorgos Betsos
  • 71,379
  • 9
  • 63
  • 98
5

This is a gaps and islands problem. You need to use ROW_NUMBER() to identify your gaps, so the start stage would be:

SELECT  CustomerSID,
        StartDateSID,
        EndDateSID,
        MarketingOptIn,
        TotalRowNum = ROW_NUMBER() OVER(PARTITION BY CustomerSID ORDER BY StartDateSID),
        RowNumInGroup = ROW_NUMBER() OVER(PARTITION BY CustomerSID, MarketingOptIn ORDER BY StartDateSID),
        GroupID = ROW_NUMBER() OVER(PARTITION BY CustomerSID ORDER BY StartDateSID) -   
                ROW_NUMBER() OVER(PARTITION BY CustomerSID, MarketingOptIn ORDER BY StartDateSID)
FROM    dbo.YourTable;

Output:

CustomerSID StartDateSID    EndDateSID  MarketingOptIn  TotalRowNum RowNumInGroup   GroupID
---------------------------------------------------------------------------------------------------
12345       20101019        20131016    Y               1           1               0
12345       20131017        20140413    Y               2           2               0
12345       20140414        20140817    N               3           1               2
12345       20140818        20141228    N               4           2               2
12345       20141229        20150125    Y               5           3               2
12345       20150126        0           Y               6           4               2

The key here is that by taking the row number of each row, and also the row number of each row withing the group, you can get a unique identifier (GroupID + MarketingOptIn) which identifies each of your islands. Then it is just a case of grouping by this identifier when doing your aggregates:

FULL WORKING EXAMPLE

DECLARE @T TABLE 
(   
    CustomerSID INT,
    StartDateSID INT,
    EndDateSID INT,
    MarketingOptIn CHAR(1)
)
INSERT @T 
VALUES
    (12345, 20101019, 20131016, 'Y'),
    (12345, 20131017, 20140413, 'Y'),
    (12345, 20140414, 20140817, 'N'),
    (12345, 20140818, 20141228, 'N'),
    (12345, 20141229, 20150125, 'Y'),
    (12345, 20150126, 0, 'Y');


WITH CTE AS
(
    SELECT  CustomerSID,
            StartDateSID,
            EndDateSID,
            MarketingOptIn,
            GroupID = ROW_NUMBER() OVER(PARTITION BY CustomerSID ORDER BY StartDateSID) -   
                    ROW_NUMBER() OVER(PARTITION BY CustomerSID, MarketingOptIn ORDER BY StartDateSID)
    FROM    @T
)
SELECT  CustomerSID, 
        StartDateSID = MIN(StartDateSID),
        EndDateSID = CASE WHEN MIN(EndDateSID) = 0 THEN CONVERT(INT, CONVERT(VARCHAR(8), GETDATE(), 112)) ELSE MAX(EndDateSID) END,
        MarketingOptIn
FROM    CTE
GROUP BY CustomerSID, MarketingOptIn, GroupID
ORDER BY CustomerSID, StartDateSID;
GarethD
  • 68,045
  • 10
  • 83
  • 123