2

Good day,

I am using MS T-SQL in SSMS. I am trying to determine a way to base a window function on a variable or a case statement. I need the window to continue to loop until a certain result happens and then cut the window by those observations. Additionally, perhaps there is better way to do this outside of Windows also, I’ve researched the WHILE expression as a potential options also.

Here is an example dataset:

   CREATE TABLE #Items ([Transaction] int, [Item] varchar(10), [Price] money, [Price2] varchar(25), [SaleDate] smalldatetime)

   INSERT INTO   #Items ([Transaction], [Item], [Price], [Price2], [SaleDate])
   VALUES (123,'Blue',1100,'Blue1100','04/10/2018'),
   (124,'Blue',1100,'Blue1100','04/09/2018'),
   (125,'Blue',1100,'Blue1100','04/09/2018'),
   (126,'Blue',3000,'Blue3000','03/27/2018'),
   (127,'Blue',1100,'Blue1100','03/27/2018'),
   (128,'Blue',1100,'Blue1100','03/27/2018'),
   (129,'Blue',1100,'Blue1100','03/27/2018'),
   (130,'Blue',1100,'Blue1100','03/27/2018'),
   (131,'Red',3328,'Red3328','04/12/2018'),
   (132,'Red',3328,'Red3328','04/09/2018'),
   (133,'Red',3328,'Red3328','04/06/2018'),
   (134,'Red',3328,'Red3328','04/04/2018'),
   (135,'Red',3328,'Red3328','04/02/2018'),
   (136,'Red',3328,'Red3328','04/02/2018'),
   (137,'Yellow',1340,'Yellow1340','04/09/2018'),
   (138,'Yellow',1340,'Yellow1340','04/08/2018'),
   (139,'Yellow',1340,'Yellow1340','04/08/2018'),
   (140,'Yellow',1500,'Yellow1500','04/05/2018'),
   (141,'Yellow',1340,'Yellow1340','04/05/2018'),
   (142,'Yellow',1340,'Yellow1340','04/05/2018'),
   (143,'Yellow',1340,'Yellow1340','04/02/2018'),
   (144,'Yellow',1340,'Yellow1340','03/31/2018')

In this case I’d like the window to look at each [Item] sorted by [SaleDate] desc and create the window size by all orders sequential that have the same [Price]. So long as there is no interruption in [Price] the window would be based on those observations. Once that determination is made, I would like to add a column that pulls in the minimum [SaleDate] for those observations in the window.

The desired result is on the last column of this table.
enter image description here

The following a query I’m trying to use but can’t figure out how to get the window to loop:

SELECT 
Transaction
, Item
, Price
,Price2
, SaleDate
,(CASE WHEN PRICE2 = LEAD(PRICE2) OVER (ORDER BY Price ASC, SaleDate DESC ) 
AND LEAD(PRICE2) OVER (ORDER BY Price ASC, SaleDate DESC ) = LEAD(PRICE2) OVER (ORDER BY Price ASC, SaleDate DESC ) 
THEN MIN(SaleDate) ELSE NULL END) 'StartDate'

FROM #Items

GROUP BY 
Transaction
, Item
, Price
,Price2
, SaleDate

Any and all suggetions are welcome! Thank you!

DxTx
  • 3,049
  • 3
  • 23
  • 34
Shore2020
  • 23
  • 2

2 Answers2

2

This can be done via a Recursive CTE.

Example SQL:

;with CTE as (
    SELECT [Transaction], Item, Price, Price2, SaleDate, 
    row_number() over (partition by Item order by SaleDate asc, [Transaction] desc) as rn
    FROM #Items
)
, RCTE as (
     SELECT 
       c.*, 
       SaleDate as SaleDatePriceChange
     from CTE c
     where rn = 1

     union all

     SELECT 
       c.*,
       case when c.Price = r.Price then r.SaleDatePriceChange else c.SaleDate end
     from RCTE r 
     join CTE c on c.Item = r.Item and c.rn = r.rn + 1   

)
select [Transaction], Item, Price, Price2, SaleDate, SaleDatePriceChange
from RCTE
order by Item, rn desc;

The first CTE is used to add a row_number.

Which makes it easier to join to the next record in the Recursive CTE.

LukStorms
  • 28,916
  • 5
  • 31
  • 45
  • Thanks for the rep. Btw, just a remark. You'll notice that the CTE orders by descending transaction. That's to get those expected results. And I don't know your real data, so it's probably as it should be. But personally I found it a bit odd that the transaction has an opposite order of the SaleDate.But then again, this is just an example, and from there you can fiddle with the used order to make it suit your real data. – LukStorms May 14 '18 at 08:41
  • Yeah, I had to flip from DESC to ASC on the row number CTE. I an issue I'm running into now is this errors out, I get "The statement terminated. The maximum recursion 100 has been exhausted before statement completion." I have tried using 'option (maxrecursion 0)' for the RCTE but this runs for a long time (6 hours+). This is running on about six million rows so I may have to break it up in chucks. I don't believe I have an infinite loops that would cause an issue. – Shore2020 May 15 '18 at 21:35
  • @Shore2020 A problem to the speed could be that it joins to the calculated RN. So it doesn't use an index. But linking to an primary key, f.e. `join #Items c on c.Item = r.Item and c.ID = r.ID + 1` would use an index. But that would also assume that there are no gaps between the numbers. And I assume you want to run your query on the real big table. Well, perhaps you should unaccept my answer. And update your question. Originally I tried to find another way than going recursive. But I didn't find it at the time, so I went with this. Maybe someone has inspiration for a faster method. – LukStorms May 15 '18 at 22:13
  • I am going to tinker with this for a little while, then if I don't get it to work, I'll come back and update to see if others have any ideas for performance. Thanks for all the help! – Shore2020 May 16 '18 at 14:28
0

Here's another method to get those start dates.
No recursive this time.

It uses LAG to get the Price of the previous record.
So that the previous Price can be compared with the Price of the record.
The IIF will then return 1 if the price changed, 0 if not.

A SUM window function will add the 1 and 0 up, which results in a ranking.
The trick is that the rank number doesn't increase when 0 is added.

And then FIRST_VALUE is used to get the first SaleDate within it's rank.

select 
  [Transaction], Item, Price, Price2, SaleDate, 
  first_value(t.SaleDate) over (partition by t.Item, q2.PriceChangeRank order by t.SaleDate, t.[Transaction] desc) as PriceChangedDate
from
(
    select TransactionID,
      sum(PriceChanged) over (partition by Item order by SaleDate, TransactionID desc) as PriceChangeRank
    from
    (
        select [Transaction] as TransactionID, Item, SaleDate,
          iif(Price != lag(Price) over (partition by Item order by SaleDate, [Transaction] desc),1,0)  as PriceChanged
        from #Items
    ) q1
) q2
join #Items t on (t.[Transaction] = q2.TransactionID)
order by [Transaction];

Not sure if this method will faster than with the Dynamic SQL. But it's an alternative.

LukStorms
  • 28,916
  • 5
  • 31
  • 45