0

I've been having one of those weeks...

I am creating a simple POC for the Director, who is looking for weekly stats of order values shipping.

The scenario they asked is "At the end of each week (Sunday), I want to see the orders total ($), for the next 52 Weeks, based on the shipping date".

Standard SCD2 schemas, expanded below. Simple Date Dimension.

This is simple to answer in its basic form, DIMDATE.DATEKEY_SK = ORDERS.ORDERDATE_SK WHERE ORDERS.CURRENTRECORD ='Y' GROUP BY DIMDATE.WEEKEND. Which gives me all required DIMDATE attributes e.g. WeekStart, WeekEnd, MonthStart, MonthEnd etc etc.

However, the Director has now said he would like to see the previous 12 weeks as well, (-12/+52) in order to compare week on week trends.

Now i could insert all data from the "simple" query into a snapshot table each week which would solve the problem BUT i feel this is the cheats way out and there must be a way to do the same through T-SQL. This is where i am stuck because i have [EffectiveDate] and [EndDate] in my ORDERS table, along with the [OrderShipDate_SK].

Example Schemas Below:

DIMDATE

DATEKEY_SK WeekStart WeekEnd
20230101 2023/01/01 2023/01/07
20250131 2023/01/29 2023/02/04

ORDERS

ORDERKEY_SK EffectiveDate EndDate OrderId OrderAmount OrderShipDate_SK CurrentRecord
1234 2023/01/01 2023/01/03 IF76876 35000 20230107 N
1234 2023/01/04 2023/01/08 IF76876 37000 20230107 N
3345 2023/01/09 2999/12/31 IF76876 40000 20230114 Y
1432 2023/01/01 2023/01/12 IF89996 10000 20230107 N
5456 2023/01/13 2999/12/31 IF89996 12000 20230114 Y
1932 2023/01/01 2023/01/22 IF62336 10000 20230131 N
7456 2023/01/23 2999/12/31 IF62336 12000 20230218 Y

What I'm after is like the below:

WeekStart WeekEnd OrderTotal ShippingWeekEnd (OrderShipDate)
2023/01/01 2023/01/07 47000 2023/01/07
2023/01/01 2023/01/07 0 2023/01/14
2023/01/01 2023/01/07 0 2023/01/21
2023/01/01 2023/01/07 0 2023/01/28
2023/01/01 2023/01/07 10000 2023/02/04
2023/01/08 2023/01/14 52000 2023/01/14
2023/01/08 2023/01/14 0 2023/01/21
2023/01/08 2023/01/14 0 2023/01/28
2023/01/08 2023/01/14 10000 2023/02/04
2023/01/15 2023/01/21 0 2023/01/21
2023/01/15 2023/01/21 0 2023/01/28
2023/01/15 2023/01/21 10000 2023/02/04
2023/01/15 2023/01/21 0 2023/02/11
2023/01/22 2023/01/28 12000 2023/01/28
2023/01/22 2023/01/28 0 2023/02/04
2023/01/22 2023/01/28 0 2023/02/11
2023/01/22 2023/01/28 12000 2023/02/18

I want to show (64) rows for each week start/end with the sum of orders, grouped by the shipping week end (always Saturdays) E.g. If the order changes mid week only the latest record during that week is factored in at the end of the week.

I just cant figure out how i need to construct the query to do this. Existing Query Below

SELECT SUM(OrderTotal),DIMDATE.WeekEnd
FROM ORDERS
LEFT JOIN DIMDATE ON DIMDATE.DATEKEY_SK = ORDERS.OrderShipDate_SK
WHERE ORDERS.CurrentRecord = 'Y'
GROUP BY DIMDATE.WeekEnd DESC
a.Smith
  • 11
  • 4
  • Hi - it’s unclear, to me, what you are trying to achieve. Can’t you just take the current record for each order and group by the Ship Date? What are these 24 rows for each week? It’s not obvious how the sample data you’ve provided equates to the result table you’ve given: how do you get 47000 for 2023/01/07? – NickW Jan 23 '23 at 19:28
  • Hi - 24 rows for each week was a typo. I need to show 64 rows. For example, this week starts on 2023/01/22. Therefore i would need 64 rows dated 2023/01/22, showing the last 12 weeks order totals and the next 52 weeks order totals. The 47000 is because the order ID IF76876 was amended on 2023/01/04 to 37000 and would have been the current record for that week. Add to this order IF89996 for 10000. Both orders were due to be shipped on 2023/01/07. Basically, i cant group as suggested because the current record may not have existed on that particular week when looking back at history (12 weeks). – a.Smith Jan 24 '23 at 08:28
  • Hi - so do you just want to include the version of an order record that was effective on the Order Ship Date? If so then you just need a WHERE clause that does something like (pseudo-code): EffectiveDate <= to_date(OrderShip_Date_SK) AND EndDate >= to_date(OrderShip_Date_SK) – NickW Jan 24 '23 at 13:42

1 Answers1

1

I think i have cracked it after sleeping on it.

First use the date dimensions to list all possible week start combinations, then do some extra calculations to generate some other dates.

WITH
DATES AS (
SELECT distinct
DDATE.FirstOfWeek as 'WeekStart',
CAST(REPLACE(DDATE.FirstOfWeek,'-','')as int) AS 'skWeekstart', -- Want this as int for joins
DDATE.LastOfWeek,
CAST(REPLACE(DDATE.LastOfWeek,'-','')as int) AS 'skWeekend', -- Want this as int for joins
CAST(DATEADD(MONTH, DATEDIFF(MONTH, 0, FirstOfWeek)-1, 0) as date) AS LastMonth,
CAST(REPLACE(CAST(DATEADD(month, DATEDIFF(month, 0, FirstOfWeek), 0) as date),'-','') as int) AS MonthStart,
CAST(REPLACE(EOMONTH(FirstOfWeek),'-','') as int) as MonthEnd
FROM vw_dim_date DDATE
WHERE EOMONTH(FirstOfWeek) > '2021-01-01' AND LastOfWeek <= DATEADD(wk, DATEDIFF(wk, 6, CURRENT_TIMESTAMP), 6 + 7)  --- No point looking into the future as it hasnt happened yet.
),
WeekStart skWeekstart LastOfWeek skWeekend LastMonth MonthStart MonthEnd
2021-01-03 20210103 2021-01-09 20210109 2020-12-01 20210101 20210131
2021-01-10 20210110 2021-01-16 20210116 2020-12-01 20210101 20210131
2021-01-17 20210117 2021-01-23 20210123 2020-12-01 20210101 20210131
2021-01-24 20210124 2021-01-30 20210130 2020-12-01 20210101 20210131
2021-01-31 20210131 2021-02-06 20210206 2020-12-01 20210101 20210131

Now get all the surrogate keys for the ORDERS and the effective dates of those records.


--- Get all the surrogate keys for ORDERS and the effective dates. Only looking for non deleted records and those with a shipping date 
ORDERS AS (
SELECT 
skOrder,orderid,
CAST(REPLACE(EffectiveDate,'-','') as int) as skeffdate,
CAST(REPLACE(EndDate,'-','') as int) as skenddate
FROM vw_dim_Orders
WHERE IsDeleted =0 AND skShippingDate <>-1
),
skOrder OrderId skeffdate skenddate
12005892 IG67787 20220325 20220710
12017996 IG577872 20220711 99991231
12000030 IH66798 20211110 20220324

Join back to the DATES CTE and group by Date attributes. Join is using skeffectivedate < skweekend which works because im selecting the latest surrogate key for that record for that week.

--- Now join back into the weeks to get a list of ORDERS for each week.
--- 1) Join on ORDER effective date < dates current weekend
--- 2) Get the MAX surrogate Key for that ORDER for that week. This ensures the latest record for that week is taken. E.g. 2 + changes, only the last change for that week is taken.
ORDERSKS AS (
SELECT DATES.*,
MAX(skOrders) AS skOrders
FROM DATES
LEFT JOIN ORDERS ON (skeffdate < skWeekend)
GROUP BY DATES.WeekStart,DATES.skWeekstart,DATES.LastOfWeek,DATES.LastMonth,DATES.MonthStart,DATES.MonthEnd,skWeekend,orderId
)

This provides a list of ORDER surrogate keys which were valid on that week. Natuarlly there are millions of rows returned, but the results are narrow so it runs in seconds.

WeekStart skWeekstart LastOfWeek skWeekend LastMonth MonthStart MonthEnd skOrder
2023-01-22 20230122 2023-01-28 20230128 2022-12-01 20230101 20230131 12011381
2023-01-22 20230122 2023-01-28 20230128 2022-12-01 20230101 20230131 12028584
2023-01-22 20230122 2023-01-28 20230128 2022-12-01 20230101 20230131 12014886
2023-01-22 20230122 2023-01-28 20230128 2022-12-01 20230101 20230131 12011382

With this i can join back to the dimension on the SK to get the shipping dates and order totals and tally from there.

It may not be the prettiest but it gives me every single order record for each week, factoring in if the record was updated mid week. This is really only for history to be able to look back 12 weeks, but i can look back as far as i have data now.

a.Smith
  • 11
  • 4