1

I saw sometimes you can use a cross apply, but I feel it won't work in this case as I have 10 columns for "Days" (for 10 years) and 10 columns for "Discharges" (for 10 years).....so I need this pivoted into 10 different row per zip and age group.

Starting output:

Zipcode Age 2022 Days 2023 Days 2022 Discharges 2023 Discharges
12345 1-10 1 2 3 4
12345 11-20 5 6 7 8

Desired output:

Zipcode Age Year Days Discharges
12345 1-10 2022 1 2
12345 1-10 2023 3 4
12345 11-20 2022 5 6
12345 11-20 2022 7 8

The top is essentially what I have (Just only kept 2022 and 2023, but will really be 2022 - 2032).

Is it possible to do two unpivots, a cross apply or something else to achieve this? This is also a realllllllly large dataset (1.5 billion rows), so I'll have to do it in batches, but would rather not do one unpivot, and then another unpivot.

Thanks in advance!

manavjn
  • 15
  • 4

2 Answers2

0

This will dynamically UNPIVOT your data without having to use Dynamic SQL or having to list the desired columns

Note: There will be a price to pay for 1.5 billion rows.

You could use the VALUE approach to UNPIVOT your data, but you would have to specify all the columns up to 2032

Example or dbFiddle

Select A.Zip_Code
      ,A.Age_Cohort
      ,B.*
From  YourTable A
 Cross Apply ( Select [Year]       = try_convert(int,Left([Key],4))
                     ,[Days]       = sum( case when [key] like '%Days%'       then try_convert(int,Value) else 0 end)
                     ,[Discharges] = sum( case when [key] like '%Discharges%' then try_convert(int,Value) else 0 end)
                From  OpenJson((Select A.* For JSON Path,Without_Array_Wrapper )) 
                Where [Key] like '[0-9][0-9][0-9][0-9]_D%'  -- or you can use [Key] not in ('Zip_Code','Age_Cohort')
                Group By Left([Key],4)
             ) B

Results

enter image description here

John Cappelletti
  • 79,615
  • 7
  • 44
  • 66
  • Thanks @JohnCappelletti, after running the whole table, it will be about 16 billion rows. Do you think there would be a better way to store the table? Splitting up the Days and Discharges maybe? Or is this really the best method and best way to store it? – manavjn May 06 '23 at 20:34
  • @manavjn Your desired structure is certainly more usable and easier to analyze. That said, I have no idea what your USE CASE is. – John Cappelletti May 06 '23 at 20:40
  • @manavjn If you want to test performance just add >>> WHERE Zip_Code Like '35%' – John Cappelletti May 06 '23 at 20:47
0

You can unpivot using CROSS APPLY (VALUES, this allows you to generate multiple rows from a single outer row.

SELECT
  t.Zip_Code,
  t.Age_Cohort,
  v.*
FROM @YourTable t
CROSS APPLY (VALUES
    (2022, t.[2022_Days], t.[2022_Discharges]),
    (2023, t.[2023_Days], t.[2023_Discharges])
) v(Year, Days, Discharges);

db<>fiddle

Charlieface
  • 52,284
  • 6
  • 19
  • 43