1

I'm struggling with some SQL, which should both join data from two tables together, but where I also need to parse a JSON array, into rows in the output table...

Here are examples of my 2 tables that I'd like to join together:

Order

Order_Number Delivery_Date
1 2020-05-18 07:00:00.000
2 2020-08-31 07:30:00.000

And History:

Order_Number History_Details
1 [{"FieldName":"OrderStatusType.Description","FieldType":"String","ValueBefore":"Delivered","ValueAfter":"Undelivered","Action":2}]
1 [{"FieldName":"VoucherCode","FieldType":"String","ValueBefore":"","ValueAfter":"64646456","Action":1},{"FieldName":"PricingType.Description","FieldType":"String","ValueBefore":"Standard","ValueAfter":"Discount","Action":2}]
2 [{"FieldName":"InitialComment","FieldType":"String","ValueBefore":"","ValueAfter":"Test Comment","Action":2},{"FieldName":"Appointment.Date","FieldType":"DateTime","ValueBefore":"2020-08-06T07:30:00.000","ValueAfter":"2020-08-31T07:30:00.000","Action":0}]
2 null

(That's unfortunately, a string "null", not NULL - we'll have to deal with when joining the data)

What I'd like to get to is an output like this:

Order Number Delivery Date FieldName ValueBefore ValueAfter
1 2020-05-18 07:00:00.000 OrderStatusType.Description Delivered Undelivered
1 2020-05-18 07:00:00.000 VoucherCode 64646456
1 2020-05-18 07:00:00.000 PricingType.Description Standard Discount
2 2020-08-31 07:30:00.000 InitialComment Test Comment
2 2020-08-31 07:30:00.000 Appointment.Date 2020-08-06T07:30:00.000 2020-08-31T07:30:00.000

I can do the 2 queries on their own, I'm just struggling to join them...

I.e. this gives me everything without the JSON broken up:

SELECT
  o.order_number as [Order Number],
  o.delivery_date as [Delivery Date],
  oh.history_details as [History]
FROM [dbo].[Order] o
JOIN [dbo].[History] oh on oh.order_number = o.order_number
WHERE oh.history_details != 'null'

While I can do JSON Mapping like:

DECLARE @json NVARCHAR(MAX)
SET @json='[{"FieldName":"VoucherCode","FieldType":"String","ValueBefore":"","ValueAfter":"64646456","Action":1},{"FieldName":"PricingType.Description","FieldType":"String","ValueBefore":"Standard","ValueAfter":"Discount","Action":2}]';
SELECT *
FROM OPENJSON(@json)
WITH (   
   FieldName   varchar(200) '$.FieldName' ,  
   ValueBefore     varchar(200)     '$.ValueBefore',  
   ValueAfter varchar(200) '$.ValueAfter'  
 ) 

I'm just really struggling to work out how to join those two together and get the result I was looking for, a lot of the OPENJSON examples are like the one I have above, they seem to expect a single column/datatype.

John McDonnell
  • 753
  • 1
  • 8
  • 24
  • Have you tried CROSS APPLYing OpenJson(History_Details)? – AlwaysLearning Mar 26 '21 at 11:50
  • You can't have different data types in the same column within a SQL database, so your desired output columns `ValueBefore` and `ValueAfter` aren't possible without holding the values as a text string. Do you want to have `date` values as text in those columns or are you going to handle that in another way? – iamdave Mar 26 '21 at 12:00
  • @iamdave I thought that because all the JSON values are wrapped in double-quotes(even the numbers), it's simpler to keep them all as strings and not have to worry about that complexity. – John McDonnell Mar 26 '21 at 13:25
  • @AlwaysLearning I did try that but I struggled to work out how to use that. – John McDonnell Mar 26 '21 at 13:25
  • @JohnMcDonnell You can see both approaches in my answer below – iamdave Mar 26 '21 at 13:31

2 Answers2

1

There is a bit of a problem in your question here, in that your desired output contradicts the rest of your question. You can't have different data types within the same column, so your desired output is not possible whilst retaining the appropriate data types for each value.

Your options are to either keep the data as a text string, as it is displayed in the JSON or to have multiple columns for every possible data type that could be captured.

If this is just a log table to capture the change history, which you won't be regularly querying, you can probably get away with just keeping the values as text strings.


Query

declare @Order table(Order_Number int,Delivery_Date datetime);
insert into @Order values
 (1,'2020-05-18 07:00:00.000')
,(2,'2020-08-31 07:30:00.000')
;

declare @History table(Order_Number int, History_Details varchar(max));
insert into @History values
 (1,'[{"FieldName":"OrderStatusType.Description","FieldType":"String","ValueBefore":"Delivered","ValueAfter":"Undelivered","Action":2}]')
,(1,'[{"FieldName":"VoucherCode","FieldType":"String","ValueBefore":"","ValueAfter":"64646456","Action":1},{"FieldName":"PricingType.Description","FieldType":"String","ValueBefore":"Standard","ValueAfter":"Discount","Action":2}]')
,(2,'[{"FieldName":"InitialComment","FieldType":"String","ValueBefore":"","ValueAfter":"Test Comment","Action":2},{"FieldName":"Appointment.Date","FieldType":"DateTime","ValueBefore":"2020-08-06T07:30:00.000","ValueAfter":"2020-08-31T07:30:00.000","Action":0}]')
,(2,'null')
;

select o.Order_Number
      ,o.Delivery_Date
      ,j.FieldName
      ,j.ValueBefore
      ,j.ValueAfter

      ,case when j.FieldType = 'String' then j.ValueBefore end as ValueBeforeString
      ,case when j.FieldType = 'DateTime' then try_convert(datetime,j.ValueBefore,127) end as ValueBeforeDateTime
      ,case when j.FieldType = 'String' then j.ValueAfter end as ValueAfterString
      ,case when j.FieldType = 'DateTime' then try_convert(datetime,j.ValueAfter,127) end as ValueAfterDateTime
from @Order as o
    join @History as h
        on o.Order_Number = h.Order_Number
    outer apply openjson(h.History_Details)
                with (FieldName   varchar(200) '$.FieldName' 
                     ,FieldType   varchar(200) '$.FieldType'
                     ,ValueBefore varchar(200) '$.ValueBefore'
                     ,ValueAfter  varchar(200) '$.ValueAfter'
                     ) as j
where h.History_Details <> 'null'
order by o.Order_Number
        ,o.Delivery_Date;

Output

Order_Number Delivery_Date FieldName ValueBefore ValueAfter ValueBeforeString ValueBeforeDateTime ValueAfterString ValueAfterDateTime
1 2020-05-18 07:00:00.000 OrderStatusType.Description Delivered Undelivered Delivered NULL Undelivered NULL
1 2020-05-18 07:00:00.000 VoucherCode 64646456 NULL 64646456 NULL
1 2020-05-18 07:00:00.000 PricingType.Description Standard Discount Standard NULL Discount NULL
2 2020-08-31 07:30:00.000 InitialComment Test Comment NULL Test Comment NULL
2 2020-08-31 07:30:00.000 Appointment.Date 2020-08-06T07:30:00.000 2020-08-31T07:30:00.000 NULL 2020-08-06 07:30:00.000 NULL 2020-08-31 07:30:00.000
iamdave
  • 12,023
  • 3
  • 24
  • 53
  • This is perfect, thank you. I've kept everything as a "string", as that's perfect for my use case, so I didn't need to include the case statements. – John McDonnell Mar 26 '21 at 13:50
1

You can use OUTER APPLY to push the data in from the History table into OPENJSON.

We can use NULLIF to null out any JSON which says the string 'null'

SELECT
  o.order_number as [Order Number],
  o.delivery_date as [Delivery Date],
  j.FieldName,
  j.ValueBefore,
  j.ValueAfter
FROM [dbo].[Order] o
JOIN [dbo].[History] oh on oh.order_number = o.order_number
OUTER APPLY OPENJSON(NULLIF(oh.history_details, 'null'))
WITH (   
   FieldName   varchar(200) '$.FieldName' ,  
   ValueBefore     varchar(200)     '$.ValueBefore',  
   ValueAfter varchar(200) '$.ValueAfter'  
 ) j

As mentioned by others, you cannot have different data types in the same column, so it would have to remain varchar.

Charlieface
  • 52,284
  • 6
  • 19
  • 43