Join large table with JSON data with another table

Question

We are using Azure SQL database having tables for saving document information, document approvers, package information etc.

In document information table we save metadata in JSON string since metadata doesn't have fixed schema.

We are facing issues due to increasing number of rows in document information day by day.

For making report we are using OPENJSON on the metadata column and then PIVOT to flatten and then join on document approvers table.

select * from (
    select DocInfo.DocName, MyJSON.COLID as COLID, MyJSON.Value as Val, DocInfo.DocSource
    from DocInfo 
    outer apply
    OPENJSON(DocInfo.DocMetadata)
    with (
        COLID INT '$.ID',
        Value nvarchar(max) '$.Value'
    )  as MyJSON
    inner join PackageInfo on
    PackageInfo.ID=DocInfo.PackageID
) t 
Pivot (
    max(Val)
    for COLID in (
    [49])
) as pivot_table

Metadata JSON value is e.g.

[{"ID":74,"Value":"20000","Name":"Amount"},{"ID":42,"Value":"North America","Name":"Business Area"},
{"ID":85,"Value":"city1","Name":"City"},{"ID":49,"Value":"Client Name 123","Name":"Client"},
{"ID":90,"Value":"ClientField1 Value","Name":"ClientField1"},
{"ID":84,"Value":"Client Field 123","Name":"ClientField2"}]

ID, Name, Value is metadata column ID, Name and Value respectively.

Suppose number of documents is 200000 and JSON values are with 8 fields then on OPENJSON outer apply we get 1600000 rows. We then Pivot the data which makes number of rows to 200000 again.

Most of the times we need only 2 or 3 metadata values from JSON. E.g. Business Area, Client, Amount.

Is there a way to use outer apply with OPENJSON on specific fields only? I am thinking this will improve the query performance.

Any pointer will be helpful or if there is any other way to improve the performance.

Thank you!

Scrap the pivot and replace it with `WHERE COLID = 49` maybe? To be honest, the JSON iitself is unnecessarily verbose, it could easily be `{"Amount":"20000", ""Business Area"":"North America", "City":"city1", "Client":"Client Name 123", "ClientField1":"ClientField1 Value", "ClientField2":"Client Field 123"}` and then a `OPENJSON` would have just one row to deal with — Charlieface, Jul 25 '22 at 16:13
@Charlieface Agreed we can reduce JSON as per your suggestion. Regarding using COLID = 49, how can I use if I need to use 2 metadata values without using Pivot? — Sushrut Paranjape, Jul 26 '22 at 10:25
You're right you cannot with your current JSON design. But it might be better to do a maunal pivot using `MAX(CASE` https://dbfiddle.uk/?rdbms=sqlserver_2019&fiddle=ab633dd560c3dcd9f7354637c712ec70. If you sorted out the JSON you wouldn't need to do this as it would all be in one row — Charlieface, Jul 26 '22 at 10:30
Thank you @Charlieface, this improved the time taken for the report considerably when we use 2-3 fields only. — Sushrut Paranjape, Jul 27 '22 at 12:05

Join large table with JSON data with another table

0 Answers0