Create a virtual column and index to optimize a filter for an array in a JSON column

Question

Here is the microsoft document for how to optimize a json column with out using OpenJson and only using Json_Value: https://learn.microsoft.com/en-us/sql/relational-databases/json/index-json-data?view=sql-server-ver15

My issue is that I have a JSON column that contains an array where I am trying to grab all keys called Test_ID from each element in the array to compare with a joined statements testId and, while it works, its relatively slow. Takes about 9 seconds for 400 rows. I am trying to speed this up exponentially and it seems the only way to do so is through the indexing mentioned in that article but I can not seem to figure out how to do it for an array.

My JSON is similar to this: '{"Property":{"Label":"0"},"Tests":[{"Test_ID":"GUID_HERE","Type":{"Label":" "},"Name":{"Label":" "},"Value":null,{"Test_ID":"GUID_HERE","Type":{"Label":" "},"Name":{"Label":" "},"Value":" "}]}'

Here is my scrubbed query

SELECT DISTINCT w.W_ID, 
       'Proc' ProcHeaderName, p.ProcNumber ProcValue, 
       'Class' ClassHeaderName, p.Class ClassValue 
INTO #Procs
FROM proc p
LEFT JOIN (SELECT wt.W_ID, wt.TestId 
    from TestValue wt where wt.IsDeleted = 0) as wtRow on wtRow.W_ID in (SELECT ID FROM #tmp) 
LEFT JOIN TableNameHere c on c.IsDeleted = 0 and c.col_ID in (SELECT col_ID FROM tmp)
WHERE p.IsDeleted = 0 and [dbo].[GetTestIdJson](c.Json, wtRow.TestId) = wtRow.TestId
    AND p.ProcNumber + ',' + p.RNumber = JSON_VALUE(c.Json,'$.Property.Label') + ',' + JSON_VALUE(c.Json,'$.Property.Label')
GROUP BY wtRow.W_ID, p.ProcNumber, p.Class

For performance related questions we need to see your execution plan. — Dale K, Jun 30 '21 at 03:51
Consider that scalar function dbo.GetTestIdJson() has to be invoked for every row in the joined data set and likely won't appear with real costs in any plan estimates. — AlwaysLearning, Jun 30 '21 at 04:35

score 0 · Accepted Answer · answered Jun 30 '21 at 08:37

… indexed view … fiddle

create table dbo.a (id int primary key, json nvarchar(max));   
insert into dbo.a values(1, '{"Property":{"Label":"0"},"Tests":[{"Test_ID":"GUID_HERE1","Type":{"Label":" "},"Name":{"Label":" "},"Value":null},{"Test_ID":"GUID_HERE2","Type":{"Label":" "},"Name":{"Label":" "},"Value":" "}]}');
insert into dbo.a values(2, '{"Property":{"Label":"0"},"Tests":[{"Test_ID":"GUID_HERE21","Type":{"Label":" "},"Name":{"Label":" "},"Value":null},{"Test_ID":"GUID_HERE22","Type":{"Label":" "},"Name":{"Label":" "},"Value":" "}]}');
GO
--numbers table
create table dbo.n(n int primary key);
insert into dbo.n values(0),(1),(2),(3),(4),(5),(6),(7),(8),(9),(10); --assume max 11 elements in Tests[]
GO
create view dbo.v
with schemabinding
as
select a.id, n.n, json_value(a.json, concat('$.Tests[', n.n,'].Test_ID')) as Test_Id
from dbo.a as a
cross join dbo.n as n
where json_value(a.json, concat('$.Tests[', n.n,'].Test_ID')) is not null;
GO
create unique clustered index vuidx on dbo.v(id,n);
create index idTestId on dbo.v(Test_Id);
GO
select * from dbo.v
GO
set statistics xml on;
select *
from dbo.v with(noexpand)
where Test_Id = 'GUID_HERE2';
GO
drop view if exists dbo.v;
GO
drop table if exists dbo.n;
GO
drop table if exists dbo.a;

This worked. Thank you! Will note that I had to add COUNT_BIG(*) as countN to the select statement or it errored out when trying to add the clustered index — Jonathan Rostami, Jun 30 '21 at 16:28

Create a virtual column and index to optimize a filter for an array in a JSON column

1 Answers1