Some years ago Performance differences between equal (=) and IN with one literal value there was discussion around IN and EQUAL for a single value.
My issue is where there is a (small) list of values for a select. The use case is to load data into an analytics model for a selection of 'instances'.
SELECT *
FROM [v_location_fact]
where instanceID = '136'
This returns 22,425 rows in 1.4 seconds.
The execution plan:https://www.brentozar.com/pastetheplan/?id=r1ouDwny9
The same result and query works for instanceID = '137'
SELECT *
FROM [v_location_fact]
where instanceID = '136'
UNION ALL
SELECT *
FROM [v_location_fact]
where instanceID = '137'
Works as expected and returns the combined set 45167 rows in 2.5 seconds. This uses essentially the same plan as above, but twice.
But
SELECT *
FROM [v_Location_fact]
where instanceID in ('136', '137')
Takes 33 seconds to return the same data.
The execution plan: https://www.brentozar.com/pastetheplan/?id=SJ9OOw3Jc
With inner join:
declare @ids table (selID varchar(20));
insert into @ids(selID) values ('136'),('137');
SELECT *
FROM [v_Location_fact]
INNER JOIN @ids ON instanceID = selID
also takes 30 seconds to return the same data. Execution plan: https://www.brentozar.com/pastetheplan/?id=BJ4uFw2yq
Each table used by the view v_location_fact has instanceID as the first element of an index.
In our use case, there may be up to 10 instances in the list of selections so the UNION syntax doesn't work from PowerQuery as this would require a variable number of UNION clauses.
Any suggestions how the 'straight' line performance of the UNION (2.5 seconds) can be emulated with a IN or JOIN clause with the small number of instances? (In this example taking 12 times longer, but in more real-world examples taking hundreds of times longer where the tables are much larger).