I have data in table Foo in columns x1, x2, and x3, and more data in table Bar in column x4. Both tables are primary-keyed by a unique id, and each Bar has a foreign key to a single Foo, as in the schema below:
CREATE TABLE Foo (id INT, x1 INT, x2 INT, x3 INT, ...)
CREATE TABLE Bar (id INT, fooId INT, x4 INT, ...)
CREATE TABLE Qux (x INT, ...)
This is the right schema, and it's properly normalized for its use case.
I need the set of all distinct x values (x1, x2, and x3 from Foo, plus x4 from Bar) where the Foo records match some WHERE clause. I then need to use those values to look up the matching records in another table, Qux.
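To make the requirement concrete, here is a small made-up sample. The rows, the table variables, and the f.id IN (1, 2) filter are all hypothetical stand-ins for illustration, not my real data or my real WHERE clause:

```sql
-- Hypothetical sample rows; the real tables have more columns and far more data.
DECLARE @Foo TABLE (id INT PRIMARY KEY, x1 INT, x2 INT, x3 INT);
DECLARE @Bar TABLE (id INT PRIMARY KEY, fooId INT, x4 INT);

INSERT INTO @Foo (id, x1, x2, x3) VALUES
    (1, 10, 20, 10),
    (2, 20, 30, 40),
    (3, 99, 98, 97);  -- excluded by the stand-in filter below

INSERT INTO @Bar (id, fooId, x4) VALUES
    (1, 1, 50),
    (2, 1, 10),
    (3, 3, 96);       -- belongs to the excluded Foo row

-- Stand-in filter: f.id IN (1, 2) plays the role of the real WHERE clause.
-- Desired distinct set of x values: 10, 20, 30, 40, 50
```

With these rows, only those five values should be used for the lookup into Qux; the values from Foo 3 and its Bar row should never appear.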
I solved it with UNION ALL, as in the example below:
WITH CTE_Ids AS (
    SELECT x1 AS x FROM Foo WHERE ...
    UNION ALL SELECT x2 AS x FROM Foo WHERE ...
    UNION ALL SELECT x3 AS x FROM Foo WHERE ...
    UNION ALL SELECT x4 AS x FROM Foo f LEFT OUTER JOIN Bar b ON f.id = b.fooId WHERE ...
),
CTE_UniqueIds AS (
    SELECT DISTINCT x FROM CTE_Ids
)
SELECT q.*
FROM CTE_UniqueIds ids
INNER JOIN Qux q ON ids.x = q.x
This produces the right result set, and I don't mind repeating the WHERE clause in the code. Unfortunately, it's very inefficient: it scans the Foo and Bar tables many times, because SQL Server doesn't realize it could scan the data exactly once. The query plan is bad enough that we're experiencing significant slowdowns in our production software.
So how can I get the unique set of x values unioned across columns without SQL Server scanning each table multiple times?