1

Why is the "union" query below so much slower than the "subquery" one? Why isn't it using the unique key?

CREATE TABLE slow_union (
  primary_key TEXT PRIMARY KEY,
  unique_key TEXT UNIQUE
);
INSERT INTO slow_union (primary_key, unique_key)
SELECT generate_series::text, generate_series::TEXT FROM generate_series(1,999999);

-- slow, unions don't use the other key
SELECT * FROM slow_union s INNER JOIN
(
 SELECT 'first', primary_key, unique_key FROM slow_union
 UNION
 SELECT 'second', primary_key, unique_key FROM slow_union
) AS unions
USING (primary_key)
WHERE s.unique_key = '1';

-- fast, subqueries do use the other key
SELECT * FROM 
(SELECT 'first', primary_key, unique_key FROM slow_union) AS s1
INNER JOIN
(SELECT 'second', primary_key, unique_key FROM slow_union) AS s2
USING (primary_key)
WHERE s1.unique_key = '1';

Note that you can swap the references to primary_key and unique_key and the performance remains the same (USING unique_key, WHERE primary_key).

If you change the USING clause to USING (primary_key, unique_key) it does use the unique key. This matters for view design where the USING statement is inside the view and the WHERE is outside - is it better practice to always USING all the key fields when joining on the results of a UNION?

  • 1
    You should at least use `UNION ALL`. But I don't think a condition on the primary key will ever be pushed into a union –  Feb 14 '22 at 19:38
  • @a_horse_with_no_name `UNION ALL` worked. If you move this to an answer I'll marked it answered. – MichaelMitchell Feb 14 '22 at 20:44

0 Answers0