To find all the changes between two databases, I am left joining the tables on the pk and using a date_modified field to choose the latest record. Will using EXCEPT
increase performance since the tables have the same schema. I would like to rewrite it with an EXCEPT
, but I'm not sure if the implementation for EXCEPT
would out perform a JOIN
in every case. Hopefully someone has a more technical explanation for when to use EXCEPT
.

- 484
- 1
- 5
- 17

- 311
- 1
- 2
- 10
-
Do you need to compare all the columns? Or, is there a single id column that can be used for the join? – Gordon Linoff Feb 04 '13 at 18:37
-
Do recall that the database implementation of queries doesn't really map 1:1 to SQL keywords. It's more than possible that semantically equivalent constructs will translate to the same query plan. – millimoose Feb 04 '13 at 18:42
-
@GordonLinoff I need to compare all the columns in order to update any changes. And I am using a primary key for the join. – Joseph Kiskaden Feb 04 '13 at 18:55
2 Answers
There is no way anyone can tell you that EXCEPT
will always or never out-perform an equivalent OUTER JOIN
. The optimizer will choose an appropriate execution plan regardless of how you write your intent.
That said, here is my guideline:
Use EXCEPT
when at least one of the following is true:
- The query is more readable (this will almost always be true).
- Performance is improved.
And BOTH of the following are true:
- The query produces semantically identical results, and you can demonstrate this through sufficient regression testing, including all edge cases.
- Performance is not degraded (again, in all edge cases, as well as environmental changes such as clearing buffer pool, updating statistics, clearing plan cache, and restarting the service).
It is important to note that it can be a challenge to write an equivalent EXCEPT
query as the JOIN
becomes more complex and/or you are relying on duplicates in part of the columns but not others. Writing a NOT EXISTS
equivalent, while slightly less readable than EXCEPT
should be far more trivial to accomplish - and will often lead to a better plan (but note that I would never say ALWAYS
or NEVER
, except in the way I just did).

- 272,866
- 37
- 466
- 490
-
Other answer seems to be telling the OP that `EXCEPT` is *marginally faster*. And it's gone. – Kermit Feb 04 '13 at 18:39
-
@njk yep, and I have linked to a counter-example. Again, if the OP is looking for `ALWAYS/NEVER`, he's not going to find it. – Aaron Bertrand Feb 04 '13 at 18:41
-
3@njk I deleted my answer because it was too universal... in the tests I ran, `except` was marginally faster... but it wasn't accurate to say that it will usually be faster. Aaron's answer is better because it addresses that there's other factors that need to be considered instead of just always trying to use `except`... – Michael Fredrickson Feb 04 '13 at 18:42
-
@AaronBertrand So in general, will an EXCEPT outperform a LEFT JOIN? I know there are several factors involved but I appreciate any input you may have. Thanks. – Joseph Kiskaden Feb 04 '13 at 18:47
-
1@JosephKiskaden you are not going to get an "in general" answer to that question, sorry. Way too many variables. You should test your ***specific*** scenario (you should be checking semantics anyway, so all that means is adding performance testing to those tests). – Aaron Bertrand Feb 04 '13 at 18:48
-
@AaronBertrand . . . That is a great reference. Any idea why the SQL Server compiler does *not* choose the best execution plan when there are multiple choices that return the same answer but have different performance? I, for one, don't like to have to phrase a query in a particular way to get a particular plan. This is even worse when you work with multiple databases. – Gordon Linoff Feb 04 '13 at 19:11
-
Necro comment, but this is a very useful post. A ***factor*** against the `EXCEPT` mechanism is that even if a MERGE can be used, the scalar operations would be processing all columns rather than just the identifier column(s). This would be increasingly pronounced in a column store, as well as being increasingly relevant when the clustered index of the tables do not match. – MatBailie Jan 23 '18 at 16:58
In the following example, the LEFT JOIN
is faster than EXCEPT
by 70%
(PostgreSQL 9.4.3)
Example:
There are three tables. suppliers
, parts
, shipments
.
We need to get all parts not supplied by any supplier in London.
Database(has indexes on all involved columns):
CREATE TABLE suppliers (
id bigint primary key,
city character varying NOT NULL
);
CREATE TABLE parts (
id bigint primary key,
name character varying NOT NULL,
);
CREATE TABLE shipments (
id bigint primary key,
supplier_id bigint NOT NULL,
part_id bigint NOT NULL
);
Records count:
db=# SELECT COUNT(*) FROM suppliers;
count
---------
1281280
(1 row)
db=# SELECT COUNT(*) FROM parts;
count
---------
1280000
(1 row)
db=# SELECT COUNT(*) FROM shipments;
count
---------
1760161
(1 row)
Query using EXCEPT
.
SELECT parts.*
FROM parts
EXCEPT
SELECT parts.*
FROM parts
LEFT JOIN shipments
ON (parts.id = shipments.part_id)
LEFT JOIN suppliers
ON (shipments.supplier_id = suppliers.id)
WHERE suppliers.city = 'London'
;
-- Execution time: 3327.728 ms
Query using LEFT JOIN
with table, returned by subquery.
SELECT parts.*
FROM parts
LEFT JOIN (
SELECT parts.id
FROM parts
LEFT JOIN shipments
ON (parts.id = shipments.part_id)
LEFT JOIN suppliers
ON (shipments.supplier_id = suppliers.id)
WHERE suppliers.city = 'London'
) AS subquery_tbl
ON (parts.id = subquery_tbl.id)
WHERE subquery_tbl.id IS NULL
;
-- Execution time: 1136.393 ms

- 1,719
- 17
- 12