3

I am running into some performance issue and I have a query as follow:

SELECT * FROM Foo

UNION

SELECT * FROM Boo

UNION

SELECT * FROM Koo

I know for sure that Koo is not returning any duplicates. I was considering to use UNION ALL at the end and therefore save the sorting and distinct selection time for Koo. The query should look like:

SELECT * FROM Foo

UNION

SELECT * FROM Boo

UNION ALL

SELECT * FROM Koo

Will that help or it will be affected bu the first UNION?

Lukas Eder
  • 211,314
  • 129
  • 689
  • 1,509
Moslem Ben Dhaou
  • 6,897
  • 8
  • 62
  • 93
  • 1
    Test both approaches - look at the query plans and see if there is _any_ difference. – Oded Dec 11 '12 at 10:47
  • @Oded yes, but only if your test data and your server configuration are represenstive. Otherwise you may make false assumptions based on (non)optimaztions associated with your test setup. i.e. examining query plans on your 10 record, single core, low RAM setup is not much use if you intend to have 10,000,000 record, 32 cores and lots of RAM. – Jodrell Dec 11 '12 at 10:58
  • 1
    Note that it is not just that Koo should have no duplicates, it is that Koo should not contain any rows that are in Foo or in Boo. – Lord Peter Dec 11 '12 at 13:05

1 Answers1

3

Always use UNION ALL if you know that there will not be duplicates.

It's a bit gray here, but still worthwhile - although practically marginal.

If it's a straight UNION-UNION, SQL Server can optimize this to collate all 3 result sets and perform a single sort across both. Since sorting is mostly O(n log n), it amounts to very little difference between that and [(A distinct B) add C].

update

Although it is possible to perform a single merge-sort, SQL Server doesn't seem to do it (at least not always) - so the wisdom of using UNION ALL is definitely worth it here. Compare the plans here: SQLFiddle (click on the "View Execution Plan" links)

RichardTheKiwi
  • 105,798
  • 26
  • 196
  • 262
  • If `Koo` is a big table, I'd say that there is a lot of difference in using `UNION ALL` to concatenate `Koo`... So I'm not sure if it's really marginal – Lukas Eder Dec 11 '12 at 11:02
  • If you know somthing useful about the data, it makes sense to tell the query engine about it. It will probably help when producing the query plan. – Jodrell Dec 11 '12 at 11:04
  • @Lukas If SQL Server optimized the two sorts into a single operation, then it becomes concat-concat-sort vs concat-sort-concat, where the O(n log n) nature of the sorts makes the difference less pronounced. Since SQL Server doesn't [always?] optimize it, then it's definitely much faster than concat-sort-concat-sort. – RichardTheKiwi Dec 11 '12 at 11:13