3

I know UNION removes duplicates but it changes result order even when there are no duplicates.

I have two select statements, no order by statement anywhere

I want union them with or without (all)

i.e.

SELECT A 
UNION (all) 
SELECT B

"Select B" actually contains nothing, no entry will be returned

if I use "Select A union Select B", the order of the result is different from just "Select A"

if I use:

SELECT A 
UNION ALL 
SELECT B

the order of the result is the same as "Select A" itself and there are no duplicates in "Select A" at all.

Why is this? it is unpredictable.

Prashant Pimpale
  • 10,349
  • 9
  • 44
  • 84
Li Tian Gong
  • 393
  • 1
  • 6
  • 16

2 Answers2

7

The only way to get a particular order of results from an SQL query is to use an ORDER BY clause. Anything else is just relying on coincidence and the particular (transitory) state of the server at the time you issue your query.

So if you want/need a particular order, use an ORDER BY.


As to why it changes the ordering of results - first, UNION (without ALL) guarantees to remove all duplicates from the result - not just duplicates arising from the different queries - so if the first query returns duplicate rows and the second query returns no rows, UNION still has to eliminate them.

One common, easy way to determine whether you have duplicates in a bag of results is to sort those results (in whatever sort order is most convenient to the system) - in this way, duplicates end up next to each other and so you can then just iterate over these sorted results and if(results[index] == results[index-1]) skip;.

So, you'll commonly find that the results of a UNION (without ALL) query have been sorted - in some arbitrary order. But, to re-emphasise the original point, what ordering was applied is not defined, and certainly shouldn't be relied upon - any patches to the software, changes in indexes or statistics may result in the system choosing a different sort order the next time the query is executed - unless there's an ORDER BY clause.

Damien_The_Unbeliever
  • 234,701
  • 27
  • 340
  • 448
3

One of the most important points to understand about SQL is that a table has no guaranteed order, because a table is supposed to represent a set (or multiset if it has duplicates), and a set has no order. This means that when you query a table without specifying an ORDER BY clause, the query returns a table result, and SQL Server is free to return the rows in the output in any order. If the results happen to be ordered, it may be due to optimization reasons. The point I'm trying to make is that any order of the rows in the output is considered valid, and no specific order is guaranteed. The only way for you to guarantee that the rows in the result are sorted is to explicitly specify an ORDER BY clause.

Bill
  • 11,595
  • 6
  • 44
  • 52