Which is better: Distinct or Group By

Question

Which is more efficient?

SELECT  theField
FROM    theTable
GROUP BY theField

or

SELECT  DISTINCT theField
FROM    theTable

score 36 · Accepted Answer · answered Dec 17 '08 at 16:35

36

In your example, both queries will generate the same execution plan so their performance will be the same.

However, they both have their own purpose. To make your code easier to understand, you should use distinct to eliminate duplicate rows and group by to apply aggregate operators (sum, count, max, ...).

answered Dec 17 '08 at 16:35

dub

1,396
12
22

7

I would add that adding distinct willy nilly to remove duplicate rows can be bad idea especially if it's use disguises a cross join or missing items in a where clause. Distinct would be my last choice to remove rows. – HLGEM Dec 17 '08 at 16:55
@HLGEM, when would you use Distinct? – wcm Dec 17 '08 at 17:32
@wcm, Imagine a Order table that has a Product field which contains the type of product ordered. You want to know what are all the different products that have been ordered. A simple select will return duplicates if the same product has been ordered multiple times. Use select distinct instead. – dub Dec 17 '08 at 17:42
@dub, I was asking @HGLEM under what conditions he would use DISTINCT since he is saying that it would be his last choice to remove duplicate rows when it is the only time I even consider it. – wcm Dec 17 '08 at 22:44
1

@wcm I understand HLGem's point. Consider a query to return all products purchased by a specific set of customers. INNER JOIN products to sales to customers gives duplicate rows, which could be eliminated by DISTINCT. Using WHERE EXISTS to prevent row duplication should reduce server workload. – Meff Dec 21 '08 at 12:42

score 11 · Answer 2 · answered Dec 17 '08 at 15:54

Doesn't matter, it results in the same execution plan. (at least for these queries). These kind of questions are easy to solve, by enabling query analyzer or SSMS to show the execution plan and perhaps the server trace statistics after running the query.

score 6 · Answer 3 · answered Dec 17 '08 at 15:54

6

In most cases, DISTINCT and GROUP BY generate the same plans, and their performance is usually identical

answered Dec 17 '08 at 15:54

Greg Dean

29,221
14
67
78

score 5 · Answer 4 · answered Dec 17 '08 at 15:54

5

You can check the Execution Plan to look for the total cost of this statements. The answer may vary in different scenarios.

answered Dec 17 '08 at 15:54

Ken Yao

1,506
1
13
24

score 2 · Answer 5 · answered Dec 17 '08 at 16:00

2

Hmmm...so far as I can see in the Execution Plan for running similar queries, they are identical.

answered Dec 17 '08 at 16:00

Dana

32,083
17
62
73

score 0 · Answer 6 · answered Feb 14 '17 at 16:01

0

In MySQL, DISTINCT seems a bit faster than GROUP BY if theField is not indexed. DISTINCT only eliminate duplicate rows but GROUP BY seems to sort them in addition.

answered Feb 14 '17 at 16:01

auntyellow

2,423
2
20
47

Which is better: Distinct or Group By

6 Answers6

Linked