3

I have these two queries:

SELECT SQL_NO_CACHE DISTINCT(type) FROM actions LIMIT 0, 30;

and

SELECT SQL_NO_CACHE type FROM actions GROUP BY type LIMIT 0, 30;

If I don't use the LIMIT clause, the execution times are equal. On the other hand, in my case, the first query takes almost 0.8 seconds, while the second takes 0.12 seconds.

Using EXPLAIN, it seems that the only difference is that first query uses a temporary table, while the second does not.

At this point I'm quite astonished by the different behavior of the two queries... can you provide some enlightenment on the problem?

I am currently using MYSQL 5.5.37-35.1 Percona Server (GPL), Release 35.1, Revision 666

marcosh
  • 8,780
  • 5
  • 44
  • 74

1 Answers1

2

It seems that the LIMIT optimization is only properly applied with GROUP BY when there is an ORDER BY clause. As Gordon Linoff suggested in an earlier (deleted) answer, the GROUP BY query has an implicit ORDER BY. Therefore, the GROUP BY query uses the LIMIT optimization.

Even though, the DISTINCT query basically uses GROUP BY to solve it, the implicit ORDER BY is not there. Adding an explicit ORDER BY clause to the DISTINCT query yields the same execution plan and the same performance as the GROUP BY query:

SELECT SQL_NO_CACHE DISTINCT(type) FROM actions ORDER BY type LIMIT 0, 30;

and

SELECT SQL_NO_CACHE type FROM actions GROUP BY type LIMIT 0, 30;
Marcus Adams
  • 53,009
  • 9
  • 91
  • 143
  • I absolutely agree. (And I wonder why the answer was deleted :).) But I was wondering why DISTINCT doesn't also implicitly use an ORDER BY when there's an index that would make it fast. And moreover, I tried similar queries to the ones in the question and the end result is in the same order - do you think this is to be expected for some reason? Thanks – nicolagi Jul 02 '14 at 14:44
  • @idrarig, I guess it's the implicit or explicit `ORDER BY` that triggers the `LIMIT` optimization, regardless of what else is going on. Room for improvement to be sure. – Marcus Adams Jul 02 '14 at 15:00