2
EXPLAIN EXTENDED SELECT id, name
FROM member
INNER JOIN group_assoc ON ( member.id = group_assoc.member_id
AND group_assoc.group_id =2 ) 
ORDER BY registered DESC 
LIMIT 0 , 1

Outputs:

id  select_type table   type    possible_keys   key key_len ref rows    filtered    Extra
1   SIMPLE  group_assoc ref member_id,group_id  group_id    4   const   3   100.00  Using temporary; Using filesort
1   SIMPLE  member  eq_ref  PRIMARY PRIMARY 4   source_member.group_assoc.member_id 1   100.00   

explain extended SELECT
                        id, name
                    FROM member WHERE
                        id
                    NOT IN (
                        SELECT
                            member_id
                        FROM group_assoc WHERE group_id = 2 
                    )               
                    ORDER BY registered DESC LIMIT 0,1

Outputs:

id  select_type table   type    possible_keys   key key_len ref rows    filtered    Extra
1   PRIMARY member  ALL NULL    NULL    NULL    NULL    2635    100.00  Using where; Using filesort
2   DEPENDENT SUBQUERY  group_assoc index_subquery  member_id,group_id  member_id   8   func,const  1   100.00  Using index; Using where

The first query I'm not so sure about, it uses a temporary table which seems like a worse idea. But I also see that it uses fewer rows than the 2nd query....

Ben
  • 60,438
  • 111
  • 314
  • 488

2 Answers2

1

These queries return completely different resultsets: the first one returns members of group 2, the second one returns everybody who is not a member of group 2.

If you meant this:

SELECT  id, name
FROM    member
LEFT JOIN
        group_assoc
ON      member.id = group_assoc.member_id
        AND group_assoc.group_id = 2
WHERE   group_assoc.member_id IS NULL
ORDER BY
        registered DESC 
LIMIT 0, 1

, then the plans should be identical.

You may find this article interesting:

Create an index on member.registered to get rid of both filesort and temporary.

Quassnoi
  • 413,100
  • 91
  • 616
  • 614
0

I would say the first is better. The temporary table might not be a good idea, but a subquery isn't much better. And you will give MySQL more options to optimize the query plan with an inner join than you have with a subquery.

The subquery solution is fast as long as there are just a few rows that will be returned.

But... the first and second query don't seem to be the same, should it be that way?

Wolph
  • 78,177
  • 11
  • 137
  • 148
  • Why do you think subquery is no better than a filesort? – Quassnoi Nov 10 '10 at 01:11
  • @Quassnoi: because, from my experience, MySQL is _really_ bad at performing subqueries, the performance drops enormously. Although this has improved a lot since they were first introduced in `4.1`, it's still not up to par to most other databases. – Wolph Nov 10 '10 at 01:14
  • @WoLpH: ah. Probably it has something to do with improper indexing, doesn't it? – Quassnoi Nov 10 '10 at 01:16
  • @Quassnoi: no, with identical indexes most complex (opposed to a simple `not in (select id from table)`) queries get no optimisations at all. In a lot of cases MySQL will simply execute the subquery, store the results and query on that in the parent table. With a few thousand rows that is still a feasible solution, with a couple of millions it will bring your database to a grinding halt. – Wolph Nov 10 '10 at 01:20
  • @WoLpH: `MySQL` never stores results of a subquery. It always pushes the correlated condition into the subquery, executes it (using the same optimizations as it would use on a plain query) and returns on the first match. Are you sure you are not confusing subqueries (`IN (SELECT …)`, `EXISTS (SELECT …)` and `= (SELECT …)`) with inline views (`SELECT FROM (SELECT …)`)? – Quassnoi Nov 10 '10 at 01:29
  • @Quassnoi: what you call an `inline view` is still a subquery. But yes, I was talking about those :) – Wolph Nov 10 '10 at 09:06