I'm currently using MySQL 5.6.10.
My actual query is more complicated, but here is a simple way to reproduce the problem. I know the query below is useless (select id from x where id in (select id from x...)), but it proves my point.
I created this table:
CREATE TABLE test (
id INT NOT NULL AUTO_INCREMENT ,
PRIMARY KEY (id));
Then ran this command 5 times - it created 50 rows in the table:
INSERT INTO test (id) VALUES(null),(null),(null),(null),(null),(null),(null),(null),(null),(null);
And then ran this explain:
EXPLAIN SELECT id FROM test WHERE
id in (SELECT id FROM test WHERE id < 5);
And got this:
Which makes perfect sense to me. But then if I add an OR to the WHERE clause with another IN, like this:
EXPLAIN SELECT id FROM test WHERE
id IN (SELECT id FROM test WHERE id < 5)
OR id IN (SELECT id FROM test WHERE id > 45);
suddenly MySQL is looking at all 50 rows:
I know that the query could be re-written as SELECT id FROM test WHERE id < 5 OR id > 45
, or a UNION etc, again, that's not the point. The point is MySQL is examining far too many rows.
If I run a FLUSH STATUS / SHOW STATUS LIKE "Handler%" on the first query, this is what I get:
Handler_read_key 5
Handler_external_lock 4
Handler_read_next 4
Handler_read_first 1
But if I do that to the second query, I get:
Handler_read_key 99
Handler_write 9
Handler_external_lock 6
Handler_read_next 59
Handler_read_first 2
Why the big difference? I wonder if it is the optimizer, and if so, is there some option I can include in the query that will prevent this "optimization"? This has real practical implications for a query I'm developing. Instead of examining only a few hundred rows, MySQL is examining 120,000.