I'd say this is a bug.
To reproduce:
create table foo(id int auto_increment primary key, a int);
insert into foo(a) values(1), (1), (2);
When you do an explain
and a show warnings
(in MySQL <= 5.6 you have to do explain extended
) you can see what MySQL is actually executing after the optimizer did his job:
mysql > explain select * from foo where id not in (select id from foo group by a);
+----+--------------------+-------+------------+-----------------+---------------+---------+---------+------+------+----------+-------------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+--------------------+-------+------------+-----------------+---------------+---------+---------+------+------+----------+-------------+
| 1 | PRIMARY | foo | NULL | ALL | NULL | NULL | NULL | NULL | 3 | 100.00 | Using where |
| 2 | DEPENDENT SUBQUERY | foo | NULL | unique_subquery | PRIMARY | PRIMARY | 4 | func | 1 | 100.00 | NULL |
+----+--------------------+-------+------------+-----------------+---------------+---------+---------+------+------+----------+-------------+
2 rows in set, 1 warning (0.00 sec)
mysql > show warnings;
+-------+------+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| Level | Code | Message |
+-------+------+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| Note | 1003 | /* select#1 */ select `playground`.`foo`.`id` AS `id`,`playground`.`foo`.`a` AS `a` from `playground`.`foo` where (not(<in_optimizer>(`playground`.`foo`.`id`,<exists>(<primary_index_lookup>(<cache>(`playground`.`foo`.`id`) in foo on PRIMARY))))) |
+-------+------+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
As you can see this is a totally different query. No more grouping. To get the result you'd expect, you have to nest the query one more time like this:
select * from foo where id not in (
select id from (
select any_value(id) as id from foo group by a
) sq
);