I have two tables. order_details
which is 100,000 rows, and outbound
which is 10,000 rows.
I need to join them on a column called order_number
, which is a VARCHAR(50) on both. order_number is not unique in the outbound table.
CREATE TABLE `outbound` (
`outbound_id` int(12) NOT NULL,
`order_number` varchar(50) NOT NULL
) ENGINE=InnoDB DEFAULT CHARSET=utf8;
CREATE TABLE `order_details` (
`order_details_id` int(12) NOT NULL,
`order_number` varchar(50) NOT NULL
) ENGINE=InnoDB DEFAULT CHARSET=utf8;
This is my initial query, and it takes well over 60 seconds to run:
SELECT o.order_number
FROM outbound o
INNER JOIN order_details od
ON o.order_number = od.order_number
This query gets the same results and takes less than a second to run:
SELECT o.order_number
FROM outbound o
INNER JOIN
(
SELECT order_number
FROM order_details
) od
ON (o.order_number = od.order_number)
This is surprising to me because usually sub-queries are significantly slower.
Running EXPLAIN
(which I'm still learning how to understand) shows that the sub query version uses a derived2
table, that it is using an index, and that index is auto_key0
. I'm not savvy enough to know how to interpret this to understand why this makes a significant difference.
I am running these queries over command line.
I am running MySQL Ver 14.14 Distrib 5.6.35, for Linux (x86_64) CentOS.
In summary:
Why is this simple join query significantly quicker with a sub-query?