The documentation (here) attempts to explain that if you start with a correlated subquery
inside the select list
then it IS possible to refer to the table in the from clause
e.g.:
SELECT
salesperson.name,
-- find maximum sale size for this salesperson
(SELECT MAX(amount) AS amount
FROM all_sales
WHERE all_sales.salesperson_id = salesperson.id)
AS amount
FROM
salesperson;
In the query above you can see that the from table is salesperson but in the correlated subquery
you CAN also refer to salesperson. The correlation
is that the subquery is being executed row by row through the resultset, with each iteration of that subquery using the salesperson.id as input to the where clause. (By the way, this "row by row" effect is often why correlated subqueries
can be a cause of slowness in queries.)
However it is not permitted in traditional derived tables
, found in the from clause
, to use a correlation
like the one seen above. So the example below will fail because it tries to use a correlation
to salesperson that isn't allowed:
SELECT
salesperson.name,
max_sale.amount,
max_sale_customer.customer_name
FROM
salesperson,
-- calculate maximum size, cache it in transient derived table max_sale
(SELECT MAX(amount) AS amount
FROM all_sales
WHERE all_sales.salesperson_id = salesperson.id)
AS max_sale
AS max_sale_customer;
As the documentation states this will result in an error:
ERROR 1054 (42S22): Unknown column 'salesperson.id' in 'where clause'
BUT, with the addition of support for LATERAL
it is possible to to replace correlated subqueries
that we once used in the select clause
with lateral subqueries
as these DO support the ability to refer to other tables referenced in the from clause
. Additionally (which can't be done in a select clause
) one lateral subquery
can refer to the result of any prior lateral subquery
which is a really very useful feature, and this is also shown by the example in the documentation:
SELECT
salesperson.name,
max_sale.amount,
max_sale_customer.customer_name
FROM
salesperson,
-- calculate maximum size, cache it in transient derived table max_sale
LATERAL
(SELECT MAX(amount) AS amount
FROM all_sales
WHERE all_sales.salesperson_id = salesperson.id)
AS max_sale,
-- find customer, reusing cached maximum size
LATERAL
(SELECT customer_name
FROM all_sales
WHERE all_sales.salesperson_id = salesperson.id
AND all_sales.amount =
-- the cached maximum size
max_sale.amount) -- refers to prior lateral subquery result
AS max_sale_customer;
So, conventional derived tables cannot use correlations, lateral subqueries can.