This may be a JPQL question or a SQL question, but since it came from a reading of a Criteria example in the JPA Spec. I'm tagging it JPA and Criteria API too. I think the problem is that I don't understand how SQL correlated subqueries work.
Let me begin with JPA Spec. section 6.5.12, Subqueries, Example 4: A Special Case, which says,
Note that joins involving the derived subquery root do not affect the join conditions of the containing query. The following two query definitions thus differ in semantics:
CriteriaQuery<Order> q = cb.createQuery(Order.class);
Root<Order> order = q.from(Order.class);
Subquery<Integer> sq = q.subquery(Integer.class);
Root<Order> orderSub = sq.correlate(order);
Join<Order,Customer> customer = orderSub.join(Order_.customer);
Join<Customer,Account> account = customer.join(Customer_.accounts);
sq.select(account.get(Account_.balance));
q.where(cb.lt(cb.literal(10000), cb.all(sq)));
and
CriteriaQuery<Order> q = cb.createQuery(Order.class);
Root<Order> order = q.from(Order.class);
Join<Order,Customer> customer = order.join(Order_.customer);
Subquery<Integer> sq = q.subquery(Integer.class);
Join<Order,Customer> customerSub = sq.correlate(customer);
Join<Customer,Account> account = customerSub.join(Customer_.accounts);
sq.select(account.get(Account_.balance));
q.where(cb.lt(cb.literal(10000), cb.all(sq)));
The first of these queries will return orders that are not associated with customers, whereas the second will not. The corresponding Java Persistence query language queries are the following:
SELECT o
FROM Order o
WHERE 10000 < ALL (
SELECT a.balance
FROM o.customer c JOIN c.accounts a)
and
SELECT o
FROM Order o JOIN o.customer c
WHERE 10000 < ALL (
SELECT a.balance
FROM c.accounts a)
The second query seems straightforward but the first is difficult to visualize.
My understanding of a correlated subquery is that it depends on the outer query; it uses the data of the outer query. It is not independent of the outer query. It executes once for each selected row from the outer query.
In both queries Orders are being selected in the outer query. In the first query, for each selected Order from the outer query the following subquery is run:
SELECT a.balance FROM o.customer c JOIN c.accounts a
Intuition tells me that the customer associated with the Order record is joined with account. Hence the subquery will return the balances of all accounts associated with the particular customer who is associated with the current Order record.
Am I right or wrong? It seems I'm wrong, according to the above example, which, again, says,
"Note that joins involving the derived subquery root do not affect the join conditions of the containing query,"
and, "The first of these queries will return Orders that are not associated with customers..."
So the subquery does not return account balances of the customer associated with the Order? What account balances are instead returned? What's the right way to see this? My head is spinning.