
I'm experimenting with PostgreSQL (v9.3). I have a quite large database, and I often need to execute queries with 8-10 joined tables (as the source of large data grids). I'm using DevExpress XPO as the ORM on top of PostgreSQL, so unfortunately I don't have any control over how the joins are generated.

The following example is a fairly simplified one; the real scenario is more complex, but as far as I can tell from my examination, the main problem can be seen in it too.

Consider the following variants of the (semantically) same query:

SELECT o.*, c.*, od.* 
FROM orders o 
LEFT JOIN orderdetails od ON o.details = od.oid
LEFT JOIN customers c ON o.customer = c.oid
WHERE c.code = 32435 AND o.date > '2012-01-01';

SELECT o.*, c.*, od.* 
FROM orders o 
LEFT JOIN customers c ON o.customer = c.oid
LEFT JOIN orderdetails od ON o.details = od.oid
WHERE c.code = 32435 AND o.date > '2012-01-01';

The orders table contains about 1 million rows, and the customers table about 30 thousand. The orderdetails table contains the same number of rows as orders, due to a one-to-one relation.

UPDATE: It seems the example is too simplified to reproduce the issue, because I checked again and in this case the two execution plans are identical. However, in my real query, where there are many more joins, the problem occurs: if I put customers as the first join, the execution is 100x faster. I would add my real query, but the Hungarian identifiers and the fact that it was generated by XPO and Npgsql make it hard to read.

The first query is significantly slower (about 100x) than the second, and when I output the plans with EXPLAIN ANALYZE, I can see that the join order reflects the tables' positions in the query string. So first the two "giant" tables are joined together, and only afterwards is the filtered customers table joined (where the filter selects only one row).
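For reference, this is the kind of diagnostic run being compared; EXPLAIN (ANALYZE, BUFFERS) executes the query and reports the actual join order, row counts, and timings per plan node (the table and column names follow the simplified example above):

```sql
EXPLAIN (ANALYZE, BUFFERS)
SELECT o.*, c.*, od.*
FROM orders o
LEFT JOIN orderdetails od ON o.details = od.oid
LEFT JOIN customers c ON o.customer = c.oid
WHERE c.code = 32435 AND o.date > '2012-01-01';
```

Comparing the two variants' output side by side shows which node the planner starts from and where the bulk of the time is spent.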

The second query is faster because the join starts from that single customer row, and after that it only has to join the 20-30 matching order detail rows.

Unfortunately, in my case XPO generates the first version, so I'm suffering from poor performance.

Why does the PostgreSQL query planner not notice that the join on customers has a condition in the WHERE clause? In my opinion, the correct optimization would be to perform the joins that have some kind of filter first, and only then the joins that merely participate in the selection.

Any kind of help or advice is appreciated.

Zoltán Tamási
  • Why don't you start with the `customers` table instead? Btw having such conditions in the WHERE clause on a `LEFT` joined table is in reality making it an `INNER JOIN` (since NULLs are not returned). Is there any chance to see an output from `EXPLAIN ANALYZE`? Do you have indexes on `c.code`, `o.customer`, `o.details` and `o.date`? – Kouber Saparev Nov 27 '14 at 16:42
  • Actually, the number of joins also matter http://www.postgresql.org/docs/current/static/explicit-joins.html -- see `geqo_threshold` & `join_collapse_limit` options http://www.postgresql.org/docs/current/static/runtime-config-query.html -- BTW if your ORM supports some kind of native querying, this would be a perfect place to use them. – pozs Nov 27 '14 at 16:46
  • @pozs Thank you, yes it supports direct queries but I wanted to avoid it if possible. I'll have a look at the links you gave. – Zoltán Tamási Nov 27 '14 at 16:49
  • @KouberSaparev Thanks, the query is generated by XPO so I don't have a chance to reformat it, except for completely doing it myself. I'll try to post the outputs when I have time to prettify them a bit. For now I'm gonna check the tips from the comment of pozs. – Zoltán Tamási Nov 27 '14 at 16:50
  • 3
    `However in my real query where there are much more joins, the problem occures:` that is a problem that cannot be solved by looking at your trimmed-down snippet in the question. – wildplasser Nov 27 '14 at 19:17
  • 2
    An order must always have some details (at least one) and also must have a customer, doesn't it? So why do you use LEFT joins instead of ordinary JOINs ? – krokodilko Nov 27 '14 at 21:54
  • Thanks for the comments, I know that my example is not good. About the "why don't I use" type of comments: as I wrote several times, the queries are generated by the ORM tool (DevExpress XPO), so as a downside of using an ORM, I don't have that much control. The problem is the number of joins combined with the join_collapse_limit setting being too low. When I raised it to 15, my queries were fast. However, I feel that this is rather a hack than a solution. – Zoltán Tamási Nov 28 '14 at 09:50
  • Yes, that is a possibility, but the planning time grows exponentially with the number of tables in the range table. (That is the reason the collapse limits exist.) Another solution is to move "coherent" parts of your query into CTEs (if your ORM allows that). – joop Nov 28 '14 at 10:13
  • @joop I know that, but in this case it's worth it, because the difference is so significant, and this query is at the heart of the application. – Zoltán Tamási Nov 28 '14 at 11:02
  • Changing join_collapse_limit is not a hack, it's needed for your specific workload. Using LEFT JOINs when you need INNER JOINs is something that should be hacked; this looks like a bug in your ORM. – Frank Heikens Nov 28 '14 at 18:52
  • Unfortunately XPO is quite limited to say the least, but this is a large project which is already completely stuck to it, and would be hard to migrate to EF or NHibernate for example. – Zoltán Tamási Nov 28 '14 at 19:43
  • Guys, thank you all for the tips. Increasing the join_collapse_limit did the trick for now. I would like to close this, so @pozs (the first who mentioned it) please drop a short answer based on your comment so I can accept it. – Zoltán Tamási Dec 03 '14 at 14:37

1 Answer


Join order only matters if your query's joins are not collapsed. Collapsing is done internally by the query planner, but you can influence the process with the join_collapse_limit runtime option.
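As a minimal sketch of adjusting the option (the database name here is purely illustrative):

```sql
-- Session-level: affects only the current connection.
SET join_collapse_limit = 15;
SHOW join_collapse_limit;

-- Or persist the setting for a single database,
-- so every new connection to it picks it up:
ALTER DATABASE mydb SET join_collapse_limit = 15;
```

The session-level SET is the safer way to experiment before committing the change database-wide.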

Note, however, that the query planner will not always find the best join order by default:

Constraining the planner's search in this way is a useful technique both for reducing planning time and for directing the planner to a good query plan. If the planner chooses a bad join order by default, you can force it to choose a better order via JOIN syntax — assuming that you know of a better order, that is. Experimentation is recommended.
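A hedged sketch of the technique the documentation describes: with join_collapse_limit set to 1, explicit JOIN syntax fixes the join order, so the selective customers join can be written first (table and column names follow the simplified example from the question):

```sql
-- SET LOCAL confines the change to this transaction.
BEGIN;
SET LOCAL join_collapse_limit = 1;

SELECT o.*, c.*, od.*
FROM orders o
JOIN customers c ON o.customer = c.oid      -- selective join first
JOIN orderdetails od ON o.details = od.oid  -- bulk join afterwards
WHERE c.code = 32435 AND o.date > '2012-01-01';

COMMIT;
```

This only helps, of course, if you can control the generated SQL; with an ORM like XPO it would require a native-query escape hatch.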

For the best performance, I recommend using some kind of native querying, if available. Raising join_collapse_limit can be a good-enough solution though, provided you verify that it doesn't cause other problems.

It is also worth mentioning that raising join_collapse_limit will most likely increase the planning time.

pozs