
I have a complex query:

SELECT DISTINCT ON (delivery.id)
delivery.id, dl_processing.pid
FROM mailer.mailer_message_recipient_rel AS delivery
    JOIN mailer.mailer_message AS message ON delivery.message_id = message.id
    JOIN mailer.mailer_message_recipient_rel_log AS dl_processing ON dl_processing.rel_id = delivery.id AND dl_processing.status = 1000
    -- LEFT JOIN mailer.mailer_recipient AS r ON delivery.email = r.email
    JOIN mailer.mailer_mailing AS mailing ON message.mailing_id = mailing.id
WHERE
    NOT EXISTS (SELECT dl_finished.id FROM mailer.mailer_message_recipient_rel_log AS dl_finished WHERE dl_finished.rel_id = delivery.id AND dl_finished.status <> 1000) AND
    dl_processing.date <= NOW() - (36000 * INTERVAL '1 second') AND
    NOT EXISTS (SELECT ml.id FROM mailer.mailer_message_log AS ml WHERE ml.message_id = message.id) AND
    -- (r.times_bounced < 5 OR r.times_bounced IS NULL) AND
    NOT EXISTS (SELECT ur.id FROM mailer.mailer_unsubscribed_recipient AS ur WHERE ur.email = delivery.email AND ur.list_id = mailing.list_id)
ORDER BY delivery.id, dl_processing.id DESC
LIMIT 1000;

It is running very slowly, and the reason seems to be that Postgres consistently avoids merge joins in its query plan even though I have all the indices I would need for them. It looks really depressing:

http://explain.depesz.com/s/tVY

Query plan (screenshot): https://i.stack.imgur.com/Myw4R.png

Why would this happen? How do I troubleshoot such an issue?
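For anyone troubleshooting the same thing: one way to tell whether the planner *can't* use a merge join or merely considers it more expensive is to penalize the other join strategies for the current session and compare the plans. These are standard Postgres planner settings, nothing schema-specific; the query shown is a cut-down two-table version of mine, for brevity:

SET enable_nestloop = off;  -- adds a huge cost penalty to nested loops (session-local)
SET enable_hashjoin = off;  -- same for hash joins, leaving merge join as the cheap option

EXPLAIN (ANALYZE, BUFFERS)  -- in practice you'd paste the full query here
SELECT delivery.id, dl_processing.pid
FROM mailer.mailer_message_recipient_rel AS delivery
    JOIN mailer.mailer_message_recipient_rel_log AS dl_processing
        ON dl_processing.rel_id = delivery.id AND dl_processing.status = 1000;

RESET enable_nestloop;
RESET enable_hashjoin;

If a merge join now shows up with a much higher estimated cost, the planner did consider it and rejected it on cost; if it still doesn't show up, some index or sort order it needs is missing.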


UPD: with @wildplasser's help I have reworked the query to fix performance (while changing its semantics somewhat):

SELECT delivery.id, dl_processing.pid
FROM mailer.mailer_message_recipient_rel AS delivery
    JOIN mailer.mailer_message AS message ON delivery.message_id = message.id
    JOIN mailer.mailer_message_recipient_rel_log AS dl_processing ON dl_processing.rel_id = delivery.id AND dl_processing.status in (1000, 2, 5) AND dl_processing.date <= NOW() - (36000 * INTERVAL '1 second')
    LEFT JOIN mailer.mailer_recipient AS r ON delivery.email = r.email
WHERE
    (r.times_bounced < 5 OR r.times_bounced IS NULL) AND
    NOT EXISTS (SELECT dl_other.id FROM mailer.mailer_message_recipient_rel_log AS dl_other WHERE dl_other.rel_id = delivery.id AND dl_other.id > dl_processing.id) AND
    NOT EXISTS (SELECT ml.id FROM mailer.mailer_message_log AS ml WHERE ml.message_id = message.id) AND
    NOT EXISTS (SELECT ur.id FROM mailer.mailer_unsubscribed_recipient AS ur JOIN mailer.mailer_mailing AS mailing ON message.mailing_id = mailing.id WHERE ur.email = delivery.email AND ur.list_id = mailing.list_id)
ORDER BY delivery.id
LIMIT 1000;
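For what it's worth, the `NOT EXISTS (... dl_other.id > dl_processing.id)` anti-join wants a composite B-tree on the log table, so each `rel_id` probe can check for a later row in a single index descent. Something like the following sketch (the index name is illustrative; I'm not posting my actual DDL here):

CREATE INDEX rel_log_rel_id_id_idx
    ON mailer.mailer_message_recipient_rel_log (rel_id, id);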

It now runs well, but the query plan still sports these horrible nested loop joins <_<:

http://explain.depesz.com/s/MTo3

I would still like to know why that is.
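In case someone wants to double-check the index situation on their own schema: a merge join needs both inputs sorted on the join key, so each join column has to be the leading column of some B-tree index. The standard `pg_indexes` catalog view makes that easy to verify; the two table names below are just the ones from this question:

SELECT tablename, indexname, indexdef
FROM pg_indexes
WHERE schemaname = 'mailer'
  AND tablename IN ('mailer_message_recipient_rel', 'mailer_message_recipient_rel_log');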

Alexei Averchenko
  • Could you also post the text version of the execution plan? (ideally as `explain analyze`) e.g. uploaded to http://explain.depesz.com –  Jun 04 '14 at 09:47
  • `SELECT DISTINCT ON (delivery.id) delivery.id, dl_processing.pid` There are only **two** RTEs referenced by the outer query, but it consists of a join of **four** tables. Why? – wildplasser Jun 04 '14 at 09:56
  • @wildplasser I don't know, it might have something to do with the `WHERE` clause... – Alexei Averchenko Jun 04 '14 at 09:58
  • I presume `delivery.id` is a PK? Then the distinct makes no sense, and the joined tables can (mostly) be replaced by `EXISTS` constructs. – wildplasser Jun 04 '14 at 10:04
  • @wildplasser I've cleaned it up a little, but the cost actually increased, still no merge (anti)-joins: http://pastebin.com/HGMpSN0G Distinct *does* make sense, because there can be many rows satisfying the conditions on `dl_processing`. – Alexei Averchenko Jun 04 '14 at 10:13
  • `because there can be many rows satisfying the conditions on dl_processing` ... of which you only want one. Which one? The most recent one? The oldest one? – wildplasser Jun 04 '14 at 10:17
  • @wildplasser The most recent one, yes. Ideally I should also make sure it's not overlapped by any rows from what used to be dl_finished, but right now I don't care, I just want the query to run in linearithmic time. – Alexei Averchenko Jun 04 '14 at 10:18
  • @wildplasser I used your ideas to improve the query, but it still refuses to do merge joins! – Alexei Averchenko Jun 04 '14 at 10:34
  • On SQL Server I would suspect that the indexes that you have are not suitable for merge joining (although you think they are). I would look at what columns are being accessed from the joined tables. Maybe there is a column access that cannot be satisfied from the indexes that you intend to use for merge joins. – usr Jun 04 '14 at 10:48
  • **WHY** do you join four tables when you only need two? – wildplasser Jun 04 '14 at 10:57
  • Did you analyze the single queries themselves? Do they use indexes? If not, you can simplify your problem by solving them one after the other. If they won't use indexes themselves, post the index definitions here (the `CREATE INDEX` commands used). – Angelo Fuchs Jun 04 '14 at 12:44
  • @wildplasser the `mailer_recipient` join cannot be factored out, really, certainly not in a way that will make the code simpler and the query plan better; I tried factoring the `mailer_message` join into the `NOT EXISTS` subquery where it's used, but strangely enough it made the actual performance worse despite decreasing the estimated cost. Anyway, thank you for your help: while the question still stands, for me it's now academic, as the query that I posted has much improved performance. – Alexei Averchenko Jun 04 '14 at 13:25
  • @wildplasser I just realized I mistakenly put a condition that should've been in the `WHERE` clause into the `JOIN ... ON` clause; that's probably what you were referring to. – Alexei Averchenko Jun 04 '14 at 13:39

1 Answer

The reason is that Postgres is actually doing the right thing, and I suck at math. Suppose table A has N rows, and table B has M rows, and they are being joined via a column that they both have a B-tree index for. Then the following is true:

  • Nested loop join's time complexity is not O(MN), as I naively thought, but O(M log N) or O(N log M), depending on which table is scanned linearly while the other is probed through its index. If both are scanned via an index, we get O(M log M log N) or O(N log M log N), respectively. But since that is only required when a specific row order is needed for yet another join or for the ORDER BY clause, as we'll see it's not a bad deal at all.
  • Merge join's time complexity is O(M log M + N log N) when both inputs have to be sorted, which means that it loses to the nested loop join, provided that the asymptotic proportionality coefficients are the same, and AFAIK they should both be equal to 1 in most implementations. Since both tables must be iterated over by the same key in the same direction, if a different order is required, an additional sort is needed, which easily makes the complexity worse than in the case of the nested loop join.

So basically, despite being associated with merge sort, which we all love, merge join almost always sucks.
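To put rough, purely illustrative numbers on it (made-up sizes, not measurements from my tables): take M = 10^3 outer rows and N = 10^7 inner rows, with both coefficients equal to 1:

M · log2(N)               ≈ 10^3 · 23          ≈ 2.3 × 10^4   (nested loop, inner side probed via B-tree)
M · log2(M) + N · log2(N) ≈ 10^4 + 2.3 × 10^8  ≈ 2.3 × 10^8   (merge join that must sort both inputs)

That's about four orders of magnitude in favour of the nested loop, and the gap only grows when a LIMIT lets the outer scan stop early.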

The reason my first query was so slow is that it had to perform a sort before applying the limit, and it was also bad in many other ways. After applying @wildplasser's suggestions, I managed to reduce the number of (still expensive) nested loops and also allow the limit to be taken without a sort, thus ensuring that Postgres most likely won't need to run the outer scan to completion, which is where the bulk of the performance gain comes from.
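The sort-vs-no-sort effect is easy to see in isolation. Assuming nothing about the schema beyond `id` being the delivery table's primary key (so it has a B-tree index) and `email` being an ordinary column with no expression index on it:

-- ORDER BY on the indexed primary key: the plan is Limit -> Index Scan,
-- so Postgres walks the index and stops after 1000 rows.
EXPLAIN
SELECT id
FROM mailer.mailer_message_recipient_rel
ORDER BY id
LIMIT 1000;

-- ORDER BY on an expression no index covers: a Sort node appears under the
-- Limit, so the whole table is read and sorted before any row is returned.
EXPLAIN
SELECT id
FROM mailer.mailer_message_recipient_rel
ORDER BY md5(email)
LIMIT 1000;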

Alexei Averchenko
  • If you only want the record with the youngest date, check my answer here http://stackoverflow.com/a/24043507/905902 on how to obtain that without the `DISTINCT ON (...)`, avoiding the final aggregation. `LIMIT 1000` is (often) also a showstopper. – wildplasser Jun 04 '14 at 21:37