Imagine I'm designing a multi-user blog and I have user
, post
, and comment
tables with the obvious meanings. On the main page, I want to show the ten most recent posts along with all their related comments.
The naive approach would be to SELECT
the ten most recent post
s, probably JOIN
ed with the user
s that wrote them. And then I can loop through them to SELECT
the comments, again, probably JOIN
ed with the users that wrote them. This would require 11 selects: 1 for the posts and 10 for their comments, hence the name of the famous anti-pattern: n+1 selects.
The usual advice for avoiding this anti-pattern is to use the IDs from the first query to fetch all related comments in a second query which may look something like this:
SELECT
*
FROM
comments
WHERE
post_id IN (/* A comma separated list of post IDs returned from the first query */)
As long as that comma separated list is in reasonably short we managed to fetch all the data we need by issuing only two SELECT
queries instead of eleven. Great.
But what if I only want the top three comments for each post? I didn't try but I can probably come up with some LEFT JOIN
trickery to fetch the most recent posts along with their top three comments in a single query but I'm not sure it would be scalable. What if I want the top hundred comments which would exceed the join limit of 61 tables of a typical MySQL installation for instance?
What is the usual solution for this other than reverting to n+1 selects anti-pattern? What is the most efficient way to fetch items with a subset of items related to each one in this fairly typical scenario?