I've searched for some answers on SO but none comes up with what I want to achieve, so here comes another question about top N per group.
DB Structure
Currently I have a Post table as followed:
post_id | post_title | post_text
and a Comment table as followed:
comment_id | post_id | comment_text | comment_date | ...
Goal
- I would like to get top n comments per post ordered by
comment_date
desc with support for pagination. Exactly n comments must be returned for each post if possible, e.g. if a post has 10 comments, 3 should be returned, if a post has 2 comments, only 2 can be returned. - All comment fields should be returned.
- Only
post_id
,post_title
,post_text
need to be returned, but if we can return all fields that would be great. - A post should appear only one time if possible, i.e. if it was in page 1 then in page 2 it would not be taken into account (please see example below). I think this is easier to achieve than allowing a post to appear several times.
Example
I have the following posts:
post_id | post_title | post_text
++++++++++++++++++++++++++++++++
1 | ... | ...
2 | ... | ...
3 | ... | ...
4 | ... | ...
and following comments (ordered by comment_date
desc, date 12
is more recent than date 11
):
comment_id | post_id | comment_date | ...
+++++++++++++++++++++++++++++++++++++++++
12 | 1 | date 12 | ...
11 | 2 | date 11 | ...
10 | 2 | date 10 | ...
9 | 2 | date 9 | ...
8 | 3 | date 8 | ...
7 | 1 | date 7 | ...
6 | 4 | date 6 | ...
5 | 2 | date 5 | ...
4 | 2 | date 4 | ...
3 | 1 | date 3 | ...
2 | 1 | date 2 | ...
1 | 1 | date 1 | ...
Let's say I have a limit of 6 comments per page, and n
is 3. Expected results are:
Page 1:
comment_id | post_id | comment_date | ...
+++++++++++++++++++++++++++++++++++++++++
12 | 1 | date 12 | ...
11 | 2 | date 11 | ...
10 | 2 | date 10 | ...
9 | 2 | date 9 | ...
7 | 1 | date 7 | ...
3 | 1 | date 3 | ...
Page 2
comment_id | post_id | comment_date | ...
+++++++++++++++++++++++++++++++++++++++++
8 | 3 | date 8 | ...
6 | 4 | date 6 | ...
I've tried things similar to this: Top N Per Group with Multiple Table Joins but if we order by comment_date
desc there's no guarantee that there are exactly n results returned.
Any help is greatly appreciated.