2

I have a question regarding lateral joins in Postgres.

My use case is I want to return a dataset that combines multiple tables but limits the number of publications and reviews returned. The simplified table schema is below

Table Author

  • ID
  • NAME

Table Review

  • ID
  • AUTHOR_ID
  • PUBLICATION_ID
  • CONTENT

Table Publication

  • ID
  • NAME

Table AuthorPublication

  • AUTHOR_ID
  • PUBLICATION_ID

So for my initial query I have this:

SELECT
  a.id,
  a.name
    json_agg (
      json_build_object (
        'id', r.id,
        'content', r.content 
       )
    ) AS reviews,
    json_agg (
      json_build_object(
        'id', p.id, 
        'name', p.name
        )
    ) AS publications
FROM
  public.author a
INNER JOIN
  public.review r ON r.author_id = a.id
INNER JOIN
  public.author_publication ap ON ap.author_id = a.id 
INNER JOIN 
  public.publication p ON p.id = ap.publication_id
WHERE 
  a.id = '1'
GROUP BY
  a.id

This returns the data I need, for example I get the author's name, id and a list of all of their reviews and publications they belong to. What I want to be able to do is limit the number of reviews and publications. For example return 5 reviews, and 3 publications.

I tried doing this with a lateral query but am running into an issue where if I do a single lateral query it works as intended.

so like:

INNER JOIN LATERAL
 (SELECT r.* FROM public.review r WHERE r.author_id = a.id LIMIT 5) r ON TRUE

This returns the dataset with only 5 reviews - but if I add a second lateral query

INNER JOIN LATERAL
 (SELECT ap.* FROM public.author_publication ap WHERE ap.author_id = a.id LIMIT 5) r ON TRUE

I now get 25 results for both reviews and publications with repeated/duplicated data.

So my question is are you allowed to have multiple lateral joins in a single PG query and if not what is a good way to go about limiting the number of results from a JOIN?

Thanks!

Martin Valgur
  • 5,793
  • 1
  • 33
  • 45
KukicAdo
  • 43
  • 1
  • 4

1 Answers1

3

You must change your query to something like this:

SELECT
  a.id,
  a.name,
  (
    SELECT
        json_agg ( r )
      FROM (
             SELECT 
                  json_build_object (
                      'id', r.id,
                      'content', r.content 
                  ) AS r
               FROM public.review r 
              WHERE r.author_id = a.id
              ORDER BY r.id
              LIMIT 5
           ) AS a
  ) AS reviews,
  (
   SELECT
        json_agg (p)
     FROM (
            SELECT
                 json_build_object(
                    'id', p.id, 
                     'name', p.name
                 ) AS p
              FROM public.author_publication ap 
             INNER JOIN public.publication p ON p.id = ap.publication_id
             WHERE ap.author_id = a.id
             ORDER BY p.id
             LIMIT 3
        ) AS a
   ) AS publications
FROM
  public.author a
WHERE 
  a.id = '1'
Roman Tkachuk
  • 3,096
  • 1
  • 16
  • 15
  • When I update my query to what you suggested I get the following error: ``` column "r.id" must appear in the GROUP BY clause or be used in an aggregate function ``` and adding a GROUP BY gives me: ``` [Err] ERROR: more than one row returned by a subquery used as an expression ``` But if I just limit to 1 then I get just the one result. – KukicAdo Feb 12 '17 at 23:34
  • Another interesting thing I'm finding out is running the following query: ```SELECT json_agg(r.uuid) FROM public.review r WHERE r.author_uuid = 'b3113a70-e4bf-4b30-a9ca-9b0530a3115b' LIMIT 1``` Creates creates a json array with all matching fields. But if I remove the json_agg I only get 1 row with the first field. – KukicAdo Feb 13 '17 at 00:02
  • Sorry, it is something unconvenient to write query without db struct. When you ask next question - please provide CREATE TABLE statement, some typical data and expected output. – Roman Tkachuk Feb 13 '17 at 00:11
  • Hi Roman - the update you made works perfectly! Thank you so much for your help. This is exactly what I needed. :) – KukicAdo Feb 13 '17 at 00:12