1

I've been doing some research on how to set up a new GraphQL API project, but am running into some basic conceptual? problems in trying to find out how to do pagination and nested database queries efficiently.

I'd appreciate any pointers or advice!

Let's say we get a graphql query like so:

articles(limit: 10) {
  title
  content
  comments(limit: 5) {
    postedAt
    text
  }
}

A typical ORM, assuming eager loading of the nested type, could translate this type of query into an sql query like this, and then loop over the results to manually group the comments together and hydrate it all.

select a.title, a.content, c.posted_at, c.text
from articles as a
left join comments as c on c.article_id = a.id
limit ???

But so far, I've only ever seen ORMs like Doctrine (php) and Sequelize (js) fail in doing pagination correctly in these cases. They can't correctly handle page sizes, because there's no way to express the limit in this sql query's setup.

  • => Am I correct in seeing this problem? Or am I missing something crucial, are ORMs able to do pagination with eagerly loaded data somehow?

So now I just recently came across the lateral join type in Postgres, which seems to solve this issue, provided we also add some json trickery:

select a.title, a.content, t.data as comments
from articles as a
join lateral (
  select json_agg(sub.*) as data
  from (
    select c.posted_at, c.text
    from comments as c
    where c.article_id = a.id
    limit 5
  ) sub
) t on true
limit 20;

(I think I've seen this kind of lateral + json trickery stuff in how Hasura and Postgraphile transform to sql, so I don't this it's unwarranted / bad engineering.)

  • => Is there any ORM out there (except hasura/postgraphile), possibly Postgres-specific, that use this kind of lateral and json stuff, instead of the typical method described above?

Lastly, my research has taught me that in building a graphql api, you'll typically find yourself data-loading (batching) nested queries, instead of eager-loading them from the "parent" query. So, for example, this would be without data-loading:

class ArticleResolver {

  comments(article) {
    db.query("select ... from comments where ... = {article.id}");
  }

and then this would be with data-loading:

class ArticleResolver {

  commentsDataLoader = new DataLoader(articleIds => {
    return db.query("select ... from comments where ... in {articleIds}");
  });

  comments(article) {
    return this.commentsDataLoader.load(article.id);
  }

But, as soon as you want to start adding parameters like limit: 5 to nested queries, this data-loading query gets as complicated as the original question, so we're back where we were :)

  • => Is there a conventional way, of some standard practices, for dealing with this setup? Is there any known way / library so easily write out resolvers like, for example, this:
class ArticleResolver
  ...

  comments(article, limit) {
    return db.somehowMagicallyDataloaded("select * from comments ... = {article.id} limit {limit}")
  }
wildplasser
  • 43,142
  • 8
  • 66
  • 109
Kelley van Evert
  • 1,063
  • 2
  • 9
  • 17

0 Answers0