I'm wondering whether there's any consensus on how best to handle GraphQL field arguments when using DataLoader. The batch function (batchFn) that DataLoader needs receives an Array<key> and returns a Promise that resolves to an Array of values (or Errors), one per key and in the same order as the keys. Usually one would just call load( parent.id ), where parent is the first parameter of the resolver for a given field. In most cases this is fine, but what if you need to provide arguments to a nested field?

For example, say I have a SQL database with tables for Users and Books, plus a relationship table called BooksRead that represents a many-to-many relationship between Users and Books (a user has read many books, and a book may have been read by many users).

I might run the following query to see, for all users, what books they have read:

query {
  users {
    id
    first_name
    books_read {
      title
      author {
        name
      }
      year_published
    }
  }
}

Let's say that there's a BooksReadLoader available within the context, such that the resolver for books_read might look like this:

const UserResolvers = {
  books_read: async function getBooksRead( user, args, context ) {
    return await context.loaders.booksRead.load( user.id );
  }
};

The batch load function for the BooksReadLoader would make an async call to a data access layer method, which would run some SQL like:

SELECT BR.user_id, B.* FROM Books B INNER JOIN BooksRead BR ON B.id = BR.book_id WHERE BR.user_id IN(?);

We would create some Book instances from the resulting rows, group them by user_id, then return keys.map(fn) so that each user_id key in the batch gets its own list of books, in the same order as the keys.
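A minimal sketch of that batch function, to make the grouping step concrete. The names fetchBooksReadByUserIds and batchBooksRead and the sample rows are hypothetical, and the sketch assumes each row returned by the data access layer carries a user_id column to group on:

```javascript
// Hypothetical data access stub standing in for the SQL query above.
// A real version would run the INNER JOIN with the user ids bound to IN(?).
async function fetchBooksReadByUserIds(userIds) {
  const rows = [
    { user_id: 1, title: 'Dubliners', year_published: 1914 },
    { user_id: 1, title: 'Beloved', year_published: 1987 },
    { user_id: 3, title: 'Dubliners', year_published: 1914 },
  ];
  return rows.filter((row) => userIds.includes(row.user_id));
}

// Batch function: receives every user id collected in one tick and must
// resolve to one entry per id, in the same order as the ids.
async function batchBooksRead(userIds) {
  const rows = await fetchBooksReadByUserIds(userIds);

  // Group the books by the user who read them.
  const booksByUser = new Map();
  for (const { user_id, ...book } of rows) {
    if (!booksByUser.has(user_id)) booksByUser.set(user_id, []);
    booksByUser.get(user_id).push(book);
  }

  // Users with no matching rows get an empty list rather than undefined.
  return userIds.map((userId) => booksByUser.get(userId) || []);
}

// Wiring, not executed here because it needs the dataloader package:
// const booksRead = new DataLoader(batchBooksRead);
```

The final keys.map is what keeps the results aligned with the keys even when the database returns rows in a different order or returns nothing for some users.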

Now suppose I add an argument to books_read, asking for all the books a user has read that were published before 1950:

query {
  users {
    id
    first_name
    books_read(published_before: 1950) {
      title
      author {
        name
      }
      year_published
    }
  }
}

In theory, we could run the same SQL statement, and handle the argument in the resolver:

const UserResolvers = {
  books_read: async function getBooksRead( user, args, context ) {
    const books_read = await context.loaders.booksRead.load( user.id );
    return books_read.filter( function ( book ) { 
      return book.year_published < args.published_before; 
    });
  }
};

But this isn't ideal, because we're still fetching a potentially huge number of rows from the Books table when maybe only a handful of them actually satisfy the argument. It would be much better to execute this SQL statement instead:

SELECT BR.user_id, B.* FROM Books B INNER JOIN BooksRead BR ON B.id = BR.book_id WHERE BR.user_id IN(?) AND B.year_published < ?;

My question is: does the cacheKeyFn option available via new DataLoader( batchFn[, options] ) allow the field's argument to be passed down, so that the data access layer can construct a dynamic SQL statement? I've reviewed https://github.com/graphql/dataloader/issues/75 but I'm still unclear whether cacheKeyFn is the way to go. I'm using apollo-server-express. There is another SO question, "Passing down arguments using Facebook's DataLoader", but it has no answers, and I'm having a hard time finding other sources that get into this.

Thanks!

diekunstderfuge
  • As an aside, do you really need dataloader in this context? Unless the client is actually requesting `books_read` for the *same* user more than once in the same request, there is no benefit to implementing dataloader for that field. – Daniel Rearden May 23 '19 at 01:11
  • Hi @DanielRearden How do you mean? In my query example I'm assuming that the response will be an array (`[User]`), not a single `User`. Apologies if I wasn't clear on that in my question. Since the query is for many users, I would assume I want dataloader to collect all `user_id`s so I can send them to a SQL statement like `SELECT id, first_name FROM User WHERE id IN(?);` Since `books_read` is a field on each individual user, and the `parent` for the resolver is a single `User`, wouldn't I also want dataloader to batch those `user_id`s? – diekunstderfuge May 23 '19 at 12:38
  • Let's [continue this conversation in chat](https://chat.stackoverflow.com/rooms/193841/handling-graphql-field-arguments-using-dataloader) – Daniel Rearden May 23 '19 at 14:19

1 Answer

Pass the id and params as a single object to the load function, something like this:

const UserResolvers = {
  books_read: async function getBooksRead( user, args, context ) {
    return context.loaders.booksRead.load({id: user.id, ...args});
  }
};

Then let the batch load function figure out how to satisfy it in an optimal way.
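As a rough sketch of what "figuring it out" could look like: the batch function below groups the incoming { id, ...args } keys by their arguments and issues one query per distinct argument set. Everything here is an assumption made for illustration, including the fetchBooksRead stub, its sample rows, the queryCount counter, and the use of JSON.stringify to group arguments (which relies on the arguments being serialised in a consistent property order):

```javascript
// Counter used only to show how many queries this sketch issues.
let queryCount = 0;

// Hypothetical data access stub. A real version would bind userIds to IN(?)
// and append "AND B.year_published < ?" only when published_before is given.
async function fetchBooksRead(userIds, args = {}) {
  queryCount += 1;
  const rows = [
    { user_id: 1, title: 'Dubliners', year_published: 1914 },
    { user_id: 1, title: 'Beloved', year_published: 1987 },
    { user_id: 3, title: 'Beloved', year_published: 1987 },
  ];
  const before = args.published_before;
  return rows.filter(
    (row) =>
      userIds.includes(row.user_id) &&
      (before == null || row.year_published < before)
  );
}

// Batch function for keys shaped like { id, published_before }.
async function batchBooksRead(keys) {
  // Group user ids by argument set, so each distinct set costs one query.
  const groups = new Map();
  for (const { id, ...args } of keys) {
    const groupKey = JSON.stringify(args);
    if (!groups.has(groupKey)) groups.set(groupKey, { args, userIds: [] });
    groups.get(groupKey).userIds.push(id);
  }

  // Run the queries in parallel and index the books by argument set and user.
  const booksByKey = new Map();
  await Promise.all(
    [...groups].map(async ([groupKey, { args, userIds }]) => {
      const rows = await fetchBooksRead(userIds, args);
      for (const { user_id, ...book } of rows) {
        const resultKey = `${groupKey}:${user_id}`;
        if (!booksByKey.has(resultKey)) booksByKey.set(resultKey, []);
        booksByKey.get(resultKey).push(book);
      }
    })
  );

  // Return one entry per key, in key order; no matches means an empty list.
  return keys.map(
    ({ id, ...args }) => booksByKey.get(`${JSON.stringify(args)}:${id}`) || []
  );
}
```

For the query in the question, every user shares the same published_before value, so all keys fall into one group and a single SQL statement is executed.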

You'll also want to memoise the construction of that object, because otherwise DataLoader's caching won't work properly: by default it stores keys in a Map, so object keys are compared by identity rather than by deep equality.
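DataLoader's cacheKeyFn option is one way to sidestep the identity problem without memoising: it maps each load key to the value used for cache lookups. Below is a minimal sketch of such a function. The name booksReadCacheKey is hypothetical, and the sketch assumes the key is a flat object of scalar values, as in the { id, ...args } example above:

```javascript
// Builds a string cache key that is identical for structurally equal keys,
// regardless of property order. Undefined properties are ignored.
// Assumes a flat object of scalars; nested arguments would need a deeper sort.
function booksReadCacheKey(key) {
  const entries = Object.keys(key)
    .filter((name) => key[name] !== undefined)
    .sort()
    .map((name) => [name, key[name]]);
  return JSON.stringify(entries);
}

// Two distinct objects with the same contents now share one cache key.
const a = booksReadCacheKey({ id: 1, published_before: 1950 });
const b = booksReadCacheKey({ published_before: 1950, id: 1 });
console.log(a === b); // true

// Wiring, not executed here because it needs the dataloader package:
// const booksRead = new DataLoader(batchBooksRead, { cacheKeyFn: booksReadCacheKey });
```

The cache key only affects caching and de-duplication; the batch function still receives the original key objects, so it can read id and the arguments from them directly.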

Andrew Ingram
  • You could also pass a custom cacheKeyFn or cacheMap to DataLoader which does something like JSON-stringifying the cache key; then you wouldn't need to memoise. – Andrew Ingram May 24 '19 at 12:22
  • While it's a good hack, this breaks typing and isn't compatible with TypeScript. – doc_id Jun 08 '21 at 13:53