
The Apollo Server documentation states that batching and caching should not be used together with REST API data sources:

Most REST APIs don't support batching. When they do, using a batched endpoint can jeopardize caching. When you fetch data in a batch request, the response you receive is for the exact combination of resources you're requesting. Unless you request that same combination again, future requests for the same resource won't be served from cache.

We recommend that you restrict batching to requests that can't be cached. In these cases, you can take advantage of DataLoader as a private implementation detail inside your RESTDataSource [...]

Source: https://www.apollographql.com/docs/apollo-server/data/data-sources/#using-with-dataloader

I'm not sure why they say, "Unless you request that same combination again, future requests for the same resource won't be served from cache."

Why wouldn't future requests be served from the cache again? After all, there are two caching layers here. The DataLoader batches requests and memoizes, with a per-request cache, which objects have already been requested, returning the same object from its cache if it is requested multiple times within the same request.

And there is a second-level cache that caches individual objects across multiple requests (or at least it could be implemented so that it caches the individual objects rather than the whole result set). Wouldn't that ensure that future requests are served from the second-layer cache even when the overall request changes but includes some of the objects that were requested before?
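To make it concrete, here is a minimal sketch (TypeScript, using the dataloader package) of the two layers I have in mind; the endpoint URL and names like sharedCache and createWeatherLoader are just made up for illustration, not how RESTDataSource actually does it:

import DataLoader from "dataloader";

// Layer 2 (hypothetical): a cache for individual entities that survives
// across GraphQL requests. Could just as well be Redis instead of a Map.
const sharedCache = new Map<string, unknown>();

async function fetchCity(city: string): Promise<unknown> {
  if (sharedCache.has(city)) return sharedCache.get(city); // hit on the single entity
  const res = await fetch(`https://example.com/api/weather/${encodeURIComponent(city)}`);
  const data = await res.json();
  sharedCache.set(city, data); // cache the individual object, not the whole batch result
  return data;
}

// Layer 1: a DataLoader created once per incoming GraphQL request.
// It batches and memoizes keys within that single request.
function createWeatherLoader() {
  return new DataLoader<string, unknown>(async (cities) =>
    Promise.all(cities.map((city) => fetchCity(city)))
  );
}

With a setup like this, a later request that includes some already-cached cities would only hit the REST API for the cities that are not cached yet.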

Sebi2020

1 Answer


Many REST APIs implement some sort of request caching for GET requests based on URLs. When you request an entity from a REST endpoint a second time, the result can be returned faster.

For example, let's imagine a fictional API called "Weekend City Trip".

Your GraphQL API fetches the three largest cities around the user and then checks the weekend weather in these cities. In this fictional example, you receive two requests. The first request comes from someone in Germany. You find the three largest cities around them: Cologne, Hamburg and Amsterdam. You can now call the weather API either one city at a time:

/api/weather/Cologne
/api/weather/Hamburg
/api/weather/Amsterdam

or

/api/weather/Cologne,Hamburg,Amsterdam

The next person is in Belgium, and we find Cologne, Amsterdam and Brussels:

/api/weather/Cologne
/api/weather/Amsterdam
/api/weather/Brussels

or

/api/weather/Cologne,Amsterdam,Brussels
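To make the two options more concrete, here is a rough TypeScript sketch; the example.com host and the comma-separated batch URL are just placeholders for whatever such an API might offer:

// Option 1: one GET per city. Each URL is stable, so the provider's
// CDN can cache and reuse it across all consumers.
async function fetchOneByOne(cities: string[]) {
  return Promise.all(
    cities.map((city) =>
      fetch(`https://example.com/api/weather/${encodeURIComponent(city)}`).then((res) => res.json())
    )
  );
}

// Option 2: one batched GET. The URL now encodes the exact combination
// of cities, so a CDN cache hit requires that this exact combination
// was requested before.
async function fetchBatched(cities: string[]) {
  const res = await fetch(
    `https://example.com/api/weather/${cities.map((c) => encodeURIComponent(c)).join(",")}`
  );
  return res.json();
}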

Now, as you can see, without batching we have requested some URLs twice. The API provider can use a CDN to return these results quickly without straining their application infrastructure. And since you are probably not the only consumer of the API, all of these URLs might already be cached in the first place, meaning you receive responses much faster. The number of possible batch URLs, on the other hand, grows massively with the number of cities offered and the number of cities per batch. If the API offers only 1000 cities, there are 166,167,000 possible combinations that could be requested when batching three cities. Therefore, the chance that someone else has already requested the exact combination of these three cities is rather low.
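That figure is simply "1000 choose 3", which you can verify quickly:

// 1000 cities, batches of 3 distinct cities, order irrelevant:
const combinations = (1000 * 999 * 998) / (3 * 2 * 1);
console.log(combinations); // 166167000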

Conclusion

The caching here is really just on the API provider's side, but it can greatly benefit your response times as a consumer. Often, GraphQL is used as an API gateway in front of your own REST services. If you don't cache those services, using batching can be worth it in that case.

Herku
  • Okay, you have many combinations that could be requested, but that doesn't explain why you shouldn't use batching and caching together (as the documentation states), because if you cache each individual entity (a city in your example) it doesn't matter whether somebody requests the exact same combination again or not. For example, person one requests Berlin, Amsterdam and Cologne (batched) -> Berlin, Amsterdam and Cologne get cached. The next person requests Berlin, Cologne and Brussels (batched) -> the API returns the cached results for Berlin and Cologne and only requests Brussels. – Sebi2020 Jan 31 '22 at 01:56
  • The next time a person requests Cologne, Amsterdam and Brussels, the result can be served entirely from cache (although this is a completely new combination). The batching saves round trips and the caching improves response times. The Apollo Server documentation talks about "[...] using a batched endpoint can jeopardize caching". – Sebi2020 Jan 31 '22 at 02:00