Response body when PATCHing a collection

Question

In my REST API I have a very large collection; it contains millions of items. The path for this collection is /mycollection

Because this collection is so large, it is not good practice to GET the whole collection, so the API supports paging. Paging will be the primary way of getting the collection:

GET /mycollection?page=1&page-size=100 HTTP/1.1

Say the original collection contains 1,000,000 items and I want to update 5,000, delete 3,000 and add 2,000 items. I could write my API to support updating the collection via either the PUT method or the PATCH method. While either method would require very different request bodies I believe both methods would require the exact same response body, i.e. the response body would have to contain the current representation of the entire updated resource, i.e. all 999,000 items in the collection.

As I mentioned earlier, GETting the entire collection is just not realistic; it's too big. For the same reason I don't want PUTting or PATCHing to return the entire collection. Adding query parameters to a PUT or PATCH request wouldn't work either, because neither PUT nor PATCH is a safe method.

So what would be the proper response in this large collection scenario?

I could respond with

HTTP/1.1 202 Accepted
Location: /mycollection?page=1&page-size=100

The 202 Accepted response code doesn't feel correct because the update would have been done synchronously. The Location header doesn't quite feel right either. I could maybe go with a Link header, but that still doesn't feel right.

Again I ask what would be the proper response in this large collection scenario?

Evert · Accepted Answer · 2020-11-06T21:44:50.263

This question is based on a misconception:

While either method would require very different request bodies I believe both methods would require the exact same response body, i.e. the response body would have to contain the current representation of the entire updated resource

Either can just return 204 No Content or 200 OK and no response body. There's no requirement that they include the full representation in the response.

You could optionally support this (perhaps when the client sends a Prefer: return=representation header, or by marking the response with a Content-Location header), but without these headers I would say it's not even a convention that the current representation is returned. Generic clients shouldn't assume that the response body is the new representation unless these headers are used.

So, just return a 2xx and you're good to go.
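To make the default-versus-opt-in behaviour concrete, here is a minimal sketch in Python. The function name `respond_to_update` and the `render_representation` callback are hypothetical and not tied to any framework. It assumes the request headers arrive as a plain dict with canonical header names, and it parses the Prefer header only loosely (preference parameters after a semicolon are not handled).

```python
def respond_to_update(request_headers, render_representation=None):
    """Build (status, headers, body) for a successful PUT or PATCH.

    render_representation is a hypothetical callable returning the new
    representation as bytes. Pass None when the server does not want to
    offer it, for example because the collection is far too large.
    """
    prefer = request_headers.get("Prefer", "")
    # Simplified parsing: split the comma-separated preference list.
    preferences = {p.strip().lower() for p in prefer.split(",")}

    wants_body = "return=representation" in preferences
    if wants_body and render_representation is not None:
        headers = {
            "Content-Type": "application/json",
            # Tells the client that the preference was honoured.
            "Preference-Applied": "return=representation",
        }
        return 200, headers, render_representation()

    # Default: success with no response body at all.
    return 204, {}, b""
```

For the million-item collection in the question, the server would simply never pass a renderer, so every successful update answers 204 No Content regardless of what the client prefers.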

score 0 · Answer 2 · edited Oct 07 '21 at 13:39

So what would be the proper response in this large collection scenario?

Short version: you should probably treat a successful PUT as though it were a successful POST.

the intended meaning of the payload can be summarized as: a representation of the status of, or results obtained from, the action

So the response could be as simple as

200 OK
Content-Type: text/plain

It worked!

Longer answer:

While either method would require very different request bodies I believe both methods would require the exact same response body, i.e. the response body would have to contain the current representation of the entire updated resource

This isn't right. If you review RFC 7231, you'll see that the payload of a 200 response to PUT has this description (Section 6.3.1):

a representation of the status of the action

Returning the new representation of the resource is an edge case, not the default (see the specification of the Content-Location header).

For a state-changing request like PUT (Section 4.3.4) or POST (Section 4.3.3), it implies that the server's response contains the new representation of that resource, thereby distinguishing it from representations that might only report about the action (e.g., "It worked!"). This allows authoring applications to update their local copies without the need for a subsequent GET request.
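Seen from the client side, that rule could be sketched as follows. The helper name `payload_is_current_representation` is made up for illustration, the headers are assumed to be a plain dict with canonical names, and the URI comparison is a simple string match with no normalization.

```python
from urllib.parse import urljoin


def payload_is_current_representation(request_url, status, response_headers):
    """Return True only when a 2xx response carries a Content-Location
    that resolves to the same URI the request was sent to."""
    content_location = response_headers.get("Content-Location")
    if content_location is None or not (200 <= status < 300):
        return False
    # Content-Location may be relative, so resolve it against the request URI.
    return urljoin(request_url, content_location) == request_url
```

With this check, a bare `200 OK` whose body says "It worked!" is treated as a status report, and the client still needs a GET (of a page, in this case) before it refreshes its local copy.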

That said, I'd suggest a review of your choice of method token. Both PUT and PATCH support remote authoring semantics - messages that ask a server to make its copy of a document look like your local copy. That's why, for example, the PUT specification has a bunch of constraints about adding validator header fields to the response. General purpose components are allowed to assume that they know what's going on, because all resources are supposed to understand these methods the same way.

But in your case, you can't really be said to be remote authoring the collection, because the client (and the general purpose components) don't have a representation of the collection, but instead only representations of pages of the collection.

If you were going to be consistent with the uniform interface, then you would either:

allow remote authoring of the pages, or
abandon the method tokens that imply remote authoring

It is okay to use POST when the semantics of your request don't quite align with the standardized meanings:

POST serves many useful purposes in HTTP, including the general purpose of “this action isn’t worth standardizing.”
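As a rough sketch of the POST route, the handler below applies a batch of updates, deletions and additions and answers with a small status report instead of the collection itself. Everything here is an assumption made for illustration: the name `handle_batch_post`, the batch shape (`update`, `delete`, `add` keys), the summary fields, and the in-memory dict standing in for the real store.

```python
import json


def handle_batch_post(collection, batch):
    """Apply a batch to an in-memory collection keyed by item id and
    return (status, headers, body) describing what happened."""
    updated = 0
    for item_id, item in batch.get("update", {}).items():
        if item_id in collection:  # only existing items count as updates
            collection[item_id] = item
            updated += 1

    deleted = 0
    for item_id in batch.get("delete", []):
        if item_id in collection:
            del collection[item_id]
            deleted += 1

    added = 0
    for item_id, item in batch.get("add", {}).items():
        if item_id not in collection:  # never overwrite on add
            collection[item_id] = item
            added += 1

    # The payload reports on the action; it is not the collection.
    summary = {
        "updated": updated,
        "deleted": deleted,
        "added": added,
        "total": len(collection),
    }
    return 200, {"Content-Type": "application/json"}, json.dumps(summary)
```

Scaled up to the numbers in the question (5,000 updates, 3,000 deletions, 2,000 additions against 1,000,000 items), the response body stays a few dozen bytes and would report a total of 999,000.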
