
I need to delete multiple items by id in a single batch; however, HTTP DELETE does not support a body payload.

Workaround options:

1. @DELETE /path/abc?itemId=1&itemId=2&itemId=3. On the server side, the query will be parsed as a List of ids, and the DELETE operation will be performed on each item.

2. @POST /path/abc with a JSON payload containing all ids: { ids: [1, 2, 3] }
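
As a rough illustration of how the two options differ on the wire, here is a minimal sketch using only the Python standard library. The path `/path/abc` and the `itemId` / `ids` names come from the question; the helper function names are hypothetical and only serve the example.

```python
import json
from urllib.parse import parse_qs, urlencode, urlsplit


def build_delete_url(base_path, ids):
    """Option 1: repeat the itemId query parameter once per id."""
    query = urlencode({"itemId": ids}, doseq=True)
    return f"{base_path}?{query}"


def parse_delete_url(url):
    """Server side of option 1: recover the list of ids from the query."""
    params = parse_qs(urlsplit(url).query)
    return [int(value) for value in params.get("itemId", [])]


def build_post_body(ids):
    """Option 2: carry the ids in a JSON payload."""
    return json.dumps({"ids": ids})


url = build_delete_url("/path/abc", [1, 2, 3])
print(url)                          # /path/abc?itemId=1&itemId=2&itemId=3
print(parse_delete_url(url))        # [1, 2, 3]
print(build_post_body([1, 2, 3]))   # {"ids": [1, 2, 3]}
```

One practical difference worth keeping in mind: with option 1 the URL grows with every id, and servers and intermediaries commonly cap the length of the request line, so a very large batch may fit in a body where it would not fit in a query string.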

How bad is this, and which option is preferable? Are there any alternatives?

Update: Please note that performance is key here; executing a delete operation for each individual id is not an option.

Wild Goat
  • Both are inadvisable: if part of the batch fails, it is unclear which HTTP status code the response should carry. The alternative would be for the client to be responsible for the batch operation. – Mr. Wrong Apr 04 '19 at 11:49
  • @Mr.Wrong how could the client be responsible for batching? The whole point of batching is to optimize the process rather than execute requests one by one. – Wild Goat Apr 04 '19 at 12:54
  • In addition to what Mr. Wrong stated, your two proposals will prevent a(n intermediary) cache from invalidating any of the stored response representations for the invoked URI, which is basically a cache key including any path, matrix or query parameters. A request for `GET /path/abc?itemId=1` might therefore still be served by a cache rather than by the actual server, even though the actual resource might already have been deleted via the batch. – Roman Vottner Apr 04 '19 at 13:03
  • @RomanVottner Not sure about that; this is the same as any other MUTATION operation. If you add multiple items to category "A", the cache for category "A" has to be refreshed. Same thing with delete. Following your logic, you couldn't use REST for bulk operations at all. – Wild Goat Apr 04 '19 at 13:30
  • @WildGoat While [RFC 7234](https://tools.ietf.org/html/rfc7234#section-4.4) talks about invalidating any cached information if a mutating operation is performed, a cache uses the [effective request URI](https://tools.ietf.org/html/rfc7230#section-5.5) to determine the target resource. Usually adding new items to a collection happens via `POST /path/to/collections`, whereas retrieval of a specific item happens via `GET path/to/collections/item`, which is a different key than the one you used for storing new items. Updating or deleting that specific item will, however, invalidate the cache OOTB. – Roman Vottner Apr 04 '19 at 13:52
  • @WildGoat ... though neither of your suggestions targets the URI of the concrete item, only the one of the "collection" managing the item. While caching is [optional](https://tools.ietf.org/html/rfc7234#section-2) as far as HTTP is concerned, Fielding actually made caching [a constraint of REST](https://en.wikipedia.org/wiki/Representational_state_transfer#Cacheability), which therefore should be supported by your applications. – Roman Vottner Apr 04 '19 at 14:00

3 Answers


Over the years, many people have been unsure about this, as the related questions in the sidebar show. The accepted answers range from "for sure do it" to "it's clearly mistreating the protocol". Since many of those questions were asked years ago, let's dig into the HTTP/1.1 specification from June 2014 (RFC 7231) to better understand what is actually discouraged and what is not.

The first proposed workaround:

First, about resources and the URI itself, Section 2 says:

The target of an HTTP request is called a "resource". HTTP does not limit the nature of a resource; it merely defines an interface that might be used to interact with resources. Each resource is identified by a Uniform Resource Identifier (URI).

Based on this, some may argue that since HTTP does not limit the nature of a resource, a URI containing more than one id is possible. I personally believe it's a matter of interpretation.

As for your first proposed workaround (DELETE '/path/abc?itemId=1&itemId=2&itemId=3'), we can conclude that it is discouraged if you think of a resource as a single document in your entity collection, but good to go if you think of a resource as the entity collection itself.

The second proposed workaround:

As for your second proposed workaround (POST '/path/abc' with body: { ids: [1, 2, 3] }), using the POST method for deletion could be misleading. Section 4.3.3 says about POST:

The POST method requests that the target resource process the representation enclosed in the request according to the resource's own specific semantics. For example, POST is used for the following functions (among others): Providing a block of data, such as the fields entered into an HTML form, to a data-handling process; Posting a message to a bulletin board, newsgroup, mailing list, blog, or similar group of articles; Creating a new resource that has yet to be identified by the origin server; and Appending data to a resource's existing representation(s).

While there's some room for interpretation about the "among others" functions of POST, it clearly conflicts with the fact that we have the DELETE method for resource removal, as we can see in Section 4.1:

The DELETE method removes all current representations of the target resource.

So I personally strongly discourage the use of POST to delete resources.

An alternative workaround:

Inspired by your second workaround, I'd suggest one more:

DELETE '/path/abc' with body: { ids: [1, 2, 3] }

It's almost the same as workaround two, but it uses the correct HTTP method for deletion. Here we arrive at the confusion about using an entity body in a DELETE request. Many people state that it isn't valid, but let's stick with Section 4.3.5 of the specification:

A payload within a DELETE request message has no defined semantics; sending a payload body on a DELETE request might cause some existing implementations to reject the request.

So, we can conclude that the specification doesn't prevent DELETE from having a body payload. Unfortunately some existing implementations could reject the request... But how is this affecting us today?

It's hard to be 100% sure, but a modern request made with fetch only disallows a body for GET and HEAD, not for DELETE. That's what the Fetch Standard states in Section 5.3, Item 34:

If either body exists and is non-null or inputBody is non-null, and request’s method is GET or HEAD, then throw a TypeError.

And we can confirm it's implemented the same way in the fetch polyfill at line 342.

Final thoughts:

Since the alternative workaround with DELETE and a body payload is permitted by the HTTP specification, and is supported by all modern browsers with fetch (and since IE10 with the polyfill), I recommend this way of doing batch deletes in a valid and fully working way.
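
For illustration only, a DELETE request carrying a JSON body could be built like this with Python's standard library. The host `localhost:8080` and the helper name `build_batch_delete` are assumptions made for the sketch; the path and the `{ ids: [...] }` payload shape come from the workaround above. The request is only constructed here, not sent.

```python
import json
from urllib.request import Request


def build_batch_delete(url, ids):
    """Build (but do not send) a DELETE request carrying the ids as JSON."""
    body = json.dumps({"ids": ids}).encode("utf-8")
    return Request(
        url,
        data=body,
        method="DELETE",
        headers={"Content-Type": "application/json"},
    )


# Hypothetical host; only the path comes from the question.
request = build_batch_delete("http://localhost:8080/path/abc", [1, 2, 3])
print(request.get_method())  # DELETE
print(request.data)          # b'{"ids": [1, 2, 3]}'
```

Passing the object to `urllib.request.urlopen(request)` would send it. Whether the body survives depends on every server and intermediary on the path, which is exactly the caveat in the Section 4.3.5 text quoted above.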

stellr42
Erick Petrucelli
  • The jury isn't entirely out on what to do with `DELETE` and request bodies, but the RFCs definitely don't currently mean to imply that a request body on `DELETE` has semantic meaning at all, although this is a common confusion. This _may_ change, but as of right now the intent is that a request body on `DELETE` may appear, but it has no meaning. – Evert Apr 04 '19 at 14:14
  • I recently brought this up with the group that develops the next version of the HTTP standard. I would definitely suggest holding off on using `DELETE` with a body until it's really sanctioned by the HTTP working group: https://github.com/httpwg/http-core/issues/202 – Evert Apr 04 '19 at 14:16
  • @Evert This will however only work on HTTP clients/servers that also use that new protocol version. Plenty of servers/clients will probably not support that kind of feature for years to come. In addition, it is unclear how a batch delete should inform intermediary caches to invalidate stored responses for those items that got deleted within the batch request. In its current form, some servers that ignore the payload might remove the whole collection instead of only the few items mentioned. A well-defined new operation (`BATCH-DELETE`) might be needed eventually?! – Roman Vottner Apr 04 '19 at 14:32
  • I think this answer deserves an upvote because everyone is criticizing a bulk approach but no one offers a solution. Obviously there is a need for it! – Wild Goat Apr 04 '19 at 14:42
  • @WildGoat the biggest thing that's missing from the question is an explanation of why batch deletes are needed in the first place. If you have a need to delete a list of resources, the canonical way to do that is to use multiple HTTP requests. By requiring a body you are just pushing down HTTP semantics to the body. What is the core problem? – Evert Apr 04 '19 at 15:28
  • @Evert The core problem is performance. Obviously I can do it one by one in a for loop, but deleting 1000 items and opening/closing db connections 1000 times is not an option. Hope that makes sense? – Wild Goat Apr 04 '19 at 15:59
  • @WildGoat I typically keep database connections pooled and open. I guess you're working with a language/framework where that's not an option? One idea could be to send everything into a queue and do the batch processing off the main HTTP thread. – Evert Apr 04 '19 at 16:13
  • Anyway, if you are completely uninterested in a creative approach that keeps proper HTTP semantics, the right solution is to use `POST` with a custom request body listed as the 'Second proposal' here. – Evert Apr 04 '19 at 16:14
  • @WildGoat In addition to what Evert said, HTTP connections can be [pipelined](https://tools.ietf.org/html/rfc7230#section-6.3.2) as well allowing you to send multiple requests in batch over the same connection i.e. each targeting one specific item to delete. Here you can easily use `DELETE` on the URI that targets the concrete item instead of its managing "collection" resource – Roman Vottner Apr 04 '19 at 16:17
  • Guys, I am more than open to following proper HTTP semantics; however, none of these approaches is feasible enough in terms of performance or development effort. This module is a plug-in to an existing architecture. Realistically we can't call the API 1000 times one by one, and obviously we have no human resources to implement a "creative" solution; we just need to get the job done by executing this in bulk. – Wild Goat Apr 04 '19 at 16:51
  • For implementations that reject a body in a `DELETE` request, if a bulk operation is required, could something like this be another workaround?: `DELETE /path/abc?ids=1,2,3` – Ham Aug 31 '23 at 16:10
  • @Ham, I briefly discussed that in the "The first proposed workaround" section of my answer. In short, it depends on how you interpret the statement about one endpoint representing one resource. If you interpret an entity as a resource, you shouldn't consider dealing with multiple entities with a single endpoint. If you interpret the entity collection itself as a resource, then you're good to go this way. And just to be more practical here, I myself like to interpret the second way. – Erick Petrucelli Sep 01 '23 at 13:58

It's important to understand that the HTTP methods operate in the domain of "transferring documents across a network", and not in your own custom domain.

Your resource model is not your domain model is not your data model.

Alternative spelling: the REST API is a facade to make your domain look like a web site.

Behind the facade, the implementation can do what it likes, subject to the consideration that if the implementation does not comply with the semantics described by the messages, then it (and not the client) is responsible for any damages caused by the discrepancy.

DELETE /path/abc?itemId=1&itemId=2&itemId=3

So that HTTP request says specifically "Apply the delete semantics to the document described by /path/abc?itemId=1&itemId=2&itemId=3". The fact that this document is a composite of three different items in your durable store, that each need to be removed independently, is an implementation detail. Part of the point of REST is that clients are insulated from precisely this sort of knowledge.

However, and I feel like this is where many people get lost, the metadata returned by the response to that delete request tells the client nothing about resources with different identifiers.

As far as the client is concerned, /path/abc is a distinct identifier from /path/abc?itemId=1&itemId=2&itemId=3. So if the client did a GET of /path/abc and received a representation that includes itemIds 1, 2, 3, and then submits the delete you describe, it will still have the (now stale) representation of /path/abc within its own cache after the delete succeeds.

This may, or may not, be what you want. If you are doing REST (via HTTP), it's the sort of thing you ought to be thinking about in your design.

POST /path/abc

some-useful-payload

This method tells the client that we are making some (possibly unsafe) change to /path/abc, and if it succeeds then the previous representation needs to be invalidated. The client should repeat its earlier GET /path/abc request to refresh its prior representation rather than using any earlier invalidated copy.

But as before, it doesn't affect the cached copies of other resources:

/path/abc/1
/path/abc/2
/path/abc/3

All of these are still going to be sitting there in the cache, even though they have been "deleted".
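
The effect can be sketched with a toy cache keyed by the full request URI. This is a deliberately simplified model with hypothetical names (`UriKeyedCache`, `on_unsafe_request`); a real HTTP cache follows more rules than this, for example it may also invalidate URIs named in `Location` and `Content-Location` response headers, which the sketch ignores.

```python
class UriKeyedCache:
    """Toy cache keyed by the full request URI (path plus query string)."""

    def __init__(self):
        self._store = {}

    def put(self, uri, representation):
        self._store[uri] = representation

    def get(self, uri):
        return self._store.get(uri)

    def on_unsafe_request(self, uri):
        # A successful unsafe request invalidates only its own target URI.
        self._store.pop(uri, None)


cache = UriKeyedCache()
cache.put("/path/abc", "[1, 2, 3]")
cache.put("/path/abc/1", "item 1")

# DELETE with query parameters targets a different key than /path/abc.
cache.on_unsafe_request("/path/abc?itemId=1&itemId=2&itemId=3")
print(cache.get("/path/abc"))    # [1, 2, 3]  (still cached)

# POST /path/abc invalidates the collection, but not the individual items.
cache.on_unsafe_request("/path/abc")
print(cache.get("/path/abc"))    # None
print(cache.get("/path/abc/1"))  # item 1  (still cached)
```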

To be completely fair, a lot of people don't care, because they aren't thinking about clients caching the data they get from the web server. And you can add metadata to the responses sent by the web server to communicate to the client (and intermediate components) that the representations don't support caching, or that the results can be cached but they must be revalidated with each use.
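
As a small sketch of that metadata, the two policies map onto standard `Cache-Control` response directives. The function name and its boolean parameter are hypothetical; how the headers get attached to a response depends on your server framework.

```python
def cache_headers(allow_storage):
    """Return response headers for one of the two policies described above."""
    if allow_storage:
        # Copies may be stored, but must be revalidated before each reuse.
        return {"Cache-Control": "no-cache"}
    # The response must not be stored by any cache.
    return {"Cache-Control": "no-store"}


print(cache_headers(True))   # {'Cache-Control': 'no-cache'}
print(cache_headers(False))  # {'Cache-Control': 'no-store'}
```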

Again: Your resource model is not your domain model is not your data model. A REST API is a different way of thinking about what's going on, and the REST architectural style is tuned to solve a particular problem, and therefore may not be a good fit for the simpler problem you are trying to solve.

That doesn’t mean that I think everyone should design their own systems according to the REST architectural style. REST is intended for long-lived network-based applications that span multiple organizations. If you don’t see a need for the constraints, then don’t use them. That’s fine with me as long as you don’t call the result a REST API. I have no problem with systems that are true to their own architectural style. -- Fielding, 2008

VoiceOfUnreason

As Erick has already mentioned in the accepted answer, nothing can stop you from sending a request body along with a DELETE request.

However, I would suggest looking at a third option:

PATCH /path/abc

[
  { "op": "remove", "path": "/path/abc/1" },
  { "op": "remove", "path": "/path/abc/2" },
  { "op": "remove", "path": "/path/abc/3" }
]

This example payload tries to follow RFC 6902 (JSON Patch), which defines a JSON document structure for expressing a sequence of operations.
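
On the server side, handling such a payload could look roughly like the sketch below. It assumes, as in the example above, that each `path` is the collection path followed by a numeric item id; that mirrors the example rather than a strict reading of RFC 6902, where `path` is a JSON Pointer into the patched document. The function name `ids_from_remove_patch` is hypothetical.

```python
import json


def ids_from_remove_patch(patch_body, collection_path="/path/abc"):
    """Collect item ids from the "remove" operations in a patch document."""
    ids = []
    for operation in json.loads(patch_body):
        if operation.get("op") != "remove":
            raise ValueError(f"unsupported operation: {operation.get('op')!r}")
        # Split "/path/abc/1" into the collection prefix and the item id.
        prefix, _, item_id = operation["path"].rpartition("/")
        if prefix != collection_path or not item_id.isdigit():
            raise ValueError(f"unexpected path: {operation['path']!r}")
        ids.append(int(item_id))
    return ids


body = json.dumps([
    {"op": "remove", "path": "/path/abc/1"},
    {"op": "remove", "path": "/path/abc/2"},
    {"op": "remove", "path": "/path/abc/3"},
])
print(ids_from_remove_patch(body))  # [1, 2, 3]
```

The resulting list can then be handed to a single bulk delete in the data layer, which is what addresses the performance concern raised in the question.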

You can have a more 'relaxed' request body, for example, just listing IDs to remove:

PATCH /path/abc

[1, 2, 3]

This doesn't look 'conventional', but as long as your application understands it this might be an alternative, because DELETE /path/abc sounds more like wiping abc out together with all its IDs and other attributes :)

William Durand did a good write-up on different ways of patching.

Cheers!

Mαx Φ