How should HTTP/1.1 servers handle requests containing Content-Length headers with empty values?

Question

RFC 9110 defines the Content-Length ABNF rule like this:

Content-Length = 1*DIGIT

(i.e. 1 or more ASCII digits)
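
To make the rule concrete, here is a minimal sketch of a check against that ABNF. The function name `matches_content_length_abnf` is made up for illustration and is not taken from any server. It works on bytes and uses an explicit `[0-9]` class, because `str.isdigit()` would also accept non-ASCII digits.

```python
import re

# Content-Length = 1*DIGIT, where DIGIT is an ASCII "0" through "9".
CONTENT_LENGTH_RE = re.compile(rb"[0-9]+")


def matches_content_length_abnf(value: bytes) -> bool:
    """Return True if value is one or more ASCII digits and nothing else."""
    # fullmatch requires the whole value to match, so b"" and b"42 " both fail.
    return CONTENT_LENGTH_RE.fullmatch(value) is not None
```

Under this check, `b"42"` matches, while `b""`, `b"+42"` and `b"42, 42"` do not.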

The same RFC also says this:

a sender MUST NOT forward a message with a Content-Length header field value that does not match the ABNF above, with one exception: a recipient of a Content-Length header field value consisting of the same decimal value repeated as a comma-separated list (e.g, "Content-Length: 42, 42") MAY either reject the message as invalid or replace that invalid field value with a single instance of the decimal value, since this likely indicates that a duplicate was generated or combined by an upstream message processor.

Thus, if a proxy receives the following request, it should not be forwarded with the Content-Length header unchanged:

GET / HTTP/1.1\r\n
Host: whatever\r\n
Content-Length: \r\n
\r\n
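
On the proxy side, the forwarding rule quoted above, including its one exception, might be sketched like this. The name `normalize_content_length` is hypothetical, and comparing list members byte for byte (so that `42, 042` is rejected) is an assumption that is stricter than "the same decimal value" strictly requires.

```python
import re
from typing import Optional

DIGITS_RE = re.compile(rb"[0-9]+")


def normalize_content_length(value: bytes) -> Optional[bytes]:
    """Return a value that is safe to forward, or None to reject the message."""
    if DIGITS_RE.fullmatch(value):
        return value  # already matches 1*DIGIT
    # The one exception: the same value repeated as a comma-separated list.
    members = [m.strip(b" \t") for m in value.split(b",")]
    if (
        len(members) > 1
        and all(DIGITS_RE.fullmatch(m) for m in members)
        and len(set(members)) == 1  # assumption: compare members byte for byte
    ):
        return members[0]
    return None  # includes the empty value from the request above
```

With this sketch, `normalize_content_length(b"42, 42")` gives `b"42"`, and `normalize_content_length(b"")` gives `None`, so the request above would not be forwarded unchanged.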

How should an endpoint server handle such a request? Should it be rejected? If it should instead be accepted, then how should it be interpreted?
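
To show what an endpoint actually sees, here is a small sketch that pulls the field value out of the raw request. The helper `header_values` is hypothetical and is not a full HTTP parser: it ignores obsolete line folding and does no validation of field names.

```python
import re

RAW_REQUEST = b"GET / HTTP/1.1\r\nHost: whatever\r\nContent-Length: \r\n\r\n"


def header_values(raw: bytes, name: bytes) -> list:
    """Collect the values of every header line named `name` (case-insensitive)."""
    head = raw.split(b"\r\n\r\n", 1)[0]
    values = []
    for line in head.split(b"\r\n")[1:]:  # skip the request line
        field_name, _, field_value = line.partition(b":")
        if field_name.lower() == name.lower():
            # Trim optional whitespace (SP / HTAB) around the value.
            values.append(field_value.strip(b" \t"))
    return values


values = header_values(RAW_REQUEST, b"Content-Length")
# The header is present, but its only value is the empty byte string.
is_valid = all(re.fullmatch(rb"[0-9]+", v) is not None for v in values)
```

Here `values` is `[b""]` and `is_valid` is `False`: the header is present but fails `1*DIGIT`, which is a different situation from the header being absent. What the server should do with that result is the question.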

I have surveyed a wide range of HTTP implementations on this issue, and their behavior varies. I am interested in hearing opinions on both sides of the accept/reject divide.

EDIT: RFC 9112 also says this:

When a server listening only for HTTP request messages, or processing what appears from the start-line to be an HTTP request message, receives a sequence of octets that does not match the HTTP-message grammar aside from the robustness exceptions listed above, the server SHOULD respond with a 400 (Bad Request) response and close the connection.

This statement seems to undermine the whole ABNF and to justify accepting an empty Content-Length. That said, if matching the grammar is only a soft requirement, what is the point of the standards?
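
For the reject side of the divide, the behavior that the RFC 9112 passage recommends (respond with 400, then close the connection) could be sketched as follows. The function name and the plain-text body are assumptions for illustration, and a real server would also need to close the socket after sending.

```python
def bad_request_response() -> bytes:
    """Build a minimal 400 response that tells the client the connection will close."""
    body = b"Bad Request\n"
    return (
        b"HTTP/1.1 400 Bad Request\r\n"
        b"Content-Type: text/plain\r\n"
        b"Content-Length: " + str(len(body)).encode("ascii") + b"\r\n"
        b"Connection: close\r\n"
        b"\r\n" + body
    )
```

Because the requirement is a SHOULD, a server that skips this step is not strictly non-conformant, which is the tension raised above.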

`400 Bad Request` seems reasonable here. The spec seems clear — Evert, Aug 02 '23 at 00:52
In the quote you shared after the edit, the word `SHOULD` appears. Uppercase words in RFCs have specific meanings: https://datatracker.ietf.org/doc/html/rfc2119 — Evert, Aug 02 '23 at 15:35
I am aware of this, which is why I refer to it as a "soft requirement." I'm confused as to why implementing the grammar is a SHOULD, given that the rules of the grammar are arguably the most important part of the standard. — kenballus, Aug 02 '23 at 17:49
Sounds like this question moved from 'what to do' to 'why', but I would suggest you take a look at HTTP/1.0 to get a better feel for standards design in the '90s. HTTP/1.1 and all its revisions directly descend from this and were made to be backwards compatible. It should be no wonder that a protocol with this legacy and so many implementations incorporates design decisions that don't match the sensibilities of nearly three decades later. — Evert, Aug 03 '23 at 00:09
Plus, this was also a time when Postel's law was still widely considered a good idea. Protocol designers have largely backpedaled on this. Anyway, welcome to the wonderful world of internet standards. It's a wild world out there. You've seen nothing yet :P — Evert, Aug 03 '23 at 00:09
