
I'm working on a server that accepts a URL from a user and downloads the file behind it (it also does other things with the file, such as uploading it elsewhere, but that's irrelevant here). The maximum file size it should accept is 4 GB, which is why the Content-Length MUST exist for the URL that the user provides.

But what happens if, say, a malicious server declares a Content-Length of 2 GB and then transfers 6 GB instead? Are there any mechanisms in place to stop that? I'm using the Rust library reqwest, but answers for other HTTP clients would be great too.

  • I'm not a specialist in this area and I'm not sure, but I remember having the same question, and the answer was that the client ignores the rest of the data. Since HTTP is transferred over TCP, the client can simply stop receiving data. BTW, the connection may currently stay alive, with further HTTP responses arriving on it afterwards, so I'm not sure what exactly happens in those cases. – momvart Jan 04 '21 at 07:27

1 Answer


A common implementation will just take the Content-length and read as much data as specified - leaving the remaining data in the socket buffer (or maybe some user space buffer). Thus it likely works for this specific request.

But this might actually cause trouble with an HTTP persistent connection. For a request with a too-short Content-length, the remaining data will be interpreted as another HTTP request on the same connection. For a response with the same problem, the remaining data will be interpreted as the response to the next request on the connection. In the best case this will be treated as an error due to malformed data and the request will be abandoned. In the worst case it might lead to a security issue, though; see also HTTP request and response splitting as a related attack.
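To make the persistent-connection problem concrete, here is a deliberately simplified sketch of a client that frames responses purely by Content-Length. The function name `take_response` and the sample byte strings in the usage note are made up for illustration; a real client also has to handle chunked encoding, status codes without bodies, and partial reads, none of which is modelled here.

```rust
/// Splits one response off the front of `stream`, trusting only the
/// declared Content-Length. Returns (body, bytes left on the connection).
/// Returns None if the data does not look like a complete response.
fn take_response(stream: &[u8]) -> Option<(&[u8], &[u8])> {
    // The header block ends at the first empty line.
    let header_end = stream.windows(4).position(|w| w == b"\r\n\r\n")? + 4;
    let head = std::str::from_utf8(&stream[..header_end]).ok()?;

    // Find the declared body length (header names are case-insensitive).
    let len = head.lines().find_map(|line| {
        let (name, value) = line.split_once(':')?;
        if name.eq_ignore_ascii_case("content-length") {
            value.trim().parse::<usize>().ok()
        } else {
            None
        }
    })?;

    let body_end = header_end.checked_add(len)?;
    if body_end > stream.len() {
        return None; // fewer bytes available than declared
    }
    // Everything after the declared length stays "on the wire" and will be
    // parsed as the start of the next response.
    Some((&stream[header_end..body_end], &stream[body_end..]))
}
```

If a server declares `Content-Length: 5` but sends `helloEXTRA`, the first call returns `hello` and leaves `EXTRA` behind; a second call on that remainder fails because it is not a valid response (the "best case" above). If the excess bytes are themselves shaped like a complete response, the second call happily returns the attacker-chosen body as the answer to the next request (the "worst case").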

... which is why the Content-Length MUST exist for the URL

Please note that Content-length is not actually required in the request or response. The message header might give no indication of the ultimate size of the response, since the response might use Transfer-Encoding: chunked or simply end when the TCP connection is closed.
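Because the header may be absent or wrong, a size limit is more robust when it is enforced on the bytes actually read rather than on the declared length. The following is a minimal sketch using only the Rust standard library; `copy_capped` is a hypothetical helper name, and it works with any type implementing `std::io::Read`, so it makes no claim about how reqwest frames responses internally.

```rust
use std::io::{self, Read, Write};

/// Copies `body` into `out`, but never reads more than `limit + 1` bytes.
/// Returns Ok(Some(n)) if the body ended after n <= limit bytes, and
/// Ok(None) if the body was larger than `limit`. In the None case `out`
/// has already received `limit + 1` bytes, so the caller should discard
/// it and drop the connection.
pub fn copy_capped<R: Read, W: Write>(
    body: R,
    out: &mut W,
    limit: u64,
) -> io::Result<Option<u64>> {
    // Reading one byte past the limit distinguishes "exactly at the
    // limit" from "over the limit" without consuming the whole body.
    let mut limited = body.take(limit.saturating_add(1));
    let copied = io::copy(&mut limited, out)?;
    Ok(if copied > limit { None } else { Some(copied) })
}
```

With a limit of 4 GB, a server that claims 2 GB and sends 6 GB is cut off shortly after the 4 GB mark, and a response without any Content-Length is handled the same way. Checking the header up front then becomes an early rejection for honest servers rather than the actual safeguard. Any blocking response type that implements `Read` could be passed as `body`; for an async client the same idea applies by counting bytes chunk by chunk.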

Steffen Ullrich
  • My server expects the Content-Length header to be there, and if it isn't, it displays the error "file size couldn't be determined". That's okay, right? – Krey Lazory Jan 04 '21 at 09:50
  • @KreyLazory: It depends on how much you know about or can control what the client is doing. With a local file upload, browsers will currently send a `Content-length` header. For other kinds of clients or for other use cases, who knows. I've seen mobile apps using Transfer-Encoding: chunked and no Content-length. – Steffen Ullrich Jan 04 '21 at 09:55
  • @SteffenUllrich If I understand the OP correctly this is not really a client issue, their server is going to be sending requests to other servers, and it is the responses from those other servers that are to be required to have `content-length`. But I guess the issue of chunked encoding remains the same, possibly even more so. – Michał Politowski Jan 04 '21 at 13:32
  • @KreyLazory I do not actually know reqwest, but would expect it to be able to give you the response as a stream that you just can stop reading whenever you like. Then any upfront checking of `content-length` if present would only be an optimization for servers that send it. – Michał Politowski Jan 04 '21 at 13:36
  • @MichałPolitowski: You are right. I first understood the question as being about the response, then about the request, but it does seem to be about the response. Anyway, the answer now covers both cases, since the impact is actually similar. And with a response from a third-party server, one likely has even less control over whether the response will contain a content-length header. – Steffen Ullrich Jan 04 '21 at 13:52
  • @MichałPolitowski Yes it indeed does just that. One thing that my server does is upload the video to YouTube, and the YouTube API requires me to know the size of the file beforehand (at least, that is my interpretation of it). I can't know that without downloading the entire file which I'd prefer not to do because of disk space issues. – Krey Lazory Jan 05 '21 at 12:15