
Section 6.9 of RFC 7540 describes the mechanism for HTTP/2 flow control. There is a flow control window for the connection as a whole, and a separate flow control window for each stream on that connection. The RFC provides a way for the receiver to set the initial flow control window for new streams:

Both endpoints can adjust the initial window size for new streams by including a value for SETTINGS_INITIAL_WINDOW_SIZE in the SETTINGS frame that forms part of the connection preface.

And a way for the receiver to increase the connection and stream flow control windows:

The payload of a WINDOW_UPDATE frame is one reserved bit plus an unsigned 31-bit integer indicating the number of octets that the sender can transmit in addition to the existing flow-control window. The legal range for the increment to the flow-control window is 1 to 2^31-1 (2,147,483,647) octets.

[...]

A sender that receives a WINDOW_UPDATE frame updates the corresponding window by the amount specified in the frame.
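
(For concreteness, here is a minimal sketch of how such a frame is laid out on the wire, following the frame header format in section 4.1. The `buildWindowUpdate` helper is my own illustration, not part of any library.)

```go
package main

import (
	"encoding/binary"
	"fmt"
)

// buildWindowUpdate encodes a WINDOW_UPDATE frame per RFC 7540 sections 4.1
// and 6.9: a 9-octet frame header followed by a 4-octet payload containing
// one reserved bit and a 31-bit window-size increment. The helper name is
// mine; it is not part of any library.
func buildWindowUpdate(streamID, increment uint32) []byte {
	buf := make([]byte, 9+4)
	buf[0], buf[1], buf[2] = 0x00, 0x00, 0x04 // 24-bit length = 4
	buf[3] = 0x08                             // type = WINDOW_UPDATE
	buf[4] = 0x00                             // no flags defined
	binary.BigEndian.PutUint32(buf[5:9], streamID&0x7fffffff)   // R bit + stream ID
	binary.BigEndian.PutUint32(buf[9:13], increment&0x7fffffff) // R bit + increment (1..2^31-1)
	return buf
}

func main() {
	// Grant the sender 65,535 more octets on stream 3; a stream ID of 0 would
	// address the connection-level window instead.
	fmt.Printf("% x\n", buildWindowUpdate(3, 65_535))
}
```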

And a way for the receiver to increment or decrement the flow control windows for all streams (but not the connection) at once:

When the value of SETTINGS_INITIAL_WINDOW_SIZE changes, a receiver MUST adjust the size of all stream flow-control windows that it maintains by the difference between the new value and the old value.
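
Concretely, that adjustment is just a signed delta applied to every stream window, which is also how a window can legitimately end up negative. A sketch of the bookkeeping (the function and names are mine, purely illustrative):

```go
package main

import "fmt"

// applyInitialWindowChange sketches the adjustment in RFC 7540 section 6.9.2:
// when SETTINGS_INITIAL_WINDOW_SIZE changes, every stream flow-control window
// is shifted by (new - old), and the result is allowed to be negative. The
// names and types here are illustrative, not taken from any library.
func applyInitialWindowChange(streamWindows map[uint32]int64, oldInitial, newInitial uint32) {
	delta := int64(newInitial) - int64(oldInitial)
	for id := range streamWindows {
		streamWindows[id] += delta // may drop below zero
	}
}

func main() {
	// Two streams with 40,000 and 5,000 octets of window left; the peer then
	// lowers SETTINGS_INITIAL_WINDOW_SIZE from 65,535 to 16,384.
	windows := map[uint32]int64{1: 40_000, 3: 5_000}
	applyInitialWindowChange(windows, 65_535, 16_384)
	fmt.Println(windows) // map[1:-9151 3:-44151]: both windows are now negative
}
```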

But as far as I can tell, there is no way for the receiver to decrement a single stream's flow control window without changing the initial window size.

Is that correct? If so, why not? This seems like a reasonable thing to want to do if you are multiplexing many long-lived streams over a single connection. You may have some BDP-controlled memory budget for the overall connection, carved up across the streams, and are tuning the proportion that each stream gets according to its recent bandwidth demand. If one of them temporarily goes idle you'd like to be able to reset its window to be small so that it doesn't strand the memory budget, without affecting the other streams, and without making it impossible to receive new streams.

(Of course I understand that there is a race, and the sender may have sent data before receiving the decrement. But the window is already allowed to go negative due to the SETTINGS_INITIAL_WINDOW_SIZE mechanism above, so it seems like it would be reasonable to allow for a negative window here too.)

Is it really not possible to do this without depending on forward progress from the sender to eat up the stranded bytes in the flow control window?


Here's more detail on why I'm interested in the question, because I'm conscious of the XY problem.

I'm thinking about how to solve an RPC flow control issue. I have a server with a limited memory budget, and incoming streams with different priorities for how much of that memory they should be allowed to consume. I want to implement something like weighted max-min fairness across them, adjusting their flow control windows so that they sum to no more than my memory budget, but when we're not memory constrained we get maximum throughput.

For efficiency reasons, it would be desirable to multiplex streams of different priorities on a single connection. But then as demands change, or as other connections show up, we need to be able to adjust stream flow control windows downward so they still sum to no more than the budget. When stream B shows up or receives a higher priority but stream A is sitting on a bunch of flow control budget, we need to reduce A's window and increase B's.
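
To make that concrete, here is roughly the kind of allocation I have in mind: a weighted max-min fairness sketch over a fixed budget (all names and numbers are made up for illustration):

```go
package main

import "fmt"

// stream is a hypothetical bookkeeping record: how much window a stream could
// usefully consume right now (its recent demand) and its priority weight.
type stream struct {
	name   string
	demand int64
	weight float64
}

// allocate sketches weighted max-min fairness over a fixed memory budget:
// unsaturated streams repeatedly split the remaining budget in proportion to
// their weights, and any stream whose share exceeds its demand is capped at
// that demand, freeing the surplus for the others.
func allocate(budget int64, streams []stream) map[string]int64 {
	out := make(map[string]int64, len(streams))
	active := append([]stream(nil), streams...)
	remaining := budget
	for len(active) > 0 && remaining > 0 {
		var totalWeight float64
		for _, s := range active {
			totalWeight += s.weight
		}
		roundBudget := remaining
		saturated := false
		var next []stream
		for _, s := range active {
			share := int64(float64(roundBudget) * s.weight / totalWeight)
			if share >= s.demand {
				// Demand-limited: grant exactly the demand; the surplus stays
				// in `remaining` and is redistributed in the next round.
				out[s.name] = s.demand
				remaining -= s.demand
				saturated = true
			} else {
				next = append(next, s)
			}
		}
		if !saturated {
			// Nobody is demand-limited: split what's left by weight and stop.
			for _, s := range next {
				out[s.name] = int64(float64(remaining) * s.weight / totalWeight)
			}
			break
		}
		active = next
	}
	return out
}

func main() {
	streams := []stream{
		{name: "movie", demand: 20_000, weight: 1},
		{name: "download", demand: 500_000, weight: 3},
		{name: "control", demand: 4_000, weight: 1},
	}
	// movie and control are demand-limited; download gets the remaining
	// 76,000 octets of the 100,000-octet budget.
	fmt.Println(allocate(100_000, streams))
}
```

The trouble is applying the result: I can raise a stream's window with WINDOW_UPDATE, but when the allocator decides a stream should get less than it currently has, I apparently have no frame to send.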

Even without the multiplexing, the same problem applies at the connection level: as far as I can tell, there is no way to adjust the connection flow control window downward without changing the initial window size. Of course it will be adjusted downward as the client sends data, but I don't want to need to depend on forward progress from the client for this, since that may take arbitrarily long.

It's possible there is a better way to achieve this!

jacobsa

1 Answer


A server that has N streams, some of which are idle and some of which are actively downloading data to the client, will typically re-allocate the connection window to the active streams.

For example, say you are watching a movie and downloading a big file from the same server at the same time.

The connection window is 100, and each stream has a window of 100 too (with many streams the amount that can actually be sent is capped by the connection window, but a single stream's window can be as large as the connection's).

Now, when you watch and download at the same time, each stream gets 50.

If you pause the movie, and the server knows about it (i.e. it does not exhaust the movie stream's window), then the server only has to serve one stream: with a connection window of 100 and a single active stream (the download) that also has a window of 100, the whole connection window is effectively reallocated to the active stream.

You only get into problems if the client doesn't tell the server that the movie has been paused. In that case, the server keeps sending movie data until the movie stream's window is exhausted (or nearly exhausted), and the client does not acknowledge that data because playback is paused. At that point the server notices that data on that stream is not being acknowledged and stops sending to it, but part of the connection window has already been consumed, which reduces the window available to the active download stream.
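
To make the accounting concrete, here is a toy model of the sender-side bookkeeping in that scenario (the types and numbers are illustrative, not taken from any real implementation):

```go
package main

import "fmt"

// A toy model of sender-side flow-control accounting, using the numbers from
// the example above (the type and field names are illustrative). Sending data
// consumes both the stream window and the shared connection window; only
// WINDOW_UPDATE frames from the receiver give those octets back.
type sender struct {
	connWindow    int64
	streamWindows map[string]int64
}

// sendable is how much the sender may transmit on a stream right now:
// the minimum of that stream's window and the connection window.
func (s *sender) sendable(stream string) int64 {
	if s.streamWindows[stream] < s.connWindow {
		return s.streamWindows[stream]
	}
	return s.connWindow
}

// send records n octets as sent (and not yet acknowledged).
func (s *sender) send(stream string, n int64) {
	s.streamWindows[stream] -= n
	s.connWindow -= n
}

func main() {
	s := &sender{
		connWindow:    100,
		streamWindows: map[string]int64{"movie": 100, "download": 100},
	}
	// The client pauses the movie without telling the server, after the
	// server has already sent 80 octets of movie data that will never be
	// acknowledged.
	s.send("movie", 80)
	// The stalled stream strands 80 octets of the connection window, so the
	// healthy download stream can only be sent 20 more octets.
	fmt.Println(s.sendable("download")) // prints 20
}
```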

From the server's point of view, it has a perfectly good connection where one stream (the download) works wonderfully at max speed, while another stream hiccups, exhausts its window, and slows the other stream down (possibly to a halt), even though they share the same connection!

Obviously it cannot be a connection/communication issue, because one stream (the download one) works perfectly fine at max speed. Therefore it is an application issue.

The HTTP/2 implementation on the server does not know that one of the streams is a movie that can be paused -- it's the application that must communicate this to the server and keep the connection window as large as possible.

Introducing a new HTTP/2 frame to "pause" downloads (or changing the semantics of existing frames to accommodate a "pause" command) would have complicated the protocol quite substantially, for a feature that is entirely application driven: it is the application that decides when to pause, and at that point it can just as well send its own "pause" message to the server without complicating the HTTP/2 specification.
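
For example, the application could define its own tiny control message and send it over its RPC channel or a separate request; something like this invented format (nothing here is part of HTTP/2 or any library):

```go
package main

import (
	"encoding/json"
	"fmt"
)

// pauseMessage is an invented application-level control message; HTTP/2 has
// no "pause" frame, so the application defines its own and sends it on a
// control stream or as part of its RPC protocol.
type pauseMessage struct {
	Action   string `json:"action"`    // "pause" or "resume"
	StreamID uint32 `json:"stream_id"` // which media stream is affected
}

func main() {
	body, _ := json.Marshal(pauseMessage{Action: "pause", StreamID: 5})
	// The client sends this to the server out of band; the server's
	// application code then stops producing movie data, keeping the stream
	// and connection windows open.
	fmt.Println(string(body)) // {"action":"pause","stream_id":5}
}
```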

It is an interesting case where HTTP/1.1 and HTTP/2 behave very differently and require different code to work in a similar way.

With HTTP/1.1 you would have one connection for the movie and one for the download. They would be independent, and the client application would not need to tell the server that the movie was paused: it could simply stop reading from the movie connection (until it became TCP congested) without affecting the download connection, assuming the server is non-blocking to avoid scalability issues.

sbordet
  • Thanks. I understand this example, but it's actually the opposite of my situation. In the streaming/downloading example I'm talking about the case where the client wants to bound its memory usage, and the _server_ has a hiccup with serving the movie. (Maybe the storage it's being fetched from is contended, but the download isn't.) The _client_ is the receiver here, and is the one that needs to be able to re-allocate the bytes in the window for the movie stream to the download; otherwise the download goes at half speed. But it seems this can't happen until the server sends more movie bytes. – jacobsa Jul 08 '21 at 23:45
  • I'm not sure I understand your case. If the server has a hiccup sending the movie bytes, the client would have acknowledged all the movie bytes, so the connection window is fully open (100) and available for the only active stream remaining, the file download stream, which can then be downloaded at max speed. – sbordet Jul 09 '21 at 13:21
  • You're right about the connection window, but I'm talking about per-stream windows. As I mentioned in the question, I have a fixed memory budget for the connection and want to partition it for the various streams. If the movie currently has 75% of the budget and then goes idle, that 75% of the budget is stranded until more bytes are delivered. (Or we have to over-commit the budget by a factor of 1.75.) I've now added more info at the bottom of the question about my use case. – jacobsa Jul 10 '21 at 21:08
  • Sorry it's still not clear to me what your case is. You seem to refer to the client, but in your question edit you say that you have a server memory limit, but also say you don't have it. I frankly don't see your problem -- as long as the client is acknowledging data, it keeps the connection (and stream) window fully open and only active streams will have bytes sent. If you want to change the priorities and weights, you can do it with `PRIORITY` frames, not flow control. In a nutshell I don't see why you insist on wanting to decrement the stream window size? – sbordet Jul 12 '21 at 12:44
  • HTTP2 streams are bidirectional; both the client and the server can be senders and are subject to flow control. See section 5.2.1 of the RFC. In my original question I only referred to "sender" and "receiver" for this reason, but in the update I am giving details about my specific case, where the server is the receiver. In your example the client is the receiver, so the question is inverted. – jacobsa Jul 13 '21 at 20:55
  • Forget streams for a moment and see the second-to-last paragraph in the question: the same question applies at the connection level. If I have a small RAM budget on the receiver and want to divvy it up across connection windows, there seems to be no way for me to "reclaim" the memory I devoted to a connection when that connection goes idle, except to kill the connection. – jacobsa Jul 13 '21 at 20:57
  • From the point of view of the HTTP/2 implementation and flow control handling that we are discussing here, if a connection goes idle it consumes exactly zero memory: there will be no buffers allocated, provided the application has acknowledged the received content. If you receive 100k bytes of a 200k file that you want to save to disk, this is what happens: the implementation allocates a 16k buffer, reads into it, and passes it to the application, which saves it to disk; then the application acknowledges the consumption of the bytes so that the implementation can reuse the buffer. 1/N – sbordet Jul 15 '21 at 07:43
  • The implementation reuses the 16k buffer to read another chunk, and the steps above repeat. At 100k received, saved to disk, and acknowledged, the sender stops sending; the receiver implementation will try to read more but gets 0 bytes, at which point it can deallocate the 16k buffer. At this point the receiver application has 0 bytes allocated (the bytes are saved to disk) and the HTTP/2 implementation also has 0 bytes allocated for that stream. 2/N – sbordet Jul 15 '21 at 07:48
  • Of course you can write a bad application and/or a bad HTTP/2 implementation and complain that memory remains retained, and require extra mechanisms to try to cope with that. However, good applications and good implementations do not require extra flow control features. If you have already implemented priorities in the sender, that should be enough to let you control the relative data transfer proportions among streams, with minimal memory consumption during data transfer and no memory consumption (by the implementation) when idle. 3/3 – sbordet Jul 15 '21 at 07:56