Section 6.9 of RFC 7540 describes the mechanism for HTTP/2 flow control. There is a flow control window for each connection, and a separate flow control window for each stream on that connection. The RFC provides a way for the receiver to set the initial flow control window for new streams:
Both endpoints can adjust the initial window size for new streams by including a value for SETTINGS_INITIAL_WINDOW_SIZE in the SETTINGS frame that forms part of the connection preface.
And a way for the receiver to increase the connection and stream flow control windows:
The payload of a WINDOW_UPDATE frame is one reserved bit plus an unsigned 31-bit integer indicating the number of octets that the sender can transmit in addition to the existing flow-control window. The legal range for the increment to the flow-control window is 1 to 2^31-1 (2,147,483,647) octets. [...]
A sender that receives a WINDOW_UPDATE frame updates the corresponding window by the amount specified in the frame.
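To make the wire format concrete, here is a minimal sketch of encoding and decoding a WINDOW_UPDATE frame as I read RFC 7540 (9-octet frame header, frame type 0x8, 4-octet payload). The function names `encode_window_update` and `decode_window_update` are my own for illustration and don't come from any library. The point is that the increment field is unsigned and must be at least 1, so there is no way to express a decrement.

```python
import struct

WINDOW_UPDATE = 0x8          # frame type from RFC 7540 section 6.9
MAX_31_BIT = 2**31 - 1       # largest legal increment and stream id


def encode_window_update(stream_id: int, increment: int) -> bytes:
    """Build a full WINDOW_UPDATE frame (hypothetical helper).

    stream_id 0 targets the connection window; any other id targets a stream.
    """
    if not 1 <= increment <= MAX_31_BIT:
        raise ValueError("increment must be in 1..2^31-1; no decrement is expressible")
    if not 0 <= stream_id <= MAX_31_BIT:
        raise ValueError("stream id must fit in 31 bits")
    length = struct.pack(">I", 4)[1:]                      # 24-bit payload length
    rest = struct.pack(">BBI", WINDOW_UPDATE, 0, stream_id)  # type, flags, R + stream id
    payload = struct.pack(">I", increment)                 # R bit (0) + 31-bit increment
    return length + rest + payload


def decode_window_update(frame: bytes) -> tuple[int, int]:
    """Return (stream_id, increment) from a full WINDOW_UPDATE frame."""
    if len(frame) != 13:
        raise ValueError("WINDOW_UPDATE frame must be 9 + 4 octets")
    length = int.from_bytes(frame[0:3], "big")
    frame_type, _flags, stream_id = struct.unpack(">BBI", frame[3:9])
    if frame_type != WINDOW_UPDATE or length != 4:
        raise ValueError("not a WINDOW_UPDATE frame")
    (raw,) = struct.unpack(">I", frame[9:13])
    # Mask off the reserved high bit in both fields.
    return stream_id & MAX_31_BIT, raw & MAX_31_BIT
```

For example, `encode_window_update(0, 4096)` would grow the connection window by 4096 octets, while `encode_window_update(5, -4096)` is rejected because the field cannot carry a negative value.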
And a way for the receiver to increment or decrement the flow control windows for all streams (but not the connection) at once:
When the value of SETTINGS_INITIAL_WINDOW_SIZE changes, a receiver MUST adjust the size of all stream flow-control windows that it maintains by the difference between the new value and the old value.
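Here is a rough sketch of the bookkeeping I understand the sending side to do for stream windows, to show how a SETTINGS_INITIAL_WINDOW_SIZE change hits every open stream at once and can push a window negative. The class `SenderStreamWindows` and its method names are hypothetical, and the sketch leaves out the connection-level window and the 2^31-1 overflow check.

```python
DEFAULT_INITIAL_WINDOW = 65_535  # default SETTINGS_INITIAL_WINDOW_SIZE in RFC 7540


class SenderStreamWindows:
    """Hypothetical sender-side view of per-stream flow control windows."""

    def __init__(self) -> None:
        self.initial_window = DEFAULT_INITIAL_WINDOW
        self.windows: dict[int, int] = {}

    def open_stream(self, stream_id: int) -> None:
        # New streams start at the peer's current initial window size.
        self.windows[stream_id] = self.initial_window

    def on_data_sent(self, stream_id: int, octets: int) -> None:
        if octets > self.windows[stream_id]:
            raise ValueError("would exceed the stream's flow control window")
        self.windows[stream_id] -= octets

    def on_window_update(self, stream_id: int, increment: int) -> None:
        # WINDOW_UPDATE can only ever grow a window.
        self.windows[stream_id] += increment

    def on_initial_window_size(self, new_value: int) -> None:
        # Every open stream shifts by the same delta; the result may be negative.
        delta = new_value - self.initial_window
        self.initial_window = new_value
        for stream_id in self.windows:
            self.windows[stream_id] += delta


w = SenderStreamWindows()
w.open_stream(1)
w.open_stream(3)
w.on_data_sent(1, 60_000)         # stream 1 has 5,535 octets of window left
w.on_initial_window_size(1_000)   # delta of -64,535 applied to streams 1 and 3
w.open_stream(5)                  # new streams now start at 1,000
```

After this sequence stream 1 sits at -59,000, stream 3 at 1,000, and the new stream 5 also at 1,000. The decrement reached every stream, which is exactly the coupling I'd like to avoid.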
But as far as I can tell, there is no way for the receiver to decrement a single stream's flow control window without changing the initial window size.
Is that correct? If so, why not? This seems like a reasonable thing to want to do if you are multiplexing many long-lived streams over a single connection. You may have some BDP-controlled memory budget for the overall connection, carved up across the streams, and are tuning the proportion that each stream gets according to its recent bandwidth demand. If one of them temporarily goes idle you'd like to be able to reset its window to be small so that it doesn't strand the memory budget, without affecting the other streams, and without making it impossible to receive new streams.
(Of course I understand that there is a race, and the sender may have sent data before receiving the decrement. But the window is already allowed to go negative due to the SETTINGS_INITIAL_WINDOW_SIZE mechanism above, so it seems like it would be reasonable to allow for a negative window here too.)
Is it really not possible to do this without depending on forward progress from the sender to eat up the stranded bytes in the flow control window?
Here's more detail on why I'm interested in the question, because I'm conscious of the XY problem.
I'm thinking about how to solve an RPC flow control issue. I have a server with a limited memory budget, and incoming streams with different priorities for how much of that memory they should be allowed to consume. I want to implement something like weighted max-min fairness across them, adjusting their flow control windows so that they sum to no more than my memory budget, but when we're not memory constrained we get maximum throughput.
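For concreteness, this is roughly the allocation I have in mind, written as a simple water-filling loop. It is only a sketch: the function name `weighted_max_min`, the per-stream `demands` (how many bytes of window each stream could usefully consume), and the `weights` are all assumptions for illustration, not part of any real implementation.

```python
def weighted_max_min(budget: float,
                     demands: dict[str, float],
                     weights: dict[str, float]) -> dict[str, float]:
    """Split `budget` across streams by weighted max-min fairness.

    Streams whose demand is below their weighted share get exactly their
    demand; the leftover is redistributed among the rest by weight.
    Weights are assumed to be positive.
    """
    alloc = {stream: 0.0 for stream in demands}
    active = {stream for stream, demand in demands.items() if demand > 0}
    remaining = float(budget)

    while active and remaining > 0:
        total_weight = sum(weights[s] for s in active)
        share = remaining / total_weight  # budget per unit of weight this round
        satisfied = {s for s in active if demands[s] <= share * weights[s]}
        if not satisfied:
            # Everyone left is constrained: hand out the weighted shares and stop.
            for s in active:
                alloc[s] = share * weights[s]
            break
        for s in satisfied:
            alloc[s] = demands[s]
            remaining -= demands[s]
        active -= satisfied

    return alloc


windows = weighted_max_min(
    budget=100,
    demands={"A": 10, "B": 100, "C": 100},
    weights={"A": 1, "B": 1, "C": 2},
)
```

With a budget of 100, stream A gets its full demand of 10 and the remaining 90 is split 30/60 between B and C according to their weights. With a budget larger than the total demand, every stream simply gets what it asked for, which is the "maximum throughput when not memory constrained" case.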
For efficiency reasons, it would be desirable to multiplex streams of different priorities on a single connection. But then as demands change, or as other connections show up, we need to be able to adjust stream flow control windows downward so they still sum to no more than the budget. When stream B shows up or receives a higher priority but stream A is sitting on a bunch of flow control budget, we need to reduce A's window and increase B's.
Even without the multiplexing, the same problem applies at the connection level: as far as I can tell, there is no way to adjust the connection flow control window downward without changing the initial window size. Of course it will be adjusted downward as the client sends data, but I don't want to need to depend on forward progress from the client for this, since that may take arbitrarily long.
It's possible there is a better way to achieve this!