I'm having trouble understanding how HTTP works when multiple requests are sent in parallel (i.e. before a response to the earlier ones has arrived). There are two cases:
1) With Connection: Keep-Alive.
According to HTTP spec:
A client that supports persistent connections MAY "pipeline" its requests (i.e., send multiple requests without waiting for each response). A server MUST send its responses to those requests in the same order that the requests were received.
That approach seems quite difficult to implement and maintain. The server has to keep track of the order of requests and has to respond in that same order. Not only might this be hard to implement, but there is also a performance hit: a fast request that was issued after a slow one has to wait until the slow one has been processed and answered.
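To make the performance concern concrete, here is a minimal sketch of the ordering rule. It assumes, purely for illustration, that the server starts processing all pipelined requests at time 0 and in parallel; the function name `pipelined_send_times` and the durations are made up, not taken from any real server.

```python
def pipelined_send_times(durations):
    """Return the earliest time each response can be sent.

    Assumption: every request starts processing at t=0 in parallel,
    but response i may not be sent before response i-1 (the in-order rule).
    """
    send_times = []
    latest = 0.0
    for duration in durations:
        # A response is ready at `duration`, but it is held back until
        # every earlier response has gone out.
        latest = max(latest, duration)
        send_times.append(latest)
    return send_times


# Request 1 is slow (5s); requests 2 and 3 are fast but were sent later.
print(pipelined_send_times([5.0, 1.0, 2.0]))  # [5.0, 5.0, 5.0]
```

Under these assumptions the two fast responses are ready after 1s and 2s, yet neither can be sent before the 5s mark.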
Also, if a load balancer is involved, the proxy has to keep track of which request was sent to which server, so that when the responses come back it can queue them and reply in order. So why wasn't it designed differently in the first place? It sounds more natural and easier for the client to add (for example) an ID header to each request, and for the server to process the request and respond with the same ID header, so that the client can match each response to its request. That is a lot easier to implement and it does not introduce problems with queueing requests (it is up to the client to track the order of requests if that is necessary).
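A rough sketch of what I have in mind on the client side is below. This is not part of HTTP/1.1; the header name "ID", the class `IdMatchingClient` and the dict-based message format are all hypothetical, chosen only to illustrate the idea.

```python
import itertools


class IdMatchingClient:
    """Hypothetical client that matches responses to requests by an ID header."""

    def __init__(self):
        self._next_id = itertools.count(1)
        self._pending = {}  # request ID -> original request

    def send(self, request):
        # Tag the outgoing request with a fresh ID and remember it.
        request_id = next(self._next_id)
        self._pending[request_id] = request
        return {"ID": str(request_id), "body": request}

    def receive(self, response):
        # Responses may arrive in any order; the echoed ID identifies the request.
        request_id = int(response["ID"])
        request = self._pending.pop(request_id)
        return request, response["body"]


client = IdMatchingClient()
slow = client.send("GET /slow")
fast = client.send("GET /fast")

# The fast response arrives first, and the client still pairs it correctly.
print(client.receive({"ID": fast["ID"], "body": "fast result"}))
print(client.receive({"ID": slow["ID"], "body": "slow result"}))
```

With this scheme the server would not need to queue anything: it answers whenever a request is done, and the client does the bookkeeping.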
So the question is: why was pipelining specified this way?
2) Without Connection: Keep-Alive.
I couldn't find any info about that case. Let's say that a client issues two requests, A and B. Without keep-alive, the server will close the connection after processing the first request. This obviously introduces a race condition. So how should the server behave? Should it discard the second request?