We have a server application that communicates with clients over WebSocket, and we need to upgrade the server binary regularly. We may run multiple server instances. I know I can take the instance I'm upgrading offline, but since WebSocket connections are long-lived TCP connections, that instance still holds existing client connections, and we don't want those connections to be dropped.

I know that I could implement something in our server application to hand the existing connections over to a sub-process. But I'm wondering whether I could use AWS ELB to mitigate this issue instead.

I'm not quite sure how ALB/NLB works exactly, but my thinking is that since the ALB/NLB hides the servers from the clients, it is the ALB/NLB that holds the connections with the clients. So during a server upgrade, could the ALB/NLB migrate the existing connections from the instance being upgraded to the instances that have already finished upgrading?

Say we have two server instances, A and B. The procedure would look like this:

  1. take A out of the ALB/NLB
  2. new connections from clients are forwarded only to B
  3. the ALB/NLB migrates A's existing connections to B
  4. A finishes upgrading and is put back behind the ALB/NLB
  5. new connections from clients are forwarded to A again
  6. repeat the procedure for B

During the whole process, the client is unaware of what happens in the cloud.

Does ALB/NLB support something like this?

cifer

1 Answer

No, this is not possible. The reason lies not in ELB but in TCP itself. TCP is a stateful protocol, and so is the WebSocket protocol on top of it. What ELB does is forward each and every packet to the target group. To do what you want, it would need to understand the higher-level protocols and the state they carry.

Another take: when the ELB's target switches, the ELB would need to handshake a new session. The client's state says the session is already established, but for the new server the connection is brand new. So it would fall to the ELB to perform the handshake with the new server on the client's behalf, and to recreate whatever session state the old server held. I hope you can see where this is going.

If you need rolling updates without breaking client sessions, I think you could try Connection Draining (now called deregistration delay) with a timeout longer than your sessions.
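As a rough sketch of how the deregistration delay could be set from a script: this assumes boto3 and the `elbv2` `modify_target_group_attributes` call with the `deregistration_delay.timeout_seconds` attribute. The function names and the target group ARN are hypothetical. Note that the documented range for this attribute is 0 to 3600 seconds, so sessions longer than an hour would still be cut off.

```python
# Sketch only: helper names are hypothetical, not part of any AWS SDK.
MAX_DELAY_SECONDS = 3600  # documented upper bound for the attribute


def deregistration_delay_attributes(seconds):
    """Build the Attributes payload for modify_target_group_attributes."""
    if not 0 <= seconds <= MAX_DELAY_SECONDS:
        raise ValueError("deregistration delay must be between 0 and 3600 seconds")
    return [{"Key": "deregistration_delay.timeout_seconds", "Value": str(seconds)}]


def set_deregistration_delay(target_group_arn, seconds, client=None):
    """Apply the delay to a target group; a client can be injected for testing."""
    if client is None:
        import boto3  # imported lazily so the payload helper works without boto3
        client = boto3.client("elbv2")
    return client.modify_target_group_attributes(
        TargetGroupArn=target_group_arn,
        Attributes=deregistration_delay_attributes(seconds),
    )
```

With the delay in place, deregistering instance A stops new connections from reaching it, while connections that are already open are left alone until they close or the delay expires.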

Another alternative would be API Gateway, which works the way you want the ELB to work. It holds the WebSocket connection and only invokes the Lambda when a message is sent. The server can also send messages to the client by calling a special URL and sending the data that way.
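A minimal sketch of that pattern, assuming a Python Lambda behind an API Gateway WebSocket API and boto3's `apigatewaymanagementapi` client. The echo behaviour and the function names are made up for illustration. The callback URL is built from the `domainName` and `stage` fields of the event's `requestContext`.

```python
import json


def callback_endpoint(event):
    """Derive the connection-management URL from the WebSocket event."""
    ctx = event["requestContext"]
    return f"https://{ctx['domainName']}/{ctx['stage']}"


def handler(event, context, client=None):
    """Echo the incoming message back over the client's WebSocket connection."""
    ctx = event["requestContext"]
    if client is None:
        import boto3  # imported lazily; a fake client can be injected in tests
        client = boto3.client(
            "apigatewaymanagementapi", endpoint_url=callback_endpoint(event)
        )
    body = event.get("body") or ""
    # API Gateway owns the socket; we only name the connection to write to.
    client.post_to_connection(
        ConnectionId=ctx["connectionId"],
        Data=json.dumps({"echo": body}).encode("utf-8"),
    )
    return {"statusCode": 200}
```

Because the Lambda holds no connection itself, deploying a new version does not touch the open WebSocket connections.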

Augunrik
  • While the ultimate answer arrived at here -- *this isn't supported* -- is correct, some of the details are not precise. Network Load Balancers forward packets with minimal awareness of higher layers, but Application Load Balancers do not simply forward TCP packets. They do understand the higher-layer protocols and establish new TCP connections on the back side. Based on each HTTP request, they forward the request, and (in the case of web sockets) internally tie the payload streams of the TCP sessions together and then sit there forwarding data until the connection closes, no reconnect/retry. – Michael - sqlbot Apr 10 '23 at 10:31
  • True, I focused more on the WebSocket case as this was the question. These do have an HTTP init component, but are then not considered HTTP traffic anymore. – Augunrik Apr 10 '23 at 11:51