
We have a system where clients open a bi-directional gRPC stream to an ALB, which proxies to one of the active servers:

           bi-di
client <-----------> ALB <--------> server

If the connection fails for any reason, the client reconnects, because we want to keep a bi-di channel open and active.

The question is: how can we keep the channel alive when there is no activity for some time? The ALB is configured with a 300-second idle timeout, which means it drops the connection if no packets are exchanged for 300 seconds.

I read on the gRPC blog at https://grpc.io/blog/grpc-on-http2/#keeping-connections-alive that we should use keepalive settings on both sides, so I tried the configuration below.

The bi-di client channel is configured with: keepAliveWithoutCalls(true).keepAliveTime(90, TimeUnit.SECONDS).keepAliveTimeout(10, TimeUnit.SECONDS)

The server is configured with: permitKeepAliveWithoutCalls(true).permitKeepAliveTime(1, TimeUnit.MINUTES)

But I received "INTERNAL: HTTP/2 error code: PROTOCOL_ERROR Received Rst Stream" after exactly 5 minutes, which suggests that the ALB dropped the connection once its 5-minute idle timeout expired.

Any idea how we can keep an idle connection alive?

Rajat Goyal
  • Is there a maxConnectionAge setting configured for 5 minutes on the AWS side? – user675693 Jan 26 '22 at 23:29
  • What is the ALB's idle timeout attribute set to? – zangw Mar 17 '22 at 04:36
  • The default value of minTime in EnforcementPolicy is 5 minutes. Maybe it is related. – zangw Mar 21 '22 at 12:22
  • Hi, did you find a solution to the problem? I have the exact same issue... – Asen Valchev Mar 22 '22 at 15:37
  • No, but I kept it alive by sending a health RPC at a regular interval. This way the channel stays alive even if there is no actual activity on it. – Rajat Goyal Mar 25 '22 at 10:03
  • @AsenValchev I also observed that the solution I mentioned in the comment above works only when there is no response from the server. If the server sends a response, the connection gets dropped by the LB after 5 minutes. How did you solve the problem? – Rajat Goyal May 16 '22 at 16:17

1 Answer


We can't use raw HTTP/2 PING frames because the ALB doesn't support them; see "HTTP2 PING frames over AWS ALB (gRPC keepalive ping)".

I fixed this with a small implementation on both the client and the server side:

a) The client sends a dummy request to the server every 1 minute. This is an actual request defined in the protobuf schema, marked with a type such as "dummy request".

b) On receiving each such request, the server replies with a dummy response, which the client ignores based on its type, such as "dummy response".

This completes the cycle: the LB sees activity on the HTTP connection and doesn't drop it.
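
The two steps above can be sketched roughly as follows. This is only an illustration under stated assumptions, not the code actually used: Message, HeartbeatClient and HeartbeatServer are hypothetical names, Message stands in for the proto-generated request and response types, and the Consumer<Message> parameters stand in for the onNext methods of the gRPC stream observers.

```java
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.function.Consumer;

// Hypothetical stand-in for the proto-generated message with a type field.
final class Message {
    enum Type { HEARTBEAT, DATA }
    final Type type;
    final String payload;
    Message(Type type, String payload) {
        this.type = type;
        this.payload = payload;
    }
}

// Client side: sends a dummy request periodically and ignores dummy responses.
final class HeartbeatClient {
    private final Consumer<Message> outbound;   // assumption: wraps requestObserver::onNext
    private final Consumer<Message> appHandler; // receives only real responses
    private ScheduledExecutorService scheduler; // created lazily in start()

    HeartbeatClient(Consumer<Message> outbound, Consumer<Message> appHandler) {
        this.outbound = outbound;
        this.appHandler = appHandler;
    }

    // All writes go through this method, so the scheduler thread and the
    // application thread never write to the stream at the same time.
    synchronized void send(Message m) {
        outbound.accept(m);
    }

    void sendHeartbeat() {
        send(new Message(Message.Type.HEARTBEAT, ""));
    }

    // Step (a): send a dummy request at a fixed interval.
    synchronized void start(long period, TimeUnit unit) {
        scheduler = Executors.newSingleThreadScheduledExecutor();
        scheduler.scheduleAtFixedRate(this::sendHeartbeat, period, period, unit);
    }

    synchronized void stop() {
        if (scheduler != null) {
            scheduler.shutdownNow();
        }
    }

    // Called for every message received from the server.
    void onResponse(Message m) {
        if (m.type == Message.Type.HEARTBEAT) {
            return; // dummy response: drop it silently
        }
        appHandler.accept(m);
    }
}

// Server side, step (b): answer every dummy request with a dummy response.
final class HeartbeatServer {
    static void onRequest(Message request, Consumer<Message> reply) {
        if (request.type == Message.Type.HEARTBEAT) {
            reply.accept(new Message(Message.Type.HEARTBEAT, ""));
            return;
        }
        // Placeholder for real request handling: echo the payload back.
        reply.accept(new Message(Message.Type.DATA, "echo:" + request.payload));
    }
}
```

With the 1-minute interval from the answer, the client would call start(1, TimeUnit.MINUTES) once the stream is open and stop() when it closes. The interval only needs to stay comfortably below the ALB's 300-second idle timeout.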

Rajat Goyal