I have Django Channels (with Redis) served by Daphne, running behind an Nginx ingress controller that sits behind a load balancer, all set up in Kubernetes. The WebSocket connection is upgraded and everything runs fine... for a few minutes. After 5 to 15 minutes (it varies), my Daphne logs (run with -v 2 for debug output) show:
WARNING dropping connection to peer tcp4:10.2.0.163:43320 with abort=True: WebSocket ping timeout (peer did not respond with pong in time)
10.2.0.163 is the cluster IP address of my Nginx pod. Immediately after, Nginx logs the following:
[error] 39#39: *18644 recv() failed (104: Connection reset by peer) while proxying upgraded connection [... + client real IP]
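Because the drop happens after a varying delay, it may help to extract the peer and the reason from each Daphne warning and check whether every drop involves the Nginx pod. This is a minimal sketch, assuming the log line format quoted above; `parse_drop` and the regex group names are illustrative labels, not part of Daphne.

```python
import re

# Pattern for the Daphne warning quoted above; the group names are illustrative labels.
DAPHNE_DROP = re.compile(
    r"dropping connection to peer (?P<proto>\w+):(?P<ip>[\d.]+):(?P<port>\d+) "
    r"with abort=(?P<abort>True|False): (?P<reason>.+)$"
)


def parse_drop(line):
    """Return peer IP, port, abort flag and reason, or None if the line does not match."""
    match = DAPHNE_DROP.search(line)
    if match is None:
        return None
    return {
        "ip": match["ip"],
        "port": int(match["port"]),
        "abort": match["abort"] == "True",
        "reason": match["reason"],
    }


line = (
    "WARNING dropping connection to peer tcp4:10.2.0.163:43320 with abort=True: "
    "WebSocket ping timeout (peer did not respond with pong in time)"
)
info = parse_drop(line)
print(info["ip"], info["port"], info["abort"])  # 10.2.0.163 43320 True
```

Running this over the whole Daphne log and grouping by `ip` would show whether the unanswered pings always come from the ingress pod.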
After these two log entries, the WebSocket connection behaves strangely: the client can still send messages to the backend, but the same connection in Django Channels no longer receives group messages, as if the channel had been removed from the group. I know my code works, since everything runs smoothly until the error is logged, so I'm guessing a configuration error somewhere causes the problem. Sadly, I'm all out of ideas. Here is my Nginx ingress:
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  annotations:
    kubernetes.io/ingress.class: "nginx"
    cert-manager.io/cluster-issuer: "letsencrypt-prod"
    acme.cert-manager.io/http01-edit-in-place: "true"
    nginx.ingress.kubernetes.io/proxy-read-timeout: "3600"
    nginx.ingress.kubernetes.io/proxy-send-timeout: "3600"
    nginx.org/websocket-services: "daphne-svc"
  name: ingress
  namespace: default
spec:
  tls:
  - hosts:
    - mydomain
    secretName: letsencrypt-secret
  rules:
  - host: mydomain
    http:
      paths:
      - path: /
        backend:
          service:
            name: uwsgi-svc
            port:
              number: 80
        pathType: Prefix
      - path: /ws
        backend:
          service:
            name: daphne-svc
            port:
              number: 80
        pathType: Prefix
Configured according to this and this. I installed the ingress controller with Helm:
helm repo add ingress-nginx https://kubernetes.github.io/ingress-nginx
helm repo update
helm install ngingress ingress-nginx/ingress-nginx
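A quick sanity check on the timeout annotations in the ingress above: an idle timeout of 3600 seconds cannot by itself close a connection that has only lived 5 to 15 minutes. That is consistent with the drop being initiated by Daphne's ping timeout rather than by Nginx's read or send timer. The sketch below only does that arithmetic; the function name is hypothetical.

```python
def proxy_timeout_could_explain(drop_after_seconds, proxy_read_timeout):
    """True only if the connection lived at least as long as the proxy idle timeout."""
    return drop_after_seconds >= proxy_read_timeout


# Drops were observed after 5 to 15 minutes; the annotation sets 3600 seconds.
observed = [5 * 60, 15 * 60]
results = [proxy_timeout_could_explain(seconds, 3600) for seconds in observed]
print(results)  # [False, False]
```

If the idle timer is ruled out this way, the remaining knobs on the Daphne side are, as far as I know, its `--ping-interval` and `--ping-timeout` options.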
Here is my Django Channels consumer:
import json

from channels.generic.websocket import AsyncWebsocketConsumer

# register_active_device, unregister_active_device, get_other_chat_members and
# ChatNotificationHandler are project helpers defined elsewhere (not shown).


class ChatConsumer(AsyncWebsocketConsumer):
    async def connect(self):
        user = self.scope['user']
        if user.is_authenticated:
            self.inbox_group_name = "inbox-%s" % user.id
            device = self.scope.get('device', None)
            added = False
            if device:
                added = await register_active_device(user, device)
            if added:
                # Join inbox group
                await self.channel_layer.group_add(
                    self.inbox_group_name,
                    self.channel_name
                )
                await self.accept()
            else:
                await self.close()
        else:
            await self.close()

    async def disconnect(self, close_code):
        user = self.scope['user']
        device = self.scope.get('device', None)
        if device:
            await unregister_active_device(user, device)
        # Leave inbox group
        if hasattr(self, 'inbox_group_name'):
            await self.channel_layer.group_discard(
                self.inbox_group_name,
                self.channel_name
            )

    async def group_message(self, event):
        """Receive message from the inbox group; forward it to the client."""
        message = event['message']
        # Send message to WebSocket
        await self.send(text_data=json.dumps(message))

    async def forward_message_to_other_members(self, chat, message, notification_fallback=False):
        user = self.scope['user']
        other_members = await get_other_chat_members(chat, user)
        for member in other_members:
            if member.active_devices_count > 0:
                # Send the message to the user's inbox group; each consumer
                # handles it with the group_message method.
                await self.channel_layer.group_send(
                    member.inbox.group_name,
                    {
                        'type': 'group_message',
                        'message': message
                    }
                )
            else:
                # No connection for this user; send a notification instead.
                if notification_fallback:
                    await ChatNotificationHandler().send_chat_notification(chat, message, recipient=member, author=user)
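To make the symptom described above concrete, here is a toy model of the group semantics the consumer relies on. It is a sketch only: `ToyChannelLayer` is a hypothetical in-memory stand-in, not the channels_redis implementation, and it rests on the unconfirmed assumption that `disconnect()` (and therefore `group_discard`) runs on the server once Daphne aborts the connection.

```python
import asyncio
from collections import defaultdict


class ToyChannelLayer:
    """Hypothetical in-memory stand-in for a channel layer (not channels_redis)."""

    def __init__(self):
        self.groups = defaultdict(set)      # group name -> channel names
        self.mailboxes = defaultdict(list)  # channel name -> delivered events

    async def group_add(self, group, channel):
        self.groups[group].add(channel)

    async def group_discard(self, group, channel):
        self.groups[group].discard(channel)

    async def group_send(self, group, event):
        # Only channels that are currently members of the group receive the event.
        for channel in self.groups[group]:
            self.mailboxes[channel].append(event)


async def demo():
    layer = ToyChannelLayer()
    await layer.group_add("inbox-1", "chan.A")
    await layer.group_send("inbox-1", {"type": "group_message", "message": "first"})
    # Assumed effect of disconnect() after the server-side abort.
    await layer.group_discard("inbox-1", "chan.A")
    await layer.group_send("inbox-1", {"type": "group_message", "message": "second"})
    return [event["message"] for event in layer.mailboxes["chan.A"]]


delivered = asyncio.run(demo())
print(delivered)  # ['first']
```

If this is what happens, any channel that stays usable after the abort would no longer be in its inbox group, which would match the one-way behaviour seen by the client.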