As the title suggests, the GCP load balancer (or the HAProxy Ingress Controller Service exposed as type LoadBalancer) is distributing traffic unevenly across the HAProxy Ingress Controller Pods.
Setup:
I am running a GKE cluster in GCP and using HAProxy as the ingress controller.
The HAProxy Service is exposed as type LoadBalancer with a static IP.
YAML for HAProxy service:
apiVersion: v1
kind: Service
metadata:
  name: haproxy-ingress-static-ip
  namespace: haproxy-controller
  labels:
    run: haproxy-ingress-static-ip
  annotations:
    cloud.google.com/load-balancer-type: "Internal"
    networking.gke.io/internal-load-balancer-allow-global-access: "true"
    cloud.google.com/network-tier: "Premium"
    cloud.google.com/neg: '{"ingress": false}'
spec:
  selector:
    run: haproxy-ingress
  ports:
  - name: http
    port: 80
    protocol: TCP
    targetPort: 80
  - name: https
    port: 443
    protocol: TCP
    targetPort: 443
  - name: stat
    port: 1024
    protocol: TCP
    targetPort: 1024
  type: LoadBalancer
  loadBalancerIP: "10.0.0.76"
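One detail worth noting: the Service above doesn't set externalTrafficPolicy, so it defaults to Cluster, in which case kube-proxy may forward a connection arriving on one node to an HAProxy Pod on a different node. As a sketch of the alternative (this is an assumption about a possible factor, not part of my actual manifest):

```yaml
spec:
  # Only route to Pods on the node that received the connection;
  # this changes how connections are spread and preserves client IPs.
  externalTrafficPolicy: Local
```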
YAML for HAProxy Deployment:
---
apiVersion: apps/v1
kind: Deployment
metadata:
  labels:
    run: haproxy-ingress
  name: haproxy-ingress
  namespace: haproxy-controller
spec:
  replicas: 2
  selector:
    matchLabels:
      run: haproxy-ingress
  template:
    metadata:
      labels:
        run: haproxy-ingress
    spec:
      serviceAccountName: haproxy-ingress-service-account
      containers:
      - name: haproxy-ingress
        image: haproxytech/kubernetes-ingress
        args:
        - --configmap=haproxy-controller/haproxy
        - --default-backend-service=haproxy-controller/ingress-default-backend
        ports:
        - name: http
          containerPort: 80
        - name: https
          containerPort: 443
        - name: stat
          containerPort: 1024
        env:
        - name: POD_NAME
          valueFrom:
            fieldRef:
              fieldPath: metadata.name
        - name: POD_NAMESPACE
          valueFrom:
            fieldRef:
              fieldPath: metadata.namespace
      affinity:
        podAntiAffinity:
          preferredDuringSchedulingIgnoredDuringExecution:
          - weight: 100
            podAffinityTerm:
              labelSelector:
                matchExpressions:
                - key: run
                  operator: In
                  values:
                  - haproxy-ingress
              topologyKey: kubernetes.io/hostname
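For measuring the imbalance, I pull the stats CSV that HAProxy serves on the stat port defined above (port 1024, at the `/;csv` path) and compare the `stot` (total sessions) counter per backend server. A minimal parser sketch, using a trimmed sample of the CSV (the real output has many more columns, which the same code handles; the proxy and server names here are hypothetical):

```python
import csv
import io

# Trimmed sample of HAProxy's stats CSV; in practice this is fetched
# from http://<pod-ip>:1024/;csv. Names below are placeholders.
SAMPLE = """# pxname,svname,stot
http-backend,pod-a,540000
http-backend,pod-b,80000
http-backend,BACKEND,620000
"""

def per_server_sessions(stats_csv: str) -> dict:
    """Return total sessions (stot) per real server, skipping the
    aggregate FRONTEND/BACKEND rows."""
    # The header line is prefixed with "# ", which csv.DictReader
    # must not see as part of the first column name.
    reader = csv.DictReader(io.StringIO(stats_csv.lstrip("# ")))
    return {
        row["svname"]: int(row["stot"])
        for row in reader
        if row["svname"] not in ("FRONTEND", "BACKEND")
    }

print(per_server_sessions(SAMPLE))
```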
HAProxy ConfigMap:
---
apiVersion: v1
kind: ConfigMap
metadata:
  name: haproxy
  namespace: haproxy-controller
data:
Problem:
While debugging another issue, I found that traffic distribution across the HAProxy Pods is uneven. For example, one Pod was receiving 540k requests/sec while another was receiving only 80k requests/sec.
On further investigation, I also found that newly started Pods don't receive any traffic for the next 20-30 minutes, and even after that, only a small fraction of traffic is routed to them.
Another instance of uneven traffic distribution. This doesn't seem random at all; it looks like a weighted distribution:
Yet another instance of uneven traffic distribution. Traffic seems to be shifting from one Pod towards the other:
What could be causing this uneven traffic distribution, and why do new Pods receive no traffic for such a long time?
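One suspicion I have (which I'd like confirmed or ruled out): both the GCP pass-through load balancer and kube-proxy pick a backend per connection, not per request, so with HTTP keep-alive a few long-lived, chatty connections can pin most requests to one Pod even when connections themselves are balanced evenly. A toy simulation of that effect (the connection counts and request volumes are made up for illustration):

```python
import random

random.seed(0)

pods = {"pod-a": 0, "pod-b": 0}

# Hypothetical per-connection request volumes: one chatty, long-lived
# keep-alive connection plus several much quieter ones.
requests_per_conn = [100_000, 5_000, 2_000, 1_000] + [500] * 6

for reqs in requests_per_conn:
    # The LB / kube-proxy choose a backend once per connection;
    # every request on that connection then lands on the same Pod.
    chosen = random.choice(list(pods))
    pods[chosen] += reqs

# Per-request counts end up heavily skewed even though each
# *connection* had an equal chance of landing on either Pod.
print(pods)
```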