I have a Kubernetes cluster that runs some legacy Windows containers. To simplify, let's say each container can handle at most 5 requests at a time, something like:
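handleRequest() {
    requestLock(semaphore_Of_5)   // at most 5 requests run concurrently
    sleep(2s)                     // slow work that does not use the CPU
    releaseLock(semaphore_Of_5)   // free the slot for the next request
    return "result"
}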
So the CPU does not spike, and I need to scale based on the number of active connections instead.
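To make the behaviour concrete, here is a minimal runnable sketch of the handler above in Python. It is only an illustration, not the real legacy code: the names `handle_request`, `active_requests` and `MAX_CONCURRENT` are hypothetical, and the `sleep` stands in for whatever slow, non-CPU-bound work the container does. The in-flight counter is the kind of "active connections" number I would like to scale on.

```python
import threading
import time

MAX_CONCURRENT = 5  # the limit of 5 from the pseudocode (assumed fixed per container)

_slots = threading.BoundedSemaphore(MAX_CONCURRENT)
_active = 0                      # requests currently being processed
_active_lock = threading.Lock()  # protects _active


def handle_request(work_seconds: float = 2.0) -> str:
    """Process one request; blocks while MAX_CONCURRENT requests are in flight."""
    global _active
    with _slots:
        with _active_lock:
            _active += 1
        try:
            time.sleep(work_seconds)  # placeholder for the slow work; uses no CPU
            return "result"
        finally:
            with _active_lock:
                _active -= 1


def active_requests() -> int:
    """Number of requests in flight: the value a scaling metric would report."""
    with _active_lock:
        return _active
```

With 8 concurrent callers, `active_requests()` never exceeds 5 and the remaining 3 callers simply wait, while CPU usage stays close to idle.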
The documentation (https://kubernetes.io/docs/concepts/services-networking/service/#proxy-mode-iptables) says:
You can use Pod readiness probes to verify that backend Pods are working OK, so that kube-proxy in iptables mode only sees backends that test out as healthy. Doing this means you avoid having traffic sent via kube-proxy to a Pod that’s known to have failed.
So there is a mechanism (the readiness probe) that controls whether a pod is available for routing new requests, but it is the livenessProbe that actually marks the pod as unhealthy and subject to the restart policy. My pods are just busy, though; they don't need restarting.
How can I increase the number of pods in this case?