
I have a Kubernetes cluster that runs some legacy containers (Windows containers). To simplify, let's say a container can handle at most 5 requests at a time, something like:

```python
import threading
import time

sem = threading.Semaphore(5)  # at most 5 concurrent requests

def handle_request():
    with sem:            # block until one of the 5 slots is free
        time.sleep(2)    # simulate the slow legacy work
        return "result"
```

So CPU usage does not spike. I need to scale based on the number of active connections.

I can see from the documentation (https://kubernetes.io/docs/concepts/services-networking/service/#proxy-mode-iptables):

> You can use Pod readiness probes to verify that backend Pods are working OK, so that kube-proxy in iptables mode only sees backends that test out as healthy. Doing this means you avoid having traffic sent via kube-proxy to a Pod that's known to have failed.

So there is a mechanism to stop routing new requests to a pod, but it is the livenessProbe that actually marks the pod as unhealthy and subject to the restart policy. My pods are just busy; they don't need restarting.
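The readiness/liveness split described above can be sketched in code: a readiness endpoint that reports "not ready" (HTTP 503) while all slots are busy, so kube-proxy stops routing new traffic without the pod ever being restarted. This is a minimal sketch under my assumptions; `acquire_slot`, `release_slot`, and `readyz` are hypothetical helper names, and a real app would wire `readyz` to an HTTP route referenced by the readinessProbe.

```python
import threading

MAX_CONCURRENT = 5  # the container's hard concurrency limit
_active = 0
_lock = threading.Lock()

def acquire_slot() -> bool:
    """Called when a request starts; returns False if the pod is saturated."""
    global _active
    with _lock:
        if _active >= MAX_CONCURRENT:
            return False
        _active += 1
        return True

def release_slot() -> None:
    """Called when a request finishes."""
    global _active
    with _lock:
        _active -= 1

def readyz() -> int:
    """HTTP status for a readinessProbe endpoint: 503 while saturated.
    A failing readiness probe only removes the pod from Service endpoints;
    it never triggers a restart (that is the liveness probe's job)."""
    with _lock:
        return 200 if _active < MAX_CONCURRENT else 503
```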

How can I increase the number of pods in this case?

Ovidiu Buligan
  • Scaling pods by number of active connections? Do you want to increase the number of pods? – Ankit Deshpande May 29 '20 at 08:15
  • Yes, I edited the question. I know having this type of problem is not good, but let's suppose I need to scale this way. – Ovidiu Buligan May 29 '20 at 08:17
  • Number of connections? – Ankit Deshpande May 29 '20 at 08:32
  • Yes, scale by number of connections: a pod can handle at most 5 connections at a time, and each request takes 2 seconds. If a pod has 4 connections open, it can handle one more. As soon as a pod with 5 connections closes one, it can start receiving another. – Ovidiu Buligan May 29 '20 at 08:50
  • 1
    I think [this case](https://stackoverflow.com/questions/59532268/scaling-gke-pods-based-on-number-of-active-connections-per-pod) might be helpful for you. OP uses HPA with custom metric for the nginx ingress. – acid_fuji May 29 '20 at 11:12

1 Answer


You can enable HPA (Horizontal Pod Autoscaler) for the deployment and autoscale on a custom metric such as the number of active requests/connections.

https://kubernetes.io/docs/tasks/run-application/horizontal-pod-autoscale-walkthrough/#autoscaling-on-metrics-not-related-to-kubernetes-objects
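To make the suggestion concrete, here is a sketch of what such an HPA could look like. This assumes a custom per-pod metric named `active_connections` is already exposed through a metrics adapter (e.g. prometheus-adapter); the names `legacy-app-hpa` and `legacy-app` are hypothetical, and older clusters may need `autoscaling/v2beta2` instead of `autoscaling/v2`.

```yaml
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: legacy-app-hpa          # hypothetical name
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: legacy-app            # hypothetical Deployment name
  minReplicas: 1
  maxReplicas: 10
  metrics:
  - type: Pods
    pods:
      metric:
        name: active_connections    # custom metric from a metrics adapter
      target:
        type: AverageValue
        averageValue: "4"           # scale out before hitting the 5-connection cap
```

With an average target of 4, the HPA adds replicas while pods still have one free slot, rather than waiting until they are fully saturated.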

I would also recommend configuring the liveness probe's failureThreshold and timeoutSeconds, and checking whether that helps.

https://kubernetes.io/docs/tasks/configure-pod-container/configure-liveness-readiness-startup-probes/#configure-probes
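As a sketch of that probe tuning (the paths and port here are hypothetical): a lenient liveness probe so a busy pod is not restarted, alongside a fast readiness probe so a saturated pod is quickly removed from Service routing.

```yaml
livenessProbe:
  httpGet:
    path: /healthz        # hypothetical endpoint; cheap, always answers
    port: 8080
  timeoutSeconds: 5       # tolerate slow responses while the pod is busy
  failureThreshold: 3     # restart only after several consecutive failures
readinessProbe:
  httpGet:
    path: /readyz         # hypothetical endpoint; 503 when at the 5-connection cap
    port: 8080
  periodSeconds: 2
  failureThreshold: 1     # stop routing new traffic quickly when saturated
```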

Ankit Deshpande
  • https://kubernetes.io/docs/tasks/run-application/horizontal-pod-autoscale-walkthrough/#autoscaling-on-more-specific-metrics – Ankit Deshpande May 29 '20 at 08:44
  • The first link solves autoscaling by number of connections. But the second link is somewhat the opposite of what I want: I want to not route traffic to a pod that has 5 active connections, and also not kill it. – Ovidiu Buligan May 29 '20 at 09:44
  • You will have to configure the probes such that if a pod is serving, say, 5 requests/connections and cannot accept any new ones, the probes do not fail or time out. – Ankit Deshpande May 29 '20 at 10:54
  • Can you share some information around the use case that you are trying to solve ? – Ankit Deshpande May 29 '20 at 10:55
  • How are these configured metrics set up? Is that example supposed to work out of the box? – Vincent Gerris Nov 28 '20 at 22:35