Scale up condition keeps idle pods up

Question

I have an HPA configured with a target of 50% average CPU:

kubectl autoscale deployment php-apache --cpu-percent=50 --min=1 --max=10

The problem is that only one pod receives traffic, so its CPU usage is higher than 50% of the requested CPU.

The HPA then scales up new pods, but sometimes those are not receiving any traffic yet, so their CPU consumption is very low.

I expected the pods that use no CPU to be scaled down at some point (how long should that take?), but it is not happening. I believe the reason is that the first condition, one pod using more than 50% CPU, forces those pods to stay up.

What I need is for those pods to scale up and down until they can start receiving traffic, which depends on the node they are deployed on.

Any suggestions on how to accomplish this?

score 1 · Answer 1 · answered Feb 21 '23 at 15:28

HPA CPU Utilization:

A targetCPUUtilizationPercentage of 50 means that if the average CPU utilization across all Pods rises above 50%, the HPA scales the deployment up, and if the average falls below 50%, the HPA scales the deployment down, provided there are more replicas than the minimum (1 here). The target applies to the average across all Pods, not to each Pod individually, so one busy Pod can hold the average high enough to keep idle Pods running.

I checked the code and found that the utilization percentage is calculated against the resource request. See the line below:

currentUtilization = int32((metricsTotal * 100) / requestsTotal)

Here is the link https://github.com/kubernetes/kubernetes/blob/v1.9.0/pkg/controller/podautoscaler/metrics/utilization.go#L49
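To make the averaging concrete, here is a minimal sketch of that calculation combined with the replica formula from the HPA documentation, desiredReplicas = ceil(currentReplicas * currentUtilization / targetUtilization), and the default 10% tolerance. The function names, the 200m request and the usage figures are my own assumptions for illustration. The real controller also handles pods that are not ready or have no metrics, which this ignores.

```go
package main

import (
	"fmt"
	"math"
)

// utilizationPercent mirrors the quoted line: total usage divided by total
// requests, as an integer percentage. Every pod is assumed to request the
// same amount of CPU (hypothetical simplification).
func utilizationPercent(usageMilli []int64, requestMilli int64) int32 {
	if len(usageMilli) == 0 || requestMilli == 0 {
		return 0
	}
	var metricsTotal int64
	for _, u := range usageMilli {
		metricsTotal += u
	}
	requestsTotal := requestMilli * int64(len(usageMilli))
	return int32((metricsTotal * 100) / requestsTotal)
}

// desiredReplicas applies ceil(current * utilization / target) and leaves the
// replica count unchanged when the ratio is within the tolerance of 1.0.
func desiredReplicas(current int, utilization, target int32, tolerance float64) int {
	ratio := float64(utilization) / float64(target)
	if math.Abs(ratio-1.0) <= tolerance {
		return current
	}
	return int(math.Ceil(float64(current) * ratio))
}

func main() {
	const requestMilli = 200 // assumed CPU request per pod, in millicores
	const target = 50        // --cpu-percent=50

	// One busy pod (180m) plus a growing number of idle pods (5m each).
	scenarios := [][]int64{
		{180},
		{180, 5},
		{180, 5, 5, 5},
	}
	for _, usage := range scenarios {
		u := utilizationPercent(usage, requestMilli)
		d := desiredReplicas(len(usage), u, target, 0.1)
		fmt.Printf("pods=%d utilization=%d%% desired=%d\n", len(usage), u, d)
	}
}
```

With these assumed numbers, a single pod at 90% asks for 2 replicas, two pods average 46% (within tolerance, so nothing changes), and four pods average 24%, which asks for 2 again. In other words, the HPA does remove surplus idle pods, but it settles on whatever replica count brings the average close to the target and never looks at which pod is idle.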

There is an official walkthrough focusing on the HPA and its scaling:

Kubernetes.io: Docs: Tasks: Run application: Horizontal pod autoscale: Walkthrough

Support for configurable scaling behavior

Feature state: Kubernetes v1.23 [stable] (the autoscaling/v2beta2 API version previously provided this ability as a beta feature)

If you use the v2 HorizontalPodAutoscaler API, you can use the behavior field (see the API reference) to configure separate scale-up and scale-down behaviors. You specify these behaviours by setting scaleUp and / or scaleDown under the behavior field. You can specify a stabilization window that prevents flapping the replica count for a scaling target. Scaling policies also let you control the rate of change of replicas while scaling.

Kubernetes.io: Docs: Tasks: Run application: Horizontal pod autoscale: Support for configurable scaling behavior

You could use the newer fields such as behavior and stabilizationWindowSeconds to tune the scaling of your workload to your specific needs.
