Horizontal Pod Autoscaling and resource configuration calibration

Question

I am trying to understand how the HPA works, but I have some concerns.

Suppose my service is configured like this:

resources:
  limits:
    cpu: 500m
    memory: 1Gi
  requests:
    cpu: 250m
    memory: 512Mi

and I configure the HPA this way:

spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: test-service
  minReplicas: 3
  maxReplicas: 6
  metrics:
  - type: Resource
    resource:
      name: cpu
      target:
        type: Utilization
        averageUtilization: 50

Is this preventing my service from reaching the limits (500m)?
Would it be better to set a higher value, such as 80%?

I ask because with this configuration I see the pods scaled to the maximum number even though they are using less CPU than the limits:

NAME                                  CPU(cores)   MEMORY(bytes)   
test-service-76f8b8c894-2f944            189m         283Mi           
test-service-76f8b8c894-2ztt6            183m         278Mi           
test-service-76f8b8c894-4htzg            117m         233Mi           
test-service-76f8b8c894-5hxhv            142m         193Mi           
test-service-76f8b8c894-6bzbj            140m         200Mi           
test-service-76f8b8c894-6sj5m            149m         261Mi

The amount of CPU used is less than the request configured in the definition of the service.

This has also been discussed in "Using Horizontal Pod Autoscaling along with resource requests and limits", but I did not find my answer there.

score 3 · Accepted Answer · answered Nov 09 '22 at 20:51

Is this preventing my service from reaching the limits (500m)?

No, the HPA is not preventing it (although resources.limits is). What the HPA does is start new replicas when the average CPU utilization across all pods rises above 50% of the requested CPU, i.e. above 125m (50% of the 250m request).
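To make the arithmetic concrete, here is a minimal Python sketch of the replica calculation described in the Kubernetes documentation (desired = ceil(current replicas × current utilization / target utilization)), fed with the numbers from the question. The function names are hypothetical, the 0.1 tolerance is assumed to be the cluster default, and the real controller also applies stabilization windows and handles missing metrics, which this sketch ignores.

```python
import math


def average_utilization_pct(usage_millicores, request_millicores):
    """Average CPU usage across pods, as a percentage of the CPU request."""
    avg_usage = sum(usage_millicores) / len(usage_millicores)
    return 100.0 * avg_usage / request_millicores


def desired_replicas(current_replicas, usage_millicores, request_millicores,
                     target_utilization_pct, min_replicas, max_replicas,
                     tolerance=0.1):
    """Simplified HPA calculation (assumes every pod has the same request)."""
    current_pct = average_utilization_pct(usage_millicores, request_millicores)
    ratio = current_pct / target_utilization_pct
    if abs(ratio - 1.0) <= tolerance:
        # Close enough to the target: keep the current replica count.
        desired = current_replicas
    else:
        desired = math.ceil(current_replicas * ratio)
    # The result is always clamped to [minReplicas, maxReplicas].
    return max(min_replicas, min(max_replicas, desired))


# Per-pod CPU usage (millicores) from the `kubectl top` output in the question.
usage = [189, 183, 117, 142, 140, 149]

print(round(average_utilization_pct(usage, 250), 1))   # 61.3 (% of the 250m request)
print(desired_replicas(6, usage, 250, 50, 3, 6))       # 6
```

Average usage is about 153m, which is roughly 61% of the 250m request and therefore above the 50% target. The unclamped result would be 8 replicas, so the Deployment stays pinned at maxReplicas (6) even though every pod is well below its 500m limit.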

Would it be better to set a higher value, such as 80%?

Can't say; it is application-specific.
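As a rough aid for weighing that trade-off, the following Python sketch shows where each target puts the scale-out threshold relative to the request (250m) and limit (500m) from the question. The helper name is hypothetical, and the numbers only illustrate the arithmetic; they are not a recommendation.

```python
def scale_out_threshold_millicores(request_millicores, target_utilization_pct):
    """Average per-pod CPU usage above which the HPA starts adding replicas."""
    return request_millicores * target_utilization_pct / 100


REQUEST_M = 250  # resources.requests.cpu from the question
LIMIT_M = 500    # resources.limits.cpu from the question

for target_pct in (50, 80):
    threshold = scale_out_threshold_millicores(REQUEST_M, target_pct)
    headroom = LIMIT_M - threshold
    print(f"target {target_pct}%: scale out above {threshold:.0f}m, "
          f"{headroom:.0f}m below the limit")
```

A lower target scales out earlier and leaves more headroom for spikes while new pods start; a higher target packs more work into each pod before adding replicas. Which one is appropriate depends on how quickly load changes and how long pods take to become ready.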

Horizontal autoscaling is pretty well described in the documentation.

Mafor

By "preventing" I meant that a 50% target basically means that all pods belonging to my service could be using the resources set in the request without ever reaching the limits, so I wanted to configure these values properly. – user1971444 Nov 10 '22 at 10:55

I see. So yes, with your current settings, the autoscaler is practically preventing your pods from reaching the limits, but only until the maximum number of pods is reached. After that, utilization can keep growing, and if it reaches the limits, the pod will be either throttled (CPU limit) or killed (memory limit). – Mafor Nov 11 '22 at 07:04
