I am trying to understand how the HPA (Horizontal Pod Autoscaler) works, but I have some doubts.
Suppose my service is configured like this:
resources:
  limits:
    cpu: 500m
    memory: 1Gi
  requests:
    cpu: 250m
    memory: 512Mi
and I configure the HPA this way:
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: test-service
  minReplicas: 3
  maxReplicas: 6
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 50
Does this prevent my service from ever reaching its CPU limit (500m)?
Would it be better to set a higher target, such as 80%?
I ask because, with this configuration, the pods are scaled to the maximum replica count even though each one uses less CPU than its limit:
NAME                            CPU(cores)   MEMORY(bytes)
test-service-76f8b8c894-2f944   189m         283Mi
test-service-76f8b8c894-2ztt6   183m         278Mi
test-service-76f8b8c894-4htzg   117m         233Mi
test-service-76f8b8c894-5hxhv   142m         193Mi
test-service-76f8b8c894-6bzbj   140m         200Mi
test-service-76f8b8c894-6sj5m   149m         261Mi
The CPU used by each pod is even lower than the request (250m) configured in the service definition.
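To make the numbers concrete, here is a rough check in Python. It assumes, as far as I understand the Kubernetes documentation, that a `Utilization` target is measured as a percentage of the CPU request rather than the limit, and that the desired replica count is ceil(currentReplicas * currentUtilization / targetUtilization). The variable names are my own, and the sketch ignores the tolerance and stabilization window that the real controller applies.

```python
import math

# CPU usage per pod in millicores, copied from the output above.
usage_m = [189, 183, 117, 142, 140, 149]

request_m = 250     # resources.requests.cpu
limit_m = 500       # resources.limits.cpu
target_pct = 50     # averageUtilization in the HPA spec
max_replicas = 6    # maxReplicas in the HPA spec

avg_m = sum(usage_m) / len(usage_m)

# Assumption: utilization is relative to the request, not the limit.
util_vs_request = 100 * avg_m / request_m
util_vs_limit = 100 * avg_m / limit_m  # shown only for comparison

# Simplified scaling rule, then capped at maxReplicas.
desired = math.ceil(len(usage_m) * util_vs_request / target_pct)
desired_capped = min(desired, max_replicas)

print(f"average usage:          {avg_m:.1f}m")
print(f"utilization vs request: {util_vs_request:.1f}%")
print(f"utilization vs limit:   {util_vs_limit:.1f}%")
print(f"desired replicas:       {desired} (capped to {desired_capped})")
```

If that assumption holds, the average utilization is about 61% of the request, which is above the 50% target even though every pod is far below its 500m limit. That would explain why the deployment stays at 6 replicas.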
I have also seen this discussed in "Using Horizontal Pod Autoscaling along with resource requests and limits", but I could not find the answer there.