
I'm using a k8s HPA template for CPU and memory like below:

---
apiVersion: autoscaling/v1
kind: HorizontalPodAutoscaler
metadata:
  name: {{.Chart.Name}}-cpu
  labels:
    app: {{.Chart.Name}}
    chart: {{.Chart.Name}}
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: {{.Chart.Name}}
  minReplicas: {{.Values.hpa.min}}
  maxReplicas: {{.Values.hpa.max}}
  targetCPUUtilizationPercentage: {{.Values.hpa.cpu}}
---
apiVersion: autoscaling/v2beta2
kind: HorizontalPodAutoscaler
metadata:
  name: {{.Chart.Name}}-mem
  labels:
    app: {{.Chart.Name}}
    chart: {{.Chart.Name}}
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: {{.Chart.Name}}
  minReplicas: {{.Values.hpa.min}}
  maxReplicas: {{.Values.hpa.max}}
  metrics:
    - type: Resource
      resource:
        name: memory
        target:
          type: AverageValue
          averageValue: {{.Values.hpa.mem}}

Having two different HPAs causes any new pod spun up to satisfy the memory HPA to be immediately terminated by the CPU HPA, because the pods' CPU usage is below the CPU scale-down threshold. The CPU HPA always terminates the newest pod, which keeps the older pods around and triggers the memory HPA again, causing an infinite loop. Is there a way to instruct the CPU HPA to terminate pods with higher usage rather than the newest pods every time?

  • I see both your HPAs are targeting the same deployment. Why not just use a single HPA with metrics for both CPU & memory? – rock'n rolla Mar 26 '21 at 09:04
  • Memory and CPU autoscaling HPA have different apiVersions @rock'nrolla – Ankit Sethi Mar 26 '21 at 09:16
  • @rock'nrolla Thanks for the suggestion, it didn't occur to me before to just use autoscaling apiVersion v2beta2 for cpu as well. It solved my issue. – Ankit Sethi Mar 26 '21 at 09:38

2 Answers


Autoscaling based on multiple metrics or custom metrics:

apiVersion: autoscaling/v2beta2
kind: HorizontalPodAutoscaler
metadata:
  name: nginx
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: nginx
  minReplicas: 1
  maxReplicas: 10
  metrics:
  - type: Resource
    resource:
      name: cpu
      target:
        type: Utilization
        averageUtilization: 50
  - type: Resource
    resource:
      name: memory
      target:
        type: AverageValue
        averageValue: 100Mi

When created, the Horizontal Pod Autoscaler monitors the nginx Deployment for both average CPU utilization and average memory value. The Horizontal Pod Autoscaler scales the Deployment based on the metric whose value would produce the larger replica count, so a low reading on one metric never scales down pods that another metric still needs.

https://cloud.google.com/kubernetes-engine/docs/how-to/horizontal-pod-autoscaling#kubectl-apply
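The "larger autoscale event wins" behavior can be sketched with the HPA's scaling rule: for each metric, the controller computes `desiredReplicas = ceil(currentReplicas * currentValue / targetValue)` and then acts on the maximum across all metrics. A minimal illustration with hypothetical numbers (the metric readings below are made up for the example):

```python
import math

def desired_replicas(current_replicas, current, target):
    # HPA scaling formula: ceil(currentReplicas * currentMetric / targetMetric)
    return math.ceil(current_replicas * current / target)

current_replicas = 4
proposals = {
    # CPU is below target (30% vs 50%), so CPU alone would suggest shrinking.
    "cpu": desired_replicas(current_replicas, current=30, target=50),
    # Memory is above target (150Mi vs 100Mi), so memory suggests growing.
    "memory": desired_replicas(current_replicas, current=150, target=100),
}

# With both metrics in one HPA, the controller takes the max proposal,
# so the memory-driven scale-up wins and low CPU cannot kill new pods.
print(max(proposals.values()))  # 6
```

This is why merging both metrics into a single HPA resolves the terminate/respawn loop: there is one controller making one decision, instead of two controllers fighting over the same Deployment.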


As per the suggestion in the comments, using a single HPA solved my issue. I just had to move the CPU HPA to the same apiVersion (autoscaling/v2beta2) as the memory HPA.
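The merged Helm template might look like the following (a sketch combining the two HPAs from the question; it assumes `.Values.hpa.cpu` is a target utilization percentage and `.Values.hpa.mem` is an average quantity such as `500Mi`):

```yaml
apiVersion: autoscaling/v2beta2
kind: HorizontalPodAutoscaler
metadata:
  name: {{.Chart.Name}}
  labels:
    app: {{.Chart.Name}}
    chart: {{.Chart.Name}}
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: {{.Chart.Name}}
  minReplicas: {{.Values.hpa.min}}
  maxReplicas: {{.Values.hpa.max}}
  metrics:
  - type: Resource
    resource:
      name: cpu
      target:
        type: Utilization
        averageUtilization: {{.Values.hpa.cpu}}
  - type: Resource
    resource:
      name: memory
      target:
        type: AverageValue
        averageValue: {{.Values.hpa.mem}}
```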

  • Could you please paste your yaml for the HPA? Also, do the metric triggers work as an "OR" condition or an "AND" condition, i.e. CPU x% and mem y Mi, or CPU x% or mem y Mi? – Arun Prakash Nagendran Aug 17 '22 at 14:39