Question 1.)
Given the scenario of a multi-container pod, where all containers have a defined CPU request:
How does the Kubernetes Horizontal Pod Autoscaler calculate CPU utilization for multi-container pods?
Does it average them? ((500m CPU req + 50m CPU req) / 2) * X% HPA target CPU utilization
Does it add them? (500m CPU req + 50m CPU req) * X% HPA target CPU utilization
Does it track them individually? (500m CPU req * X% HPA target CPU utilization = target #1, 50m CPU req * X% HPA target CPU utilization = target #2)
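To make those 3 hypotheses concrete, here is the arithmetic each one would imply with my actual requests (500m and 50m) and a 150% target (just illustrating the options, not claiming which one Kubernetes uses):
- Average: ((500m + 50m) / 2) * 150% = 275m * 1.5 = 412.5m scale-up threshold for the whole pod
- Add: (500m + 50m) * 150% = 550m * 1.5 = 825m scale-up threshold for the whole pod
- Individual: 500m * 150% = 750m for container #1, and 50m * 150% = 75m for container #2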
Question 2.)
Given the scenario of a multi-container pod, where one container has a defined CPU request and the other containers have no CPU request:
How does the Kubernetes Horizontal Pod Autoscaler calculate CPU utilization in that case?
Does it behave as if I had a single-container pod?
Question 3.)
Do the answers to questions 1 and 2 change based on the HPA API version?
I noticed the stable/nginx-ingress Helm chart (chart version 1.10.2) deploys an HPA for me with these specs:
apiVersion: autoscaling/v1
kind: HorizontalPodAutoscaler
(I noticed apiVersion: autoscaling/v2beta2 now exists)
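For reference, here is a minimal sketch of what the same kind of CPU-utilization target looks like under autoscaling/v2beta2 (the names and the maxReplicas value are illustrative, not taken from the chart):
apiVersion: autoscaling/v2beta2
kind: HorizontalPodAutoscaler
metadata:
  name: nginx-ingress-controller   # illustrative name
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: nginx-ingress-controller
  minReplicas: 3
  maxReplicas: 11                  # illustrative
  metrics:
  - type: Resource
    resource:
      name: cpu
      target:
        type: Utilization
        averageUtilization: 150    # percentage of the pod's CPU request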
Background Info:
I recently had an issue with unexpected wild scaling / constantly going back and forth between min and max pods after adding a sidecar (2nd container) to an nginx ingress controller deployment (which is usually a pod with a single container). In my case it was an oauth2 proxy, although I imagine Istio sidecar container folks might run into this sort of problem all the time as well.
apiVersion: apps/v1
kind: Deployment
spec:
  replicas: 3
  template:
    spec:
      containers:
      - name: nginx-ingress-controller # (primary container)
        resources:
          requests:
            cpu: 500m    # baseline light-load usage in my env
            memory: 2Gi  # according to kubectl top pods
          limits:
            memory: 8Gi  # (OOM-kill the pod if it gets this high, because something's wrong)
      - name: oauth2-proxy # (newly added 2nd sidecar container)
        resources:
          requests:
            cpu: 50m
            memory: 50Mi
          limits:
            memory: 4Gi
I have an HPA (apiVersion: autoscaling/v1) with:
- min 3 replicas (to preserve HA during rolling updates)
- targetCPUUtilizationPercentage = 150%
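Spelled out as YAML, that HPA looks roughly like this (maxReplicas and the names are filled in for illustration):
apiVersion: autoscaling/v1
kind: HorizontalPodAutoscaler
metadata:
  name: nginx-ingress-controller      # assumed name
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: nginx-ingress-controller
  minReplicas: 3                       # preserve HA during rolling updates
  maxReplicas: 11                      # assumed for illustration
  targetCPUUtilizationPercentage: 150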
It occurred to me that the misconfiguration leading to my unexpected wild scaling was caused by 2 issues:
- I don't actually understand how HPAs work when the pod has multiple containers
- I don't know how to dig into the metrics to see what's actually going on.
To address the first issue: I brainstormed my understanding of how it works in the single-container scenario (and then realized I don't know the multi-container scenario, so I decided to ask this question).
This is my understanding of how HPA (autoscaling/v1) works when I have 1 container (temporarily ignore the 2nd container in the above deployment spec):
The HPA would spawn more replicas when the average CPU utilization of all pods shifted from my normal expected load of 500m or less to above 750m (150% of the 500m request).
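If I'm reading the docs right, the underlying replica math is the documented proportion (the numbers below are made up just to illustrate it):
desiredReplicas = ceil( currentReplicas * currentUtilization / targetUtilization )
e.g. 3 replicas averaging 900m against a 500m request = 180% utilization:
     ceil( 3 * 180 / 150 ) = ceil( 3.6 ) = 4 replicas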
To address the 2nd issue: I found out how to dig up concrete numeric metrics (vs. the relative percentage-based metrics) to help figure out what's happening behind the scenes:
kubectl describe horizontalpodautoscaler nginx-ingress-controller -n=ingress | grep Metrics: -A 1
Metrics: ( current / target )
resource cpu on pods (as a percentage of request): 5% (56m) / 100%
(Note: kubectl top pods -n=ingress showed the CPU usage of the 5 replicas as 36m, 34m, 88m, 36m, 91m, which averages to 57m and ~matches the 56m current value.)
Also, it's now a basic proportions math problem that lets me solve for the static target value:
(5% / 56m) = (100% / x m) --> x = 56 * 100 / 5 = 1120m target cpu
(Note: this HPA isn't associated with the deployment mentioned above; that's why the numbers are off.)
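One more way to dig for raw, per-container numbers is to hit the metrics API directly (a sketch; it assumes metrics-server is installed and jq is available, and the ingress namespace is from my environment):
kubectl get --raw "/apis/metrics.k8s.io/v1beta1/namespaces/ingress/pods" \
  | jq '.items[] | {pod: .metadata.name, containers: [.containers[] | {name, cpu: .usage.cpu}]}'
This prints each pod's name plus the CPU usage reported for every container inside it, which is exactly the per-container breakdown that gets lost in the single "5% (56m)" figure from kubectl describe.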