Question 1.)
Given the scenario of a multi-container pod, where all containers have a defined CPU request:
How does the Kubernetes Horizontal Pod Autoscaler calculate CPU utilization for multi-container pods?
Does it average them? ((500m CPU req + 50m CPU req) / 2) * X% HPA target CPU utilization
Does it add them? (500m CPU req + 50m CPU req) * X% HPA target CPU utilization
Does it track them individually? (500m CPU req * X% HPA target CPU utilization = target #1, 50m CPU req * X% HPA target CPU utilization = target #2)
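To make those 3 hypotheses concrete, here is the arithmetic each one would imply with my actual requests (500m and 50m) and a 150% target (just illustrating the options, not claiming which one Kubernetes uses):
- Average: ((500m + 50m) / 2) * 150% = 275m * 1.5 = 412.5m scale-up threshold for the whole pod
- Add: (500m + 50m) * 150% = 550m * 1.5 = 825m scale-up threshold for the whole pod
- Individual: 500m * 150% = 750m for container #1, and 50m * 150% = 75m for container #2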
Question 2.)
Given the scenario of a multi-container pod, where one container has a defined CPU request and the other containers have no CPU request:
How does the Kubernetes Horizontal Pod Autoscaler calculate CPU utilization in that case?
Does it behave as if I had a single-container pod?
Question 3.)
Do the answers to questions 1 and 2 change based on the HPA API version?
I noticed the stable/nginx-ingress Helm chart (chart version 1.10.2) deploys an HPA for me with these specs:
apiVersion: autoscaling/v1
kind: HorizontalPodAutoscaler
(I noticed apiVersion: autoscaling/v2beta2 now exists)
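For reference, here is a minimal sketch of what the same kind of CPU-utilization target looks like under autoscaling/v2beta2 (the names and the maxReplicas value are illustrative, not taken from the chart):
apiVersion: autoscaling/v2beta2
kind: HorizontalPodAutoscaler
metadata:
  name: nginx-ingress-controller   # illustrative name
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: nginx-ingress-controller
  minReplicas: 3
  maxReplicas: 11                  # illustrative
  metrics:
  - type: Resource
    resource:
      name: cpu
      target:
        type: Utilization
        averageUtilization: 150    # percentage of the pod's CPU request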
Background Info:
I recently had an issue with unexpected wild scaling / constantly going back and forth between min and max pods after adding a sidecar (2nd container) to an nginx ingress controller deployment (which is usually a pod with a single container). In my case it was an oauth2 proxy, although I imagine Istio sidecar container folks might run into this sort of problem all the time as well.
apiVersion: apps/v1
kind: Deployment
spec:
  replicas: 3
  template:
    spec:
      containers:
      - name: nginx-ingress-controller # (primary container)
        resources:
          requests:
            cpu: 500m    # baseline light-load usage in my env
            memory: 2Gi  # according to kubectl top pods
          limits:
            memory: 8Gi  # (OOM-kill the pod if it gets this high, because something's wrong)
      - name: oauth2-proxy # (newly added 2nd sidecar container)
        resources:
          requests:
            cpu: 50m
            memory: 50Mi
          limits:
            memory: 4Gi
I have an HPA (apiVersion: autoscaling/v1) with:
- min 3 replicas (to preserve HA during rolling updates)
- targetCPUUtilizationPercentage = 150%
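Spelled out as YAML, that HPA looks roughly like this (maxReplicas and the names are filled in for illustration):
apiVersion: autoscaling/v1
kind: HorizontalPodAutoscaler
metadata:
  name: nginx-ingress-controller      # assumed name
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: nginx-ingress-controller
  minReplicas: 3                       # preserve HA during rolling updates
  maxReplicas: 11                      # assumed for illustration
  targetCPUUtilizationPercentage: 150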
It occurred to me that the misconfiguration leading to my unexpected wild scaling was caused by 2 issues:
- I don't actually understand how HPAs work when the pod has multiple containers
- I don't know how to dig into the metrics to see what's actually going on.
To address the first issue: I brainstormed my understanding of how it works in the single-container scenario (and then realized I don't know the multi-container scenario, so I decided to ask this question).
This is my understanding of how HPA (autoscaling/v1) works when I have 1 container (temporarily ignore the 2nd container in the above deployment spec):
The HPA would spawn more replicas when the average CPU utilization of all pods shifted from my normal expected load of 500m or less to above 750m (150% of the 500m request).
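If I'm reading the docs right, the underlying replica math is the documented proportion (the numbers below are made up just to illustrate it):
desiredReplicas = ceil( currentReplicas * currentUtilization / targetUtilization )
e.g. 3 replicas averaging 900m against a 500m request = 180% utilization:
     ceil( 3 * 180 / 150 ) = ceil( 3.6 ) = 4 replicas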
To address the 2nd issue: I found out how to dig up concrete numeric metrics (vs. the relative percentage-based metrics) to help figure out what's happening behind the scenes:
kubectl describe horizontalpodautoscaler nginx-ingress-controller -n=ingress | grep Metrics: -A 1
Metrics: ( current / target )
resource cpu on pods (as a percentage of request): 5% (56m) / 100%
(Note: kubectl top pods -n=ingress showed the CPU usage of the 5 replicas as 36m, 34m, 88m, 36m, 91m, which averages to 57m and ~matches the 56m current value.)
Also, it's now a basic proportions math problem that lets me solve for the static target value:
(5% / 56m) = (100% / x m) --> x = 56 * 100 / 5 = 1120m target cpu
(Note: this HPA isn't associated with the deployment mentioned above; that's why the numbers are off.)
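One more way to dig for raw, per-container numbers is to hit the metrics API directly (a sketch; it assumes metrics-server is installed and jq is available, and the ingress namespace is from my environment):
kubectl get --raw "/apis/metrics.k8s.io/v1beta1/namespaces/ingress/pods" \
  | jq '.items[] | {pod: .metadata.name, containers: [.containers[] | {name, cpu: .usage.cpu}]}'
This prints each pod's name plus the CPU usage reported for every container inside it, which is exactly the per-container breakdown that gets lost in the single "5% (56m)" figure from kubectl describe.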