
By our product's design, we would like to disable scale down in the HPA. Can it be disabled?

eazary
3 Answers


I randomly stumbled upon this post, and it looks like you can disable scale down. The HPA documentation includes this example at the bottom. This feature may not have been available when the question was initially asked.

The selectPolicy value of Disabled turns off scaling in the given direction. So to prevent downscaling, the following policy would be used:

behavior:
  scaleDown:
    selectPolicy: Disabled
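
For reference, here is a minimal sketch of a complete autoscaling/v2 HorizontalPodAutoscaler with scale down disabled. The Deployment name, replica bounds and CPU target are placeholders, not values from the question:

apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: web-hpa                  # hypothetical name
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: web                    # hypothetical Deployment
  minReplicas: 2
  maxReplicas: 10
  metrics:
  - type: Resource
    resource:
      name: cpu
      target:
        type: Utilization
        averageUtilization: 70
  behavior:
    scaleDown:
      selectPolicy: Disabled     # the controller will only ever add replicas

With this in place the controller still computes lower recommendations, but it never applies them.
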
Howard_Roark

No, this is not possible.

1) You can delete the HPA and create a plain Deployment with the desired number of replicas (see the command sketch after the manifests below).

2) You can use the workaround provided in the HorizontalPodAutoscaler: Possible to limit scale down? #65097 issue by user 'frankh':

I've made a very hacky workaround: I have a cronjob that runs every 3 minutes and sets the minimum replicas on an HPA to $currentReplicas - $downscaleLimit. If anyone feels like using it, it's here: https://gist.github.com/frankh/050943c72273cf639886b43e98bc3caa

apiVersion: batch/v1beta1
kind: CronJob
metadata:
  name: hpa-downscale-limiter
  namespace: kube-system
spec:
  schedule: "*/3 * * * *"
  concurrencyPolicy: Forbid
  jobTemplate:
    spec:
      template:
        spec:
          serviceAccountName: hpa-downscale-limiter
          containers:
          - name: kubectl
            image: frankh/k8s-kubectl:1.10.3
            command: ["/bin/bash", "-c"]
            args:
            - |
              set -xeuo pipefail
              namespaces=$(kubectl get hpa --no-headers --all-namespaces | cut -d' ' -f1 | uniq)
              for namespace in $namespaces; do
                hpas=$(kubectl get hpa --namespace=$namespace --no-headers | cut -d' ' -f1)
                for hpa in $hpas; do
                  echo "$(kubectl get hpa --namespace=$namespace $hpa -o jsonpath="{.spec.minReplicas} {.status.desiredReplicas} {.metadata.annotations.originalMinimum} {.metadata.annotations.downscaleLimit}")" > tmpfile
                  read -r minReplicas desiredReplicas originalMinimum downscaleLimit < tmpfile

                  if [ -z "$originalMinimum" ]; then
                    kubectl annotate hpa --namespace=$namespace $hpa originalMinimum="$minReplicas"
                    originalMinimum=$minReplicas
                  fi

                  if [ -z "$downscaleLimit" ]; then
                    downscaleLimit=1
                  fi
                  target=$(( $desiredReplicas - $downscaleLimit ))
                  target=$(( $target > $originalMinimum ? $target : $originalMinimum ))

                  if [ "$minReplicas" -ne "$target" ]; then
                    kubectl patch hpa --namespace=$namespace $hpa --patch="{\"spec\": {\"minReplicas\": "$target"}}"
                  fi
                done
              done
          restartPolicy: OnFailure
---
apiVersion: v1
kind: ServiceAccount
metadata:
  name: hpa-downscale-limiter
  namespace: kube-system
---
apiVersion: rbac.authorization.k8s.io/v1beta1
kind: ClusterRoleBinding
metadata:
  name: hpa-downscale-limiter-admin
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: cluster-admin
subjects:
  - name: hpa-downscale-limiter
    kind: ServiceAccount
    namespace: kube-system
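
For option 1, the commands would look roughly like this (my-hpa and my-deployment are hypothetical names; pick the replica count you want to pin):

# Remove the autoscaler so nothing lowers the replica count anymore
kubectl delete hpa my-hpa
# Fix the Deployment at the desired number of pods
kubectl scale deployment my-deployment --replicas=5

The CronJob workaround above is applied like any other manifest, e.g. kubectl apply -f hpa-downscale-limiter.yaml.
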
Vit

The Horizontal Pod Autoscaler automatically scales the number of pods in a replication controller, deployment, replica set or stateful set based on observed CPU utilization (or, with beta support, on application-provided metrics). From the most basic perspective, the Horizontal Pod Autoscaler controller operates on the ratio between the desired metric value and the current metric value:

desiredReplicas = ceil[currentReplicas * ( currentMetricValue / desiredMetricValue )]
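
As a quick worked example with illustrative numbers: with 4 current replicas, a current CPU utilization of 200m and a target of 100m, the controller asks for

desiredReplicas = ceil[4 * (200m / 100m)] = ceil[4 * 2.0] = 8

and if utilization later drops to 50m the recommendation becomes ceil[4 * 0.5] = 2, which is exactly the kind of scale down the question wants to prevent.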

If multiple metrics are specified in a HorizontalPodAutoscaler, this calculation is done for each metric, and then the largest of the desired replica counts is chosen. However, before scaling down, the controller considers all recommendations within a configurable window and chooses the highest recommendation from within that window. This window can be configured using the --horizontal-pod-autoscaler-downscale-stabilization flag, which defaults to 5 minutes. This means that scale downs occur gradually, smoothing out the impact of rapidly fluctuating metric values.

Based on what I've explained above, no, you can't disable scale down entirely. However, for spiky traffic you can still tune the --horizontal-pod-autoscaler-downscale-stabilization flag.
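
If your cluster already supports the behavior field (autoscaling/v2, as in the first answer), the same stabilization can also be configured per HPA rather than cluster-wide; a small sketch, with the 600-second window chosen purely as an example:

behavior:
  scaleDown:
    stabilizationWindowSeconds: 600   # wait 10 minutes before acting on a lower recommendation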

  1. This also may be related to HorizontalPodAutoscaler: Possible to limit scale down?
  2. HPA should have scale down/up limits
  3. RFC: Configurable scale up/down speed for HPA
irvifa