
We have a Spring Boot application on GKE with auto-scaling (HPA) enabled. During startup, the HPA kicks in and starts scaling the pods even though there is no traffic. The result of 'kubectl get hpa' shows a high current average CPU utilization, while the CPU utilization of the nodes and Pods is quite low. The behavior is the same during scale-up: multiple Pods are created, eventually leading to node scaling.

Application deployment YAML:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: myapp-api
  labels:
    app: myapp
spec:
  replicas: 3
  selector:
    matchLabels:
      app: myapp
  template:
    metadata:
      labels:
        app: myapp
    spec:
      serviceAccount: myapp-ksa
      containers:
      - name: myapp
        image: gcr.io/project/myapp:126
        env:
        - name: DB_USER
          valueFrom:
            secretKeyRef:
              name: my-db-credentials
              key: username
        - name: DB_PASS
          valueFrom:
            secretKeyRef:
              name: my-db-credentials
              key: password
        - name: DB_NAME
          valueFrom:
            secretKeyRef:
              name: my-db-credentials
              key: database
        - name: INSTANCE_CONNECTION
          valueFrom:
            configMapKeyRef:
              name: connectionname
              key: connectionname
        resources:
          requests:
            cpu: "200m"
            memory: "256Mi"
        ports:
        - containerPort: 8080
        livenessProbe:
          httpGet:
            path: /actuator/health
            port: 8080
          initialDelaySeconds: 90
          periodSeconds: 5
        readinessProbe:
          httpGet:
            path: /actuator/health
            port: 8080
          initialDelaySeconds: 60
          periodSeconds: 10
      - name: cloudsql-proxy
        image: gcr.io/cloudsql-docker/gce-proxy:1.17
        env:
          - name: INSTANCE_CONNECTION
            valueFrom:
              configMapKeyRef:
                name: connectionname
                key: connectionname
        command: ["/cloud_sql_proxy",
                  "-ip_address_types=PRIVATE",
                  "-instances=$(INSTANCE_CONNECTION)=tcp:5432"]
        securityContext:
          runAsNonRoot: true
          runAsUser: 2
          allowPrivilegeEscalation: false
        resources:
          requests:
            memory: "128Mi"
            cpu:    "100m"

YAML for HPA:

apiVersion: autoscaling/v2beta2
kind: HorizontalPodAutoscaler
metadata:
  name: myapp-api
  labels:
    app: myapp
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: myapp-api
  minReplicas: 1
  maxReplicas: 10
  metrics:
  - type: Resource
    resource:
      name: cpu
      target:
        type: Utilization
        averageUtilization: 80
  - type: Resource
    resource:
      name: memory
      target: 
        type: Utilization
        averageUtilization: 80

Results of various commands:

$ kubectl get pods
[screenshot of output]

$ kubectl get hpa
[screenshot of output]

$ kubectl top nodes
[screenshot of output]

$ kubectl top pods --all-namespaces
[screenshot of output]

  • You might need to take a look at these two HPA flags to delay scaling around readiness. Spring Boot startup will take some time, and that is probably causing the CPU spike and triggering the HPA. I am not sure whether GKE supports these two flags, but you might as well give it a try: https://kubernetes.io/docs/tasks/run-application/horizontal-pod-autoscale/ – CaioT Aug 18 '21 at 12:23
  • Oh, the flags are: --horizontal-pod-autoscaler-initial-readiness-delay and --horizontal-pod-autoscaler-cpu-initialization-period – CaioT Aug 18 '21 at 12:23
  • Checked, but these flags are not available in GKE, I believe. Having said that, why does "kubectl get hpa" still show a high current value even after an hour, with the application settled and no traffic? – Muktesh Arya Aug 18 '21 at 15:55
  • Got the solution. Sounds silly, but I was reading the metrics wrong the whole time. It is memory that was being triggered, not CPU. My bad. Thanks for chipping in. – Muktesh Arya Aug 19 '21 at 04:26

1 Answer


As the problem has already been resolved in the comments section, I decided to provide a Community Wiki answer just for better visibility to other community members.

The kubectl get hpa output from the question shows high current memory utilization, which is what causes the Pods to scale.

Reading the TARGETS column from the kubectl get hpa output can be confusing:
NOTE: Does the value of 33% apply to memory or to cpu?

$ kubectl get hpa app-1
NAME    REFERENCE          TARGETS           MINPODS   MAXPODS   REPLICAS   AGE
app-1   Deployment/app-1   33%/80%, 4%/80%   2         5         5          5m37s

I recommend using the kubectl describe hpa <HPA_NAME> command with grep to determine the current metrics without any ambiguity:

$ kubectl describe hpa app-1 | grep -A 2 "Metrics:"
Metrics:                                                  ( current / target )
  resource memory on pods  (as a percentage of request):  33% (3506176) / 80%
  resource cpu on pods  (as a percentage of request):     4% (0) / 80%
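It can also help to look at per-container usage, because the HPA's Utilization target is calculated against the sum of the requests of all containers in the Pod (here: the myapp container plus the cloudsql-proxy sidecar). A minimal sketch, assuming metrics-server data is available; the Pod name and the numbers below are illustrative only:

$ kubectl top pods --containers
POD                          NAME             CPU(cores)   MEMORY(bytes)
myapp-api-6d9f7c9b7d-abcde   myapp            5m           230Mi
myapp-api-6d9f7c9b7d-abcde   cloudsql-proxy   1m           10Mi

This makes it easy to see which container is driving the averaged memory utilization reported by the HPA.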
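As a side note on the flags mentioned in the comments (--horizontal-pod-autoscaler-initial-readiness-delay and --horizontal-pod-autoscaler-cpu-initialization-period): these are kube-controller-manager flags and, since the GKE control plane is managed, they do not appear to be configurable there. If the goal is to make scale-up less sensitive to startup spikes, a per-HPA alternative is the behavior field of the autoscaling/v2beta2 API (available from Kubernetes 1.18). A minimal sketch; the window and policy values are examples, not recommendations:

spec:
  behavior:
    scaleUp:
      stabilizationWindowSeconds: 120   # smooth out short metric spikes before scaling up
      policies:
      - type: Pods
        value: 1          # add at most 1 Pod ...
        periodSeconds: 60 # ... per 60-second window

This only shapes how quickly the HPA scales up; the metric itself (memory at startup for a JVM-based app) still needs to be read correctly, as described above.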