
I am using HPA (Horizontal Pod Autoscaler) and Karpenter in my AWS EKS cluster to, respectively, increase the number of pods and provision new nodes when my application encounters high traffic.

My application is a simple API-serving pod that receives various requests from the outside world and handles that traffic appropriately.

I've done some load testing on my application and encountered the following issue. I will be as descriptive as possible.

Currently, there are 5 pods serving my API, because the number of replicas in the Deployment is set to 5. I've also set up an HPA to scale it from a minimum of 5 pods to a maximum of 20 pods, targeting a CPU utilization of 50%.
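
For reference, a minimal sketch of such an HPA might look like this (the Deployment name my-api is assumed here, since I haven't included my real manifest):

apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: my-api-hpa
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: my-api        # assumed Deployment name
  minReplicas: 5
  maxReplicas: 20
  metrics:
  - type: Resource
    resource:
      name: cpu
      target:
        type: Utilization
        averageUtilization: 50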

Also, I've deployed Karpenter and created a Provisioner that provisions and deprovisions a specific type of node in my cluster.

When I intentionally sent a lot of traffic to my application to trigger the HPA, it was successfully triggered and scaled up to the maximum of 20 pods to handle the load.

Since the existing nodes did not have enough resources to hold all 20 pods, the Karpenter Provisioner was then triggered and began to provision additional nodes. As a result, it provisioned 5 more nodes, and the existing and new nodes together were able to serve all 20 pods.

Now, when I removed all the traffic, the HPA decreased the number of pods from 20 back to 5, since CPU utilization was almost 0. Then, since the extra nodes were no longer needed, Karpenter started to deprovision the nodes it had created. Deprovisioning kicked in because I have consolidation enabled in the Provisioner's manifest:

apiVersion: karpenter.sh/v1alpha5
kind: Provisioner
metadata:
  name: my-provisioner
spec:
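  # Consolidation lets Karpenter remove or replace underutilized nodes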
  consolidation:
    enabled: true
  ...

However, I found that not all of the newly provisioned nodes were deprovisioned. One of them remained in the cluster, because one of the 5 remaining pods (there are currently 5 pods since the Deployment asks for 5 replicas) was scheduled on that Karpenter-provisioned node rather than on one of the original nodes.

This seems like a waste of money, since I am now running one extra Karpenter-provisioned node that is not actually needed; the original nodes have enough capacity to run all 5 pods.

Is there a way to tell the HPA and/or the Karpenter Provisioner which pods to evict first once the extra pods are no longer needed? I think the situation above happened because the HPA removed some of the pods that existed originally, rather than the ones it had newly created.

Is there a way to tell the HPA to delete the newly created pods first when scaling back down?


1 Answer


Maybe switch to Fargate, or add a taint (or use any other mechanism you prefer) so that only the pods you want are scheduled onto the Karpenter-provisioned nodes.
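
As a rough sketch of the taint approach (the taint key dedicated and value burst are made-up example values; the other Provisioner fields mirror the one in the question):

apiVersion: karpenter.sh/v1alpha5
kind: Provisioner
metadata:
  name: my-provisioner
spec:
  consolidation:
    enabled: true
  # Pods without a matching toleration cannot be scheduled onto
  # nodes created by this Provisioner.
  taints:
  - key: dedicated
    value: burst
    effect: NoSchedule

# In the pod template of whatever workload is allowed on those nodes:
#   tolerations:
#   - key: dedicated
#     operator: Equal
#     value: burst
#     effect: NoSchedule

Note that this only helps if the pods meant for the Karpenter nodes can be distinguished from the baseline pods (for example, a separate Deployment), since all replicas of a single Deployment share one pod template and would therefore all carry the same toleration.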