
I have a managed AWS EKS cluster whose EC2 instances (the Kubernetes nodes) can run in any availability zone (ap-northeast-1a, 1b, 1c and 1d in my case). I have a MinIO pod running on the cluster that uses an EBS volume (created automatically) in the ap-northeast-1d AZ. The EC2 instance running in that zone died two days ago and a new EC2 instance was created automatically, but in a different zone, ap-northeast-1a. Now my pod cannot be scheduled on the node because it cannot mount the volume (AZ mismatch).
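
For context, a dynamically provisioned EBS volume is pinned to its zone through node affinity on the PV, which is why the pod can no longer land on the node in ap-northeast-1a. The manifest below is only an illustrative sketch of that shape (names, size and volume ID are placeholders, and it assumes the in-tree aws-ebs provisioner; the EBS CSI driver uses a csi source and its own zone topology key instead):

    apiVersion: v1
    kind: PersistentVolume
    metadata:
      name: pvc-0123abcd                    # placeholder PV name
    spec:
      capacity:
        storage: 100Gi                      # placeholder size
      accessModes:
        - ReadWriteOnce
      persistentVolumeReclaimPolicy: Delete
      awsElasticBlockStore:
        volumeID: aws://ap-northeast-1d/vol-0123456789abcdef0   # placeholder volume ID
        fsType: ext4
      nodeAffinity:
        required:
          nodeSelectorTerms:
            - matchExpressions:
                - key: topology.kubernetes.io/zone   # failure-domain.beta.kubernetes.io/zone on older clusters
                  operator: In
                  values:
                    - ap-northeast-1d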

I am looking for a better solution to tackle this issue.

Here is what I did: I created a snapshot of the existing volume and created a new volume from that snapshot in the zone where the node now runs. I then modified the PV and PVC resources, changing the AZ and the volume ID, and the pod was able to schedule again and is running fine.
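
Concretely, these are roughly the PV fields I changed (the IDs here are placeholders, not my real ones; if the PV also carries a zone label in its metadata, that should be kept consistent too):

    spec:
      awsElasticBlockStore:
        volumeID: aws://ap-northeast-1a/vol-0fedcba9876543210   # new volume restored from the snapshot (placeholder ID)
      nodeAffinity:
        required:
          nodeSelectorTerms:
            - matchExpressions:
                - key: topology.kubernetes.io/zone
                  operator: In
                  values:
                    - ap-northeast-1a                            # was ap-northeast-1d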

I believe this is a dirty workaround, so I am looking for a more appropriate solution.

Any suggestions would be appreciated.

Akshay Sood
  • I have always found this interaction between the vanilla cluster-autoscaler and StatefulSets problematic. I used a third-party autoscaler that solved the issue much more easily in the past, and I hope that AWS's new autoscaler will also handle this situation better. https://karpenter.sh/ – jordanm Dec 13 '21 at 05:01
  • If you're not already, you need to run one ASG per subnet with the regular cluster-autoscaler. – jordanm Dec 13 '21 at 05:03 (illustrative sketch below)
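
As an illustration of the one-ASG-per-AZ suggestion above, an eksctl layout could look roughly like this (a sketch only: the cluster name and sizes are placeholders, and it assumes eksctl-managed node groups):

    apiVersion: eksctl.io/v1alpha5
    kind: ClusterConfig
    metadata:
      name: my-cluster                          # placeholder cluster name
      region: ap-northeast-1
    nodeGroups:
      - name: ng-ap-northeast-1a
        availabilityZones: ["ap-northeast-1a"]  # one ASG pinned to a single zone
        minSize: 0
        maxSize: 3
      - name: ng-ap-northeast-1d
        availabilityZones: ["ap-northeast-1d"]  # the zone where the EBS volume lives
        minSize: 0
        maxSize: 3

With one zone per group, the cluster-autoscaler can bring up a replacement node in the volume's zone instead of an arbitrary one.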

0 Answers