I have created a k8s cluster with kops (1.21.4) on AWS and, as per the docs on the autoscaler, I have made the required changes to my cluster. But when the cluster starts, the cluster-autoscaler pod is unable to schedule on any node. When I describe the pod, I see the following:
Events:
  Type     Reason            Age                   From               Message
  ----     ------            ----                  ----               -------
  Warning  FailedScheduling  4m31s (x92 over 98m)  default-scheduler  0/4 nodes are available: 1 Too many pods, 3 node(s) didn't match Pod's node affinity/selector.
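For context, enabling the autoscaler followed the kops docs; the addition to the cluster spec looks roughly like this (a sketch of the documented fields, not my exact manifest):

spec:
  clusterAutoscaler:
    enabled: true
    expander: least-waste
    skipNodesWithSystemPods: true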
Looking at the deployment for the cluster-autoscaler, I see the following podAntiAffinity:
affinity:
  podAntiAffinity:
    preferredDuringSchedulingIgnoredDuringExecution:
    - podAffinityTerm:
        labelSelector:
          matchExpressions:
          - key: app
            operator: In
            values:
            - cluster-autoscaler
        topologyKey: topology.kubernetes.io/zone
      weight: 100
    requiredDuringSchedulingIgnoredDuringExecution:
    - labelSelector:
        matchExpressions:
        - key: app
          operator: In
          values:
          - cluster-autoscaler
      topologyKey: kubernetes.com/hostname
From this I understand that it wants to prevent running the pod on a node that already has a cluster-autoscaler pod running. But that doesn't seem to explain the error seen in the pod status.
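One thing I can check is whether that topologyKey exists as a label on any node at all; the well-known hostname label in Kubernetes is kubernetes.io/hostname, whereas the manifest above says kubernetes.com/hostname. Something like this lists the hostname-related labels the nodes actually carry:

# Show node labels, one per line, and filter for hostname keys
kubectl get nodes --show-labels | tr ',' '\n' | grep hostname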
Edit: The autoscaler pod has the following nodeSelectors and tolerations:
Node-Selectors:  node-role.kubernetes.io/master=
Tolerations:     node-role.kubernetes.io/master op=Exists
                 node.kubernetes.io/not-ready:NoExecute op=Exists for 300s
                 node.kubernetes.io/unreachable:NoExecute op=Exists for 300s
So it should clearly be able to schedule on the master node too.
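Since the master is the one node rejected with "Too many pods", its allocatable pod count can be compared against what is already scheduled there, along these lines (the node name below is a placeholder):

# Allocatable pod capacity on the master node
kubectl get node <master-node> -o jsonpath='{.status.allocatable.pods}'
# Number of pods already scheduled on that node
kubectl get pods -A --field-selector spec.nodeName=<master-node> --no-headers | wc -l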
I am not sure what else I need to do to get the pod up and running.