I have a cluster that I recently upgraded from 1.22 to 1.23. On that cluster I have MongoDB deployed via a Helm chart (v13.1.3) and some other custom pods. About 2 days after upgrading to 1.23, my Mongo started crashing and restarting every 5 minutes or so. The cluster has a single node, and I noticed that around the time the crashes started, it was at about 100% utilization (apparently it got there gradually over those 2 days). However, the autoscaler did not kick in.
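For reference, these are roughly the checks I'm using to see what the node is doing; nothing here is cluster-specific, and the node name is a placeholder:

```
# Node CPU/memory usage (requires metrics-server)
kubectl top nodes

# Any pods stuck Pending? As far as I understand, the autoscaler only scales up
# when a pod cannot be scheduled, not when the node is merely busy.
kubectl get pods -A --field-selector=status.phase=Pending

# Resource requests vs. allocatable capacity on the single node (placeholder name)
kubectl describe node <node-name>
```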
My cluster autoscaler is set up as follows:
- Node count: 1
- Autoscaling: enabled
- min: 0
- max: 3
- No taints
Why didn't the autoscaler work? I tried setting min=2, but that didn't help; it still shows 1 node.
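For completeness, this is roughly what the node pool configuration and the min=2 change look like on my side (this is GKE; the cluster, zone, and pool names below are placeholders):

```
# Current autoscaling settings of the node pool
gcloud container node-pools describe default-pool \
  --cluster my-cluster --zone europe-west1-b \
  --format="yaml(autoscaling)"

# Equivalent of bumping the minimum node count to 2
gcloud container clusters update my-cluster --zone europe-west1-b \
  --node-pool default-pool \
  --enable-autoscaling --min-nodes 2 --max-nodes 3
```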
I checked the autoscaler logs, but the only errors there are `noScaleDown` events, which have been occurring for the past 2 months or so, long before the cluster upgrade and long before the crashes started, so I don't really know if they're related. I'll post the logged reason here anyway:
reason: {
  messageId: "no.scale.down.node.pod.kube.system.unmovable"
  parameters: [
    0: "xxxx"
  ]
}
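Since that message appears to be about a kube-system pod that can't be moved, I also looked at which system pods are running on the node and whether they have PodDisruptionBudgets (from what I've read, kube-system pods without a PDB are a common cause of this particular reason):

```
# Which kube-system pods are on the node, and do any PDBs cover them?
kubectl get pods -n kube-system -o wide
kubectl get pdb -n kube-system
```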
Aside from that, there were no `noScaleUp` events or any other kind of event.
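In case it helps, this is roughly the query I used to pull the autoscaler events out of Cloud Logging (the project ID is a placeholder):

```
gcloud logging read \
  'resource.type="k8s_cluster" AND logName:"cluster-autoscaler-visibility"' \
  --project my-project --freshness 30d --limit 50
```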
Any ideas what to look for?