I am currently running Kubernetes 1.9.7 and successfully using the Cluster Autoscaler and multiple Horizontal Pod Autoscalers.
However, I recently noticed that the HPA appears to favour newer pods when scaling down: it removes the older replicas and keeps the most recently created ones.
For example, I have 1 replica of service A running on a node alongside several other services. This node has plenty of available resources. Under load, the CPU utilisation of service A rose above the configured target, so the HPA scaled it to 2 replicas. As there were no other nodes with capacity available, the Cluster Autoscaler (CAS) spun up a new node, on which the new replica was successfully scheduled - so far so good!
The problem is that when the CPU utilisation drops back below the configured target, the HPA scales down to 1 replica. I would expect the new replica on the new node to be removed, allowing the CAS to turn off that new node. Instead, the original service A replica, which was running on the node with plenty of available resources, was removed. This means I now have service A running by itself on a new node that the CAS can't remove, even though there is plenty of room for service A to be scheduled on the existing node.
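For reference, the replica count the HPA arrives at in this scenario can be sketched roughly as below. This is only an illustration based on the formula given in the HPA documentation (desired = ceil(current replicas × current utilisation / target utilisation), skipped when the ratio is within a tolerance that defaults to 0.1). The function name, the `min_replicas`/`max_replicas` parameters and the utilisation figures are hypothetical and not taken from my cluster.

```python
import math


def desired_replicas(current_replicas, current_utilisation, target_utilisation,
                     min_replicas=1, max_replicas=10, tolerance=0.1):
    """Sketch of the HPA replica calculation (hypothetical helper, not real HPA code)."""
    ratio = current_utilisation / target_utilisation
    # Within the tolerance band the HPA leaves the replica count unchanged.
    if abs(ratio - 1.0) <= tolerance:
        return current_replicas
    desired = math.ceil(current_replicas * ratio)
    # Clamp to the configured bounds (assumed values in this sketch).
    return max(min_replicas, min(max_replicas, desired))


# Illustrative numbers only: target of 50% CPU.
scale_up = desired_replicas(1, 90, 50)    # load spike with 1 replica
scale_down = desired_replicas(2, 20, 50)  # load drops with 2 replicas
print(scale_up, scale_down)
```

Note that this calculation only yields a number of replicas. Nothing in it says which of the two pods should be removed when the count goes from 2 back to 1, which is the part I am asking about.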
Is this a problem with the HPA or the Kubernetes scheduler? Service A has now been running on the new node for 48 hours and still hasn't been rescheduled despite there being more than enough resources on the existing node.