
I have a Kubernetes cluster in Amazon EKS with autoscaling enabled. When load increases, a new node spins up in the cluster, and nodes spin down again when load drops. We monitor the cluster with Prometheus and send alerts through Alertmanager.

Please help me with a query that will fire an alert whenever autoscaling occurs in my cluster.

iemkamran
  • How have you configured autoscaling? If it is done via a Prometheus metric, you can probably use the same metric for alerts as well. – Krishna Chaurasia Feb 10 '21 at 06:29
  • Thanks for your response. I have configured autoscaling in EKS, and we monitor our Kubernetes cluster via Prometheus. A Prometheus query would be helpful. – iemkamran Feb 10 '21 at 11:41

1 Answer


The logic is not great, but this works for me in a self-hosted (non-EKS) Kubernetes cluster on AWS EC2 instances.

(group by (kubernetes_io_hostname, kubernetes_io_role) (container_memory_working_set_bytes ) * 0

The expression above selects the nodes that are currently up and multiplies their values by 0.

or group by (kubernetes_io_hostname, kubernetes_io_role) (delta ( container_memory_working_set_bytes[1m]))) == 1

This part adds all nodes that existed during the last minute, found through the delta() function. Because group by assigns the value 1 to every group, each node in the delta() output has the value 1 by default. For nodes that also appear on the left-hand side, the value 0 wins, because the or operator keeps the left-hand element whenever both sides carry the same label set. As a result, only the newly provisioned node(s) keep the value 1, and the equality condition filters for exactly those. The kubernetes_io_role label also tells you whether the new node is a master or a worker.

Full Query:

(group by (kubernetes_io_hostname, kubernetes_io_role) (container_memory_working_set_bytes) * 0 or group by (kubernetes_io_hostname, kubernetes_io_role) (delta(container_memory_working_set_bytes[1m]))) == 1
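Since the question asks for an alert through Alertmanager, the query can be wrapped in a Prometheus alerting rule. The following is only a minimal sketch: the group name, alert name, severity label, and annotation text are hypothetical placeholders, and it assumes your container metrics carry the kubernetes_io_hostname and kubernetes_io_role labels as they do in my setup.

```yaml
groups:
  - name: node-autoscaling            # hypothetical group name
    rules:
      - alert: NewNodeProvisioned     # hypothetical alert name
        # Same expression as the full query, split across lines for readability.
        expr: |
          (
            group by (kubernetes_io_hostname, kubernetes_io_role) (container_memory_working_set_bytes) * 0
            or
            group by (kubernetes_io_hostname, kubernetes_io_role) (delta(container_memory_working_set_bytes[1m]))
          ) == 1
        labels:
          severity: info              # assumed severity; adjust to your Alertmanager routing
        annotations:
          summary: "Node {{ $labels.kubernetes_io_hostname }} ({{ $labels.kubernetes_io_role }}) matched the new-node query"
```

No for: clause is set here, so the alert would fire as soon as the expression returns a result. Because the expression only matches for a short window, adding a long for: duration would likely prevent it from firing at all.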

You can reverse this query to detect nodes being scaled down, although it will then also match cases in which a Kubernetes node shuts down abruptly for reasons other than autoscaling.