I'm trying to monitor the status of my EKS node groups. Sometimes a node group shows as Degraded, and I want a CloudWatch alert whenever that happens. I checked CloudWatch Metrics and found no standard metric for this, and I can't find a corresponding event in CloudTrail either.

Is it possible to create such an alarm using AWS CloudTrail events, EventBridge, or CloudWatch? Any help finding a solution would be appreciated.

sachin_ur

2 Answers

I think you can combine Lambda, CloudWatch, and EventBridge to implement a simple health check for one or more node groups.

For the health-check Lambda function:

  1. We create a Lambda function with Python 3 (3.9, for example)
  2. We describe the node group using Boto3
  3. We publish a custom metric to CloudWatch: 1 if the status is Active, otherwise 0
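
The three steps above could be sketched roughly as follows. This is a minimal sketch, not a tested deployment: the namespace `Custom/EKS`, the metric name `NodegroupActive`, the dimension names, and the `CLUSTER_NAME` environment variable are all assumptions made for illustration. Note that with this mapping any non-`ACTIVE` status (for example `UPDATING`) also reports 0, not only `DEGRADED`.

```python
import os

# Hypothetical names chosen for this sketch; pick whatever suits your account.
METRIC_NAMESPACE = "Custom/EKS"
METRIC_NAME = "NodegroupActive"


def status_to_value(status):
    """Map an EKS node group status to 1 (ACTIVE) or 0 (anything else)."""
    return 1 if status == "ACTIVE" else 0


def publish_nodegroup_health(eks, cloudwatch, cluster, nodegroup):
    """Describe one node group and publish its health as a custom metric."""
    response = eks.describe_nodegroup(clusterName=cluster, nodegroupName=nodegroup)
    status = response["nodegroup"]["status"]
    value = status_to_value(status)
    cloudwatch.put_metric_data(
        Namespace=METRIC_NAMESPACE,
        MetricData=[
            {
                "MetricName": METRIC_NAME,
                "Dimensions": [
                    {"Name": "ClusterName", "Value": cluster},
                    {"Name": "NodegroupName", "Value": nodegroup},
                ],
                "Value": value,
                "Unit": "Count",
            }
        ],
    )
    return status, value


def lambda_handler(event, context):
    """Lambda entry point: check every node group in the configured cluster."""
    import boto3  # imported here so the helpers above can be tested without AWS

    cluster = os.environ["CLUSTER_NAME"]  # assumed to be set on the function
    eks = boto3.client("eks")
    cloudwatch = boto3.client("cloudwatch")
    # list_nodegroups is paginated; this sketch only reads the first page.
    names = eks.list_nodegroups(clusterName=cluster)["nodegroups"]
    return {
        name: publish_nodegroup_health(eks, cloudwatch, cluster, name)[0]
        for name in names
    }
```

The function's execution role would need at least `eks:ListNodegroups`, `eks:DescribeNodegroup`, and `cloudwatch:PutMetricData`.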

Once the function is ready, we schedule it to run every minute (the interval is up to you).

  1. We create an EventBridge (EB) rule that triggers every minute
  2. The EB rule's target is the Lambda function
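
The schedule could also be set up from Boto3 instead of the console. The sketch below assumes the Lambda function already exists; the rule name, target ID, and statement ID are hypothetical, and the clients are passed in so the function can be exercised without AWS credentials.

```python
def schedule_lambda(events, lambda_client, function_arn,
                    rule_name="eks-nodegroup-health-check",  # hypothetical name
                    schedule="rate(1 minute)"):
    """Create a scheduled EventBridge rule that invokes the given Lambda."""
    # 1. Create (or update) the scheduled rule.
    rule_arn = events.put_rule(
        Name=rule_name,
        ScheduleExpression=schedule,
        State="ENABLED",
    )["RuleArn"]

    # 2. Allow EventBridge to invoke the function. This call fails with a
    #    conflict error if the same StatementId was already added.
    lambda_client.add_permission(
        FunctionName=function_arn,
        StatementId=f"{rule_name}-invoke",
        Action="lambda:InvokeFunction",
        Principal="events.amazonaws.com",
        SourceArn=rule_arn,
    )

    # 3. Point the rule at the Lambda function.
    events.put_targets(
        Rule=rule_name,
        Targets=[{"Id": "health-check-lambda", "Arn": function_arn}],
    )
    return rule_arn
```

With real clients this would be called as `schedule_lambda(boto3.client("events"), boto3.client("lambda"), function_arn)`, where `function_arn` is the ARN of your own health-check function.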

Once CloudWatch has enough data points for the metric, we can create a CloudWatch alarm that notifies us by e-mail or another channel.
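
As a rough sketch, such an alarm could be created with Boto3's `put_metric_alarm`. The namespace, metric name, and dimension names below are assumptions and must match whatever the Lambda actually publishes; the SNS topic (for example one with an e-mail subscription) is assumed to exist already. Treating missing data as breaching is a design choice in this sketch, so that a failing Lambda also raises the alarm.

```python
# Hypothetical names; they must match the custom metric the Lambda publishes.
METRIC_NAMESPACE = "Custom/EKS"
METRIC_NAME = "NodegroupActive"


def build_alarm_params(cluster, nodegroup, sns_topic_arn):
    """Build put_metric_alarm arguments for one node group."""
    return {
        "AlarmName": f"eks-nodegroup-not-active-{cluster}-{nodegroup}",
        "AlarmDescription": "EKS node group status is not ACTIVE",
        "Namespace": METRIC_NAMESPACE,
        "MetricName": METRIC_NAME,
        "Dimensions": [
            {"Name": "ClusterName", "Value": cluster},
            {"Name": "NodegroupName", "Value": nodegroup},
        ],
        "Statistic": "Minimum",       # any 0 within the period counts
        "Period": 60,                 # seconds, matching a 1-minute schedule
        "EvaluationPeriods": 3,       # alarm after 3 consecutive bad minutes
        "Threshold": 1,
        "ComparisonOperator": "LessThanThreshold",
        "TreatMissingData": "breaching",
        "AlarmActions": [sns_topic_arn],
    }


def create_alarm(cloudwatch, cluster, nodegroup, sns_topic_arn):
    """Create or update the alarm using a CloudWatch client."""
    cloudwatch.put_metric_alarm(**build_alarm_params(cluster, nodegroup, sns_topic_arn))
```

With a real client this would be called as `create_alarm(boto3.client("cloudwatch"), cluster, nodegroup, topic_arn)`.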

Binh Nguyen

For CloudWatch, please take a look at this:

https://docs.aws.amazon.com/de_de/AmazonCloudWatch/latest/monitoring/deploy-container-insights-EKS.html

  • This guide covers how to set up Container Insights, but it has no metrics that match my requirement. Is there any other solution that could be implemented? @Christoph Fischer – sachin_ur Oct 21 '22 at 11:38
  • In a previous project, we used Prometheus+Grafana to get metrics from the EKS cluster. – Christoph Fischer Oct 21 '22 at 12:22