We have configured metrics for our application in Prometheus and Grafana. I am getting alerts intermittently, and only for short durations, so I am unable to capture the error that caused the metric to go down. When I check Prometheus after an alert arrives, everything looks fine and all services are up and running, so I cannot see the exact error that caused the system to go down. How can I implement a script to capture that error from Prometheus, and which parameters do I need to include in that script?
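To make the question concrete, here is a minimal sketch of the kind of script I have in mind. It asks the Prometheus HTTP API (`/api/v1/query_range`) for the built-in `up` metric over a recent window and lists every sample where a target reported 0. The server address `http://localhost:9090`, the one-hour lookback, the 15s step, and the function names are all assumptions for illustration. Note that Prometheus stores numeric samples rather than error messages, so this would only tell me which target dropped and when; the error text itself would presumably have to come from the application logs for that timestamp.

```python
import json
import time
import urllib.parse
import urllib.request

# Assumption: Prometheus is reachable at its default local address.
PROMETHEUS_URL = "http://localhost:9090"


def build_range_url(base_url, query, start, end, step="15s"):
    """Build a query_range URL; start and end are Unix timestamps in seconds."""
    params = urllib.parse.urlencode(
        {"query": query, "start": start, "end": end, "step": step}
    )
    return f"{base_url}/api/v1/query_range?{params}"


def find_down_samples(payload):
    """Return (job, instance, timestamp) for every sample whose value is 0."""
    down = []
    for series in payload["data"]["result"]:
        labels = series["metric"]
        # Each sample is a [timestamp, "value"] pair; the value is a string.
        for ts, value in series["values"]:
            if float(value) == 0.0:
                down.append((labels.get("job"), labels.get("instance"), float(ts)))
    return down


def fetch_down_samples(base_url=PROMETHEUS_URL, lookback_seconds=3600):
    """Query the last lookback_seconds of `up` and return the down samples."""
    end = time.time()
    url = build_range_url(base_url, "up", end - lookback_seconds, end)
    with urllib.request.urlopen(url, timeout=10) as resp:
        return find_down_samples(json.load(resp))
```

Calling `fetch_down_samples()` from a cron job or from an alert webhook would give the job, instance, and timestamp of each dip. Passing `ALERTS{alertstate="firing"}` as the query to `build_range_url` instead of `up` should show when each alert was firing, although those samples have the value 1, so the zero filter in `find_down_samples` would not apply to them.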

Rohankumar
