
I've been searching the web and forums but cannot find a resource on this.

What I am trying to achieve is an alert for when the data has not changed for a period of time.

We are monitoring open files for our web servers, and this number normally fluctuates often. We noticed that when the number is stagnant, it points to an issue on the server. So what we want is: if the open-file count remains at X for 2 minutes, alert us.

Julian Kriel

3 Answers


If you use Prometheus and Alertmanager, there is a nice function that worked for me:

changes

Using something like this as the alert expression will trigger if there were no changes during the time interval:

changes(metric_name[5m]) == 0
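
As a rough sketch, this is how the idea could look as a Prometheus alerting rule for the case in the question. The metric name `open_files`, the group and alert names, the `severity` label, and the annotation text are made up for illustration; the 2m window is taken from the question, and the `instance` label is assumed to be present on the series.

```yaml
groups:
  - name: stagnation-alerts            # hypothetical group name
    rules:
      - alert: OpenFilesStagnant       # hypothetical alert name
        # changes() counts how often the sample value changed inside the window;
        # 0 means every sample in the last 2 minutes had the same value.
        expr: changes(open_files[2m]) == 0
        labels:
          severity: warning            # assumed label, adjust to your routing
        annotations:
          summary: "open_files on {{ $labels.instance }} has not changed for 2 minutes"
```

The window already encodes the "no change for 2 minutes" condition, so no extra `for:` delay is set here. Note that if the series stops being scraped entirely, the expression returns nothing and this rule will not fire, so a frozen value and a missing value need separate alerts.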

Jade.G
  • Very nice. This worked perfectly for me, although I needed to use a reducer as well to make the alert work. Thanks for this. – Joe Steele Apr 18 '23 at 00:36

I made such an alert through a small succession of things:

  1. I have a dedicated 'alerting dummy board' for all the alerts, since I can only have one alert per graph (Grafana version 6.6.0).
  2. I use the following query: avg_over_time(delta(Sensor_Data[1m])[20s:]). This calculates the 20s average of the difference between the last and first values in each 1-minute window.
  3. My data gathering program feeds into Prometheus, which in turn feeds Grafana. If this program freezes, it might continue sending the last value to Prometheus, and the above query will drop to exactly zero.
  4. So I have an alert which goes off if the above query stays within the range (-0.01, 0.01) for a minute (with the system running, a typical value is abs(query) > 0.18).

Thus, Grafana sends an alert if the Sensor_Data value does not change within about 2-3 minutes.
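
The answer sets this up in the Grafana alert tab, but the same condition could probably be written as a Prometheus alerting rule instead. This is only a sketch of that translation: the group name, alert name, and annotation text are invented, and it assumes the rule evaluation interval is well below 20s so that the subquery actually contains several points.

```yaml
groups:
  - name: sensor-stagnation            # hypothetical group name
    rules:
      - alert: SensorDataFrozen        # hypothetical alert name
        # Same query as in step 2, wrapped in abs() so that the
        # range (-0.01, 0.01) becomes a single "< 0.01" comparison.
        # The "[20s:]" subquery has no explicit resolution, so it falls
        # back to the global evaluation interval.
        expr: abs(avg_over_time(delta(Sensor_Data[1m])[20s:])) < 0.01
        for: 1m                        # "within the range for a minute" from step 4
        annotations:
          summary: "Sensor_Data has barely changed for the last few minutes"
```

The 0.01 threshold only makes sense relative to the typical value of about 0.18 mentioned above, so it would need tuning for any other metric.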

bklebel

This has worked for me. Make sure you're using a rate or increase function over a time range (no change means it will drop to zero), for example the 2 minutes from the question, and filter the query like the following:

increase(metric_name[2m]) > 0

Then, in Alert Config, set "If no data or all values are null" to "Alerting". That way, when there's no data, the alert will be triggered.

parliamentowl
  • That is a whole different thing. You are using data that by itself is an indicator of change. The question is about arbitrary metrics stagnating at any value. – bugmenot123 Feb 21 '20 at 17:00