
I have a service metric that returns either some positive value, or 0 in case of failure. I want to count how many seconds my service was failing during some time period.

E.g. the expression:

service_metric_name == 0

gives me a dashed line in Grafana:

line_of_downtime

Is there any way to count how many seconds my service was down for the last 2 hours?

1 Answer


I assume the metric is either 0 when the service is down or 1 when it is up.

In this case you can calculate an average over a time range. If the result is 0.9, your service has been up for 90% of the time. Averaged over one hour, that corresponds to 6 minutes of downtime out of 60 minutes.

avg_over_time(service_metric_name[1h])
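
As a rough sketch of how that average translates into the seconds of downtime asked about, assuming the metric really only takes the values 0 and 1 and the window is the last 2 hours (7200 seconds), the downtime fraction can be scaled by the window length:

```promql
# Fraction of the last 2h spent down (1 - average uptime), times the window length in seconds.
# Assumes service_metric_name only takes the values 0 (down) and 1 (up).
(1 - avg_over_time(service_metric_name[2h])) * 2 * 3600
```

The result is only as precise as the scrape interval, since each sample stands in for the whole interval around it.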

This will be a moving average. That is, while your service is down the value will slowly decrease, and once your service is up again it will slowly increase.
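
If the metric reports arbitrary positive values while the service is healthy, as described in the question, a plain average no longer reads as an uptime ratio. One possible workaround, assuming a Prometheus version with subquery support (2.7 or later) and an illustrative 1m resolution, is to turn the metric into a 0/1 "is down" signal first:

```promql
# (service_metric_name == bool 0) yields 1 while the metric is 0 (down), otherwise 0.
# The [2h:1m] subquery re-evaluates that expression every minute over the last 2 hours;
# the 1m step is an assumption and should roughly match the scrape interval.
avg_over_time((service_metric_name == bool 0)[2h:1m]) * 2 * 3600
```

This gives an approximate number of seconds of downtime in the last 2 hours. Periods in which the metric was missing entirely are not counted as downtime by this expression.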

ahus1