My Spring Boot application exposes the metric resilience4j_timelimiter_calls_total, which appears to be a counter giving the total number of timeouts that have happened up to the current time.

I have used sum by (service, name) (rate(resilience4j_timelimiter_calls_total{service="service-name"}[5m])) to find the number of timeouts happening over 5m.

I want to find the total number of failures of the downstream API over some time period x, including both timeouts and 4xx/5xx exceptions, i.e. the number of times the resilience4j circuit breaker directed the call to the fallback method.

Is there a Resilience4j metric exposed to Prometheus that gives this?

tusharRawat

1 Answer

rate calculates the per-second average rate of increase of the time series in the range vector.

Based on your description, I believe you actually need increase. It calculates the total increase of the metric over the range provided.

Your query would be

sum by (service, name) (increase(resilience4j_timelimiter_calls_total{service="service-name"}[5m]))
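
To make the difference between rate and increase concrete, here is a minimal Python sketch of the arithmetic on a single counter series. It is only an illustration: the sample values are made up, the helper names are hypothetical, and it ignores the extrapolation to the window boundaries that Prometheus applies, so real query results can differ slightly.

```python
from typing import List, Tuple

# (timestamp in seconds, counter value); the values below are invented for illustration
Sample = Tuple[float, float]


def increase(samples: List[Sample]) -> float:
    """Total growth of a counter across the samples, tolerating counter resets."""
    total = 0.0
    for (_, prev), (_, curr) in zip(samples, samples[1:]):
        # A drop means the counter restarted from zero, so count curr in full.
        total += curr - prev if curr >= prev else curr
    return total


def rate(samples: List[Sample]) -> float:
    """Average per-second growth over the time span covered by the samples."""
    elapsed = samples[-1][0] - samples[0][0]
    return increase(samples) / elapsed if elapsed > 0 else 0.0


# Hypothetical scrapes of the counter, one per minute across a 5m window
window = [(0, 100), (60, 103), (120, 103), (180, 109), (240, 112), (300, 115)]

print(increase(window))  # number of events in the window
print(rate(window))      # events per second, averaged over the window
```

In this simplified model, increase is just rate multiplied by the window length in seconds, which is why rate over [5m] returns a small per-second figure rather than a count of timeouts.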
markalex
  • How can I find the total number of failures of the downstream API over a time period, including timeouts? Is there a metric for this? – tusharRawat May 08 '23 at 04:48
  • @tusharRawat, that depends on the exporter you're using. I'm not familiar with resilience4j; all I can advise is to look into its documentation, specifically the description of the exposed metrics. – markalex May 09 '23 at 10:01