I have a metric varnish_main_client_req of type counter, and I want to set up an alert that triggers if the rate of requests drops or rises by a certain amount within a given time (e.g. "Amount of requests deviated in the last 2 min!").

Using the deriv() function should work much better than comparing relative values, but it can only be used with gauges. Is it possible to convert an ever-increasing metric (a counter) into a rate-based metric (a gauge)?

Query: deriv(rate(varnish_main_client_req[2m])[5m])

Expectation: Prometheus calculates the rate of client requests over the last 2 mins and returns a derivative of the resulting values over the last 5 mins.

Actual result:

"error": "parse error at char 48: range specification must be preceded by a metric selector, but follows a *promql.Call instead"

Recording rules might be an option, but they feel like a cheap workaround for something that should work with queries:

my_gauge_metric = rate(some_counter_metric[2m])

Castrohenge
Paul Voss

3 Answers

Solution

It's possible with the subquery-syntax (introduced in Prometheus version 2.7):

deriv(rate(varnish_main_client_req[2m])[5m:10s])

Warning: Subqueries are expensive and can put a very high load on Prometheus. Use recording rules for queries you run regularly (in alerts, etc.).

Subquery syntax

<instant_query>[<range>:<resolution>]
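  • instant_query: a PromQL expression that returns an instant vector
  • range: how far back in time, from the evaluation time, the subquery reaches
  • resolution: the step between successive evaluations of instant_query (optional; if omitted, the global evaluation interval is used)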

It returns a range-vector.

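In the example above, Prometheus evaluates rate(varnish_main_client_req[2m]) (= instant_query) roughly 30 times, once every 10 seconds across the last 5 minutes. Each evaluation looks back over its own 2-minute window, so consecutive windows overlap. The resulting range vector is the input to the deriv() function.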
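To make the evaluation order concrete, here is a rough Python sketch of what a subquery such as deriv(rate(...[2m])[5m:10s]) computes. It is only an illustration, not Prometheus code: the helper names (rate, subquery, deriv) and the synthetic counter are assumptions made for this example, and the rate() here ignores counter resets and the extrapolation that real Prometheus performs. Step alignment and boundary handling are also simplified.

```python
def rate(samples, t, window):
    """Simplified per-second rate of a counter over [t - window, t].

    samples: sorted list of (timestamp_seconds, counter_value).
    Assumption: no counter resets and no extrapolation, unlike real Prometheus.
    """
    pts = [(ts, v) for ts, v in samples if t - window <= ts <= t]
    if len(pts) < 2:
        return None
    (t0, v0), (t1, v1) = pts[0], pts[-1]
    return (v1 - v0) / (t1 - t0)


def subquery(samples, now, rng, step, window):
    """Evaluate rate() at every step-aligned timestamp in [now - rng, now]."""
    start = now - rng
    t = -(-start // step) * step  # round start up to a multiple of step
    out = []
    while t <= now:
        r = rate(samples, t, window)
        if r is not None:
            out.append((t, r))
        t += step
    return out


def deriv(points):
    """Slope of a least-squares line through (timestamp, value) points."""
    n = len(points)
    mean_t = sum(t for t, _ in points) / n
    mean_v = sum(v for _, v in points) / n
    cov = sum((t - mean_t) * (v - mean_v) for t, v in points)
    var = sum((t - mean_t) ** 2 for t, _ in points)
    return cov / var


# Synthetic counter scraped every 10 s whose request rate grows by
# 0.02 req/s each second: value(t) = 5*t + 0.01*t^2, with t = 10*k.
samples = [(10 * k, 50 * k + k * k) for k in range(61)]

# Equivalent of deriv(rate(counter[2m])[5m:10s]) evaluated at t = 600 s.
points = subquery(samples, now=600, rng=300, step=10, window=120)
slope = deriv(points)
print(len(points), slope)
```

In this sketch both ends of the 5-minute range are included, which is why it produces one more point than the "roughly 30" mentioned above. The slope it prints is the growth of the request rate in requests per second, per second.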

Another example (using a metric available on most Prometheus instances):

deriv(rate(prometheus_http_request_duration_seconds_sum{job="prometheus"}[1m])[5m:10s])

Without the subquery-range ([5m:10s]), you'll get this error-message:

parse error at char 80: expected type range vector in call to function "deriv", got instant vector

Dominik

Yes, you need to use a recording rule for this.

Prometheus calculates the rate of client requests over the last 2 mins and returns a derivative of the resulting values over the last 5 mins.

Herein lies the problem - at what interval should Prometheus synthesise this data?

brian-brazil
  • Every time the query is executed, I suppose. I thought I could treat the return value of functions the same way I use instant vectors. I'm going to try the recording rule, thanks for your help. – Paul Voss Nov 21 '16 at 14:54
  • Recording rules are probably the solution, but I think you need to explain the issue further. As someone who doesn't know Prometheus too well, I would ask why I can't take the "rate" result as a series of points that were recorded every 2 minutes as an instant vector. – dtc Jul 12 '18 at 16:35
  • It seems like support for subqueries was added after this answer was posted: https://prometheus.io/blog/2019/01/28/subquery-support/#examples – Sid Jan 31 '20 at 09:31

A range vector selector (metric_name[4m]) selects a range of samples directly from the TSDB (raw values). It cannot be applied to the result of another query (a derived value), so a query like avg_over_time(rate(metric_name[4m])[4m]) doesn't work. For this you need a subquery, [<duration>:<resolution>]. With this syntax the inner query is evaluated first, once per resolution step across the given duration, and the outer query is then applied to the resulting range vector.

However, if you don't want to use a subquery, another way is a recording rule, which stores the result of the inner query in the TSDB so that you can apply a range vector selector to it, as below.

Recording rule: abc:metric_name:rate4m = rate(metric_name[4m])

Query: deriv(abc:metric_name:rate4m[4m])

Summary: a range vector selector can only be applied to raw values (from the TSDB), not to derived values (like the result of the rate function in this case).

Sudhakar MNSR