
I have a counter in prometheus

I want to plot its raw value, but adjusted for resets, i.e. if it goes

raw: 0 1 4 6 1 3  4
res: 0 1 4 6 7 9 10
             ^
           reset

Then I also want to subtract the value at the left edge, so I get the growth over time within the selected interval.
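To pin down the arithmetic being asked for, here is a minimal Python sketch, independent of Prometheus. The function names `reset_adjusted` and `growth_since_start` are hypothetical, and the sketch assumes that any drop in the raw value is a reset to zero.

```python
def reset_adjusted(raw):
    """Return the counter values with resets folded in, so the result never decreases."""
    adjusted = []
    offset = 0    # total value carried over from before each reset
    prev = None   # previous raw value, None before the first sample
    for value in raw:
        if prev is not None and value < prev:
            # Assumption: a drop means the counter restarted from zero,
            # so everything counted before the drop is carried forward.
            offset += prev
        adjusted.append(value + offset)
        prev = value
    return adjusted


def growth_since_start(raw):
    """Reset-adjusted values minus the value at the left edge of the interval."""
    adjusted = reset_adjusted(raw)
    if not adjusted:
        return []
    return [v - adjusted[0] for v in adjusted]
```

The second function is the series I want to plot: it starts at zero at the left edge of the interval and can only go up.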

I was able to do this with this expression:

increase(counter[1y]) - (increase(counter[1y] @ start()))

Here 1y is just a very large range, so that increase covers every point.

There are two problems:

  1. It's quite inefficient and takes some time to evaluate.
  2. It also includes series that have no data in the interval (i.e. if there was a counter{foo="foo"} a long time ago, it has no points in the interval, but it still appears in the legend in Grafana).

I can kinda solve the second problem with

(increase(counter[1y]) - (increase(counter[1y] @ start()))) > 0

It will also filter out actual points with zero values, but I can live with that.

But this seems like a very basic aggregation, so I suspect I'm doing something wrong, but I couldn't figure out a better way to compute this.

Is there a better way?

UPD:

This is what I want (and already have): [image: desired]

This is just increase(..[$__range]): [image: only increase with range]

This is increase(..[$__range]) - increase(..[$__range] @ start()): [image: increase with range and subtraction]

Notice that in my desired picture all the plots only go up.

Masafi

2 Answers


I don't think this is in any form a "very basic aggregation", so there are no clean and simple ways to do what is described.

Your query seems fine. To hide series that are long gone you can use something like this:

increase(counter[1y]) - (increase(counter[1y] @ start())) 
 and last_over_time(counter[$__range]@end())

It will filter out series that were not present within the dashboard's time range.
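As a rough illustration of why `and` does the filtering, here is a small Python sketch of its matching rule: a left-hand sample is kept only if the right-hand side has a sample with the same label set, and the right-hand values are ignored. The function `vector_and` and the sample data are hypothetical, and this is a simplified model rather than Prometheus code.

```python
def vector_and(left, right):
    """Keep the left-hand samples whose label set also appears on the right."""
    # Only the label sets of the right-hand side matter, not its values.
    right_labels = {frozenset(labels.items()) for labels, _ in right}
    return [
        (labels, value)
        for labels, value in left
        if frozenset(labels.items()) in right_labels
    ]


# Left side: the growth expression, which still returns the long-gone series.
growth = [({"foo": "foo"}, 0.0), ({"foo": "bar"}, 12.0)]
# Right side: series that had at least one sample inside the dashboard range.
present = [({"foo": "bar"}, 40.0)]

kept = vector_and(growth, present)
print(kept)  # [({'foo': 'bar'}, 12.0)]
```

Unlike the `> 0` filter, this keeps series whose growth is genuinely zero, as long as they have samples in the range.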

Regarding performance: you yourself have shown that no dynamic range works here, and that a large predefined constant range selector is needed. Given how increase works inside Prometheus, computing this will always take some time.

markalex
  • What I want is just to count `increase` based on fixed point for all points, not relative to each other. That sounds simpler, than default `increase` implementation. Using `1y` is important, because if I do `$__range`, this metric can go up and down, because it will basically count `points[ts] - points[ts-range]`, and I want `points[ts]-0`. The last part is true, and I did that, but it isn't really about the main question – Masafi Jul 11 '23 at 17:44
  • What I want - `value[ts] - 0`, which is `value[ts]`, as I said (I want raw counter metric), but with adjustments for resets. As this is counter and my metric can only go up, so is this value. Instead, `increase` will calculate `value[ts]-value[ts-range]`, which can go up and down, but I want to see growth over time. – Masafi Jul 11 '23 at 17:46
  • Another way to say it: I want to `cumsum` the growth (difference/rate) between each point, again, adjusted to resets. But there is no `cumsum` function as I understand – Masafi Jul 11 '23 at 17:48
  • @Masafi, updated answer with the way to filter out old time-series. Don't think any performance improvements are possible. – markalex Jul 15 '23 at 20:57

I'm unsure whether Prometheus supports a proper and fast solution for this task, but it can be solved efficiently with the following MetricsQL query:

running_sum(increase(counter))

It works in the following way:

  • increase(counter) calculates the increase between adjacent points on the graph for each time series with the name counter. Note that increase() omits the square brackets with the lookbehind window; in this case MetricsQL automatically uses the step value as the lookbehind window. This value is calculated by Grafana from the selected time range and the horizontal resolution of the graph, and Grafana passes it to the Prometheus datasource API at /api/v1/query_range. This value is also known as $__interval in Grafana.

  • running_sum(...) calculates the running sum over the per-step increases calculated at the previous step. The running sum is calculated individually for each selected time series. If you want the total sum across all the selected series, just wrap the query in sum().
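The two steps above can be sketched in Python for a single series. This is a simplified model and not MetricsQL's implementation: `step_increases` and `running_sum` are hypothetical helpers, a drop in value is assumed to be a reset to zero, and the first sample ever seen is assumed to contribute no increase.

```python
from itertools import accumulate


def step_increases(samples, start, end, step):
    """Per-step increase for one series; samples is a time-sorted list of (timestamp, value)."""
    increases = []
    idx = 0
    prev = None  # last raw value consumed so far
    # Samples at or before start - step only provide the baseline value.
    while idx < len(samples) and samples[idx][0] <= start - step:
        prev = samples[idx][1]
        idx += 1
    t = start
    while t <= end:
        inc = 0
        # Consume every raw sample in the window (t - step, t].
        while idx < len(samples) and samples[idx][0] <= t:
            value = samples[idx][1]
            if prev is not None:
                # Assumption: a drop means the counter restarted from zero.
                inc += value if value < prev else value - prev
            prev = value
            idx += 1
        increases.append(inc)
        t += step
    return increases


def running_sum(values):
    """Cumulative sum of the per-step increases."""
    return list(accumulate(values))


samples = [(0, 5), (15, 8), (30, 2), (45, 6)]  # the counter resets before t=30
increases = step_increases(samples, start=0, end=45, step=15)
print(increases)               # [0, 3, 2, 4]
print(running_sum(increases))  # [0, 3, 5, 9]
```

Because each step only looks at its own window, the cost grows with the number of raw samples in the graph's range rather than with a fixed large lookbehind such as 1y.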

Note also that Prometheus may return unexpected results from the increase() function because of its chosen data model. For example, it doesn't take into account the increase between the raw sample just before the lookbehind window and the first raw sample inside the lookbehind window. It can also return non-integer results when applied to an integer counter, because of extrapolation. See this Prometheus issue for details. See also this design doc. MetricsQL is free from these issues.

valyala