
I created a script to test this: it gets the CPU usage in millicores (e.g. 50m = 50/1000 of a CPU core) using rate(), and then computes the average so I can compare the two results:

window_sec=120
curl -s -G "http://master.com:30355/api/v1/query_range" \
    --data-urlencode "query=rate(container_cpu_usage_seconds_total{namespace='arka-ingress', \
    pod='pod', container=''}[${window_sec}s]) * 1000" \
    --data-urlencode "start=$(($(date +%s)-window_sec))" \
    --data-urlencode "end=$(date +%s)" \
    --data-urlencode "step=15s" | jq

curl -s -G "http://master.com:30355/api/v1/query" \
    --data-urlencode "query=avg(rate(container_cpu_usage_seconds_total{namespace='arka-ingress', \
    pod='pod', container=''}[${window_sec}s])) * 1000" | jq

Here is the output:

{
  "status": "success",
  "data": {
    "resultType": "matrix",
    "result": [
      {
        "metric": {
          "beta_kubernetes_io_arch": "amd64",
          "beta_kubernetes_io_os": "linux",
          "cpu": "total",
          "feature_node_kubernetes_io_network_sriov_capable": "true",
          "id": "/kubepods.slice/kubepods-burstable.slice/kubepods-burstable-pod0c5de2b8_2dea_4dd6_9a5c_bef29bc01c35.slice",
          "instance": "master.com",
          "job": "kubernetes-nodes-cadvisor",
          "kubernetes_io_arch": "amd64",
          "kubernetes_io_hostname": "master.com",
          "kubernetes_io_os": "linux",
          "namespace": "ingress",
          "node_openshift_io_os_id": "rhcos",
          "pod": "pod"
        },
        "values": [
          [
            1685838435,
            "3.0482985848760746"
          ],
          [
            1685838450,
            "3.0482985848760746"
          ],
          [
            1685838510,
            "3.604911126207486"
          ]
        ]
      }
    ]
  }
}



{
  "status": "success",
  "data": {
    "resultType": "vector",
    "result": []
  }
}

The data from the second response should be the average of the values in the first result, but it's empty. When I increase the window, window_sec, I do get a response, but it's not the average and the value is incorrect. I know the first query with rate() is correct; did I write the second query incorrectly?

mberge
  • At first glance your second query looks fine; I'm not sure why its result is empty. But I have to warn you that you might have misunderstood what `avg` does (or I misunderstood your wording). Please check out this: [max vs. max_over_time](https://stackoverflow.com/a/76009728/21363224). The same applies to avg. – markalex Jun 04 '23 at 06:02
  • The output of the first query command gives 3 values: `3.0482985848760746`, `3.0482985848760746`, `3.604911126207486`. The output of the second command should be the average of these 3 values which should be `3.23383609865`, but the result is empty for some reason. – mberge Jun 04 '23 at 19:42
  • I just want a single value, which is the result of the avg() function. – mberge Jun 04 '23 at 20:04
  • The min() and max() aggregate functions also return incorrect values when applied to the 3 values I showed in the example above. – mberge Jun 04 '23 at 20:19
  • In fact, I just found out min(), max(), and avg() are all returning the same value, which shouldn't be happening. – mberge Jun 04 '23 at 20:28
  • If you read the link from my first comment, you'll understand why. Based on your description you need to use `avg_over_time`. – markalex Jun 04 '23 at 20:33
  • I read through and used avg_over_time but cannot for the life of me get a valid response. When I run `max_over_time((100 - (avg by(instance) (rate(node_cpu_seconds_total{mode="idle"}[2m])) * 100))[7d:])` from the example you sent I don't even get a response when running in the Prometheus dashboard. I have modified my query to be like the example but keep getting an error status in the response. – mberge Jun 04 '23 at 23:41
  • It should work. See the [demo](https://prometheus.demo.do.prometheus.io/graph?g0.expr=max_over_time((100%20-%20(avg%20by(instance)%20(rate(node_cpu_seconds_total%7Bmode%3D%22idle%22%7D%5B2m%5D))%20*%20100))%5B7d%3A%5D)&g0.tab=1&g0.stacked=0&g0.range_input=1h&g1.expr=100%20-%20(avg%20by(instance)%20(rate(node_cpu_seconds_total%7Bmode%3D%22idle%22%7D%5B2m%5D))%20*%20100)&g1.tab=0&g1.stacked=0&g1.range_input=1h). Investigate step-by-step where your query breaks. – markalex Jun 05 '23 at 06:58
  • I'm using container_cpu_usage_seconds_total, which can be used to get the CPU usage of a specific pod. The demo uses the node_cpu_seconds_total metric, and they aren't interchangeable. – mberge Jun 05 '23 at 07:30

1 Answer


The avg() function returns the average value across multiple time series at the specified timestamp passed via the time query arg to /api/v1/query. It is a so-called aggregate function. That's why it returns "unexpected" results.
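
As a sketch of what that means for the question's setup (the namespace below is the question's placeholder value), avg() collapses every series matching the selector into a single value at the query's evaluation timestamp; it does not average values over time:

# one value per evaluation timestamp, averaged across all matching series
avg(rate(container_cpu_usage_seconds_total{namespace='arka-ingress'}[2m])) * 1000

Since the question's selector matches only a single series, avg(), min(), and max() all return that one series unchanged, which would explain why they produced the same value.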

The rate() function returns the average per-second increase rate for every selected counter over the specified lookbehind window in square brackets. For example, rate(container_cpu_usage_seconds_total[1h]) returns the average CPU usage of each container over the last hour.

PromQL provides the min_over_time and max_over_time functions, which can be used to calculate the minimum and maximum values for each selected time series over the given lookbehind window in square brackets. For example, the following query returns the minimum across the last 12 average 5-minute CPU usage calculations over the last hour:

min_over_time(rate(container_cpu_usage_seconds_total[5m])[1h:5m])

This query uses the subquery feature, which isn't trivial to understand and use properly.
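
The same pattern should give the single averaged number the question is after via avg_over_time, the over-time counterpart of avg mentioned in the comments. A minimal sketch using the question's placeholder label values and its 120s window with a 15s step (adjust the selectors and windows to your setup):

avg_over_time(rate(container_cpu_usage_seconds_total{namespace='arka-ingress', pod='pod', container=''}[2m])[2m:15s]) * 1000

Here the inner rate(...[2m]) is evaluated every 15 seconds over the outer 2-minute window, and avg_over_time averages those calculated samples, which corresponds to averaging the points returned by the question's range query.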

P.S. There is an alternative Prometheus-like monitoring system I work on, which provides the rollup_rate() function; it returns the min, avg, and max per-second increase rates over the given lookbehind window. See this article for details.

valyala
  • This is great, thank you. Yeah, I'm having a difficult time understanding how the two different time windows work from this query: `rate(container_cpu_usage_seconds_total[5m])[1h:5m]` – mberge Jun 07 '23 at 18:58
  • I'm just confused because when I run `rate(container_cpu_usage_seconds_total[${window}:${step}])` and `min_over_time(rate(container_cpu_usage_seconds_total[${window}])[${window}:${step}])` the result I get from the second query is incorrect, and not even one of the numbers output by the first query. Is the way I'm doing window and step correct? I'm just using the same for both queries. – mberge Jun 07 '23 at 21:42
  • The `rate(container_cpu_usage_seconds_total[d])` query takes into account all the raw samples stored in Prometheus for the `container_cpu_usage_seconds_total` metric on the given lookbehind window `(t-d ... t]`, where `t` is the `time` value passed to `/api/v1/query`. The `rate(container_cpu_usage_seconds_total[d:step])` query takes into account `1 + d/step` *calculated* samples on the lookbehind window `(t-d..t]` at timestamps `t-d, t-d+step, t-d+2*step, ...`. These points are calculated according to [these docs](https://prometheus.io/docs/prometheus/latest/querying/basics/#staleness); see the sketch after these comments. – valyala Jun 08 '23 at 00:04
  • Note also that Prometheus may return unexpected results from subqueries if the inner lookbehind window is smaller than `2*scrape_interval`, where `scrape_interval` is the interval between raw samples belonging to a single [time series](https://docs.victoriametrics.com/keyConcepts.html#time-series) because of [this issue](https://github.com/prometheus/prometheus/issues/3746) – valyala Jun 08 '23 at 00:08
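
To make the subquery timestamps concrete for the question's values (window_sec=120, 15s step; labels are the question's placeholders, and Prometheus aligns the evaluation points to the step), a rough sketch:

# the outer [2m:15s] window evaluates the inner rate() at roughly
# t-120s, t-105s, ..., t-15s, t  (about 9 calculated samples)
# min_over_time / avg_over_time / max_over_time then aggregate those samples
min_over_time(rate(container_cpu_usage_seconds_total{namespace='arka-ingress', pod='pod', container=''}[2m])[2m:15s])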