Say I have two metrics in Prometheus, both counters:

  • requests_processed_total
  • requests_failed_total

They both have a matching service label. Example:

requests_processed_total{service="news"} 1097
requests_processed_total{service="store"} 487
requests_failed_total{service="news"} 23
requests_failed_total{service="store"} 89

How can I query requests_failed_total, but only for services whose requests_processed_total is greater than 1000?

I'm expecting the following response:

requests_failed_total{service="news"} 23

# Note that the "store" service is excluded
aspyct

3 Answers

3

If you are using Grafana you can do the following:

(1) Create a dashboard

(2) Click on Dashboard settings > Variables > New

(3) Create a variable with the following:

Name        = service
Type        = Query

Data source = Prometheus
Query       = query_result(requests_processed_total > 1000)
Regex       = /service="([^"]*)"/

(4) Use the "service" variable to show the "requests_failed_total" metric in any panel, for example with requests_failed_total{service=~"$service"} (you can also use Grafana's "Repeat for" feature).

0

You can use the HTTP API to do this.

The following command finds the services with requests_processed_total > 1000:

curl --silent --user USER:PASS --globoff --request GET "https://PROMETHEUS-SERVER/api/v1/query?query=requests_processed_total>1000" | jq --raw-output '.data.result[].metric.service'

And the following command will show requests_failed_total for a given service:

curl --silent --user USER:PASS --globoff --request GET "https://PROMETHEUS-SERVER/api/v1/query?query=requests_failed_total{service=\"SERVICE\"}" | jq --raw-output '.data.result[].value[1]'

Combining the two gives you what you want:

for s in $(curl --silent --user USER:PASS --globoff --request GET "https://PROMETHEUS-SERVER/api/v1/query?query=requests_processed_total>1000" | jq --raw-output '.data.result[].metric.service')
do
    curl --silent --user USER:PASS --globoff --request GET "https://PROMETHEUS-SERVER/api/v1/query?query=requests_failed_total{service=\"$s\"}" | jq --raw-output '.data.result[] | .metric.service + " " + .value[1]'
done
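
The loop above issues one request per matching service. A minimal sketch of collapsing it into a single request, assuming PromQL's `and` set operator with `on(service)` matching suits your labels: the expression keeps only the requests_failed_total series whose service also has a requests_processed_total sample above the threshold. The function names `build_query` and `failed_for_busy_services` are hypothetical, and USER, PASS and PROMETHEUS-SERVER remain placeholders as in the commands above.

```shell
# Hypothetical helper: builds the PromQL expression.
# The threshold defaults to 1000, the value used in the question.
build_query() {
    threshold="${1:-1000}"
    printf 'requests_failed_total and on(service) (requests_processed_total > %s)' "$threshold"
}

# Hypothetical helper: runs the expression in a single HTTP request.
# --get with --data-urlencode lets curl URL-encode the expression,
# so spaces, parentheses and ">" need no manual escaping.
failed_for_busy_services() {
    curl --silent --user USER:PASS --get \
        --data-urlencode "query=$(build_query "$1")" \
        "https://PROMETHEUS-SERVER/api/v1/query" |
        jq --raw-output '.data.result[] | .metric.service + " " + .value[1]'
}
```

Calling `failed_for_busy_services 1000` against the sample data in the question should list only the "news" service. Since the filtering happens inside one PromQL expression, the same expression can also be pasted directly into a Grafana panel query.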
  • I guess that could be a solution, but I can't use it for two reasons: 1. it's going to make a lot of requests (lots of different label values) and 2. I'm using Grafana to execute the request. – aspyct Jul 28 '20 at 15:41
  • You hadn't mentioned Grafana. I've now added another answer that uses Grafana. – Marcelo Ávila de Oliveira Jul 28 '20 at 17:53