Questions tagged [prometheus]

The Prometheus monitoring system, including the server, Alertmanager, Pushgateway, exporters, client libraries, and other components.

Prometheus is an open-source monitoring system written in Go and inspired by Google's approach to monitoring.

Prometheus itself is a time-series storage server that periodically pulls metrics from monitored entities. These metrics can then be queried and alerted on using simple query and alerting languages.
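As a rough illustration of the pull model described above, the following sketch shows what a monitored entity might expose for Prometheus to scrape. It uses only the Python standard library rather than an official client library. The metric name `demo_requests_total`, the `COUNTERS` dictionary, and the helper names are hypothetical and chosen only for this example.

```python
from http.server import BaseHTTPRequestHandler, HTTPServer

# Hypothetical in-memory counter values; a real application would update these.
COUNTERS = {"demo_requests_total": 0}


def render_metrics(counters):
    """Render counters in the Prometheus text exposition format."""
    lines = []
    for name in sorted(counters):
        lines.append(f"# TYPE {name} counter")
        lines.append(f"{name} {counters[name]}")
    return "".join(line + "\n" for line in lines)


class MetricsHandler(BaseHTTPRequestHandler):
    """Serve the current counter values on /metrics for a scraper to pull."""

    def do_GET(self):
        if self.path != "/metrics":
            self.send_error(404)
            return
        body = render_metrics(COUNTERS).encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Type", "text/plain; version=0.0.4")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)


def serve(port=8000):
    """Start the HTTP server (assumed port); blocks until interrupted."""
    HTTPServer(("", port), MetricsHandler).serve_forever()
```

Calling `serve()` and pointing a scrape target at that port would let a Prometheus server pull the values periodically. In practice, an official client library is usually the better choice.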

Prometheus comes with a suite of supporting tools like alertmanager, exporters and client libraries, designed to work together in providing a highly scalable and complete monitoring solution. Metric visualization is usually done with Grafana, which connects to Prometheus as a data source.

Prometheus's official site.

Read more about Prometheus's approach here:

  • This article explains Prometheus's metric-gathering approach for beginners.

  • This chapter from the online book "Site Reliability Engineering" describes Google's monitoring system Borgmon. Prometheus is mentioned in this chapter and was designed with Google's approach in mind.

6591 questions
31
votes
2 answers

How dangerous are high-cardinality labels in Prometheus?

I'm considering exporting some metrics to Prometheus, and I'm getting nervous about what I'm planning to do. My system consists of a workflow engine, and I'd like to track some metrics for each step in the workflow. This seems reasonable, with a…
Mark
31
votes
7 answers

Most recent value or last seen value

Prometheus is built around returning a time series representation of metrics. In many cases, however, I only care about what the state of a metric is right now, and I'm having a hard time figuring out a reliable way to get the "most recent" value of…
Cory Klein
30
votes
4 answers

Why are there both counters and gauges in Prometheus if gauges can act as counters?

When deciding between Counter and Gauge, Prometheus documentation states that To pick between counter and gauge, there is a simple rule of thumb: if the value can go down, it is a gauge. Counters can only go up (and reset, such as when a…
Jose Armesto
30
votes
3 answers

Prometheus (in Docker container) Cannot Scrape Target on Host

Prometheus running inside a docker container (version 18.09.2, build 6247962, docker-compose.xml below) and the scrape target is on localhost:8000 which is created by a Python 3 script. Error obtained for the failed scrape target…
Nyxynyx
29
votes
3 answers

Get total and free disk space using Prometheus

I try to get Total and Free disk space on my Kubernetes VM so I can display % of taken space on it. I tried various metrics that included "filesystem" in name but none of these displayed correct total disk size. Which one should be used to do…
Uliysess
29
votes
3 answers

How to display all metrics that don't have a specific label

I want to select all metrics that don't have label "container". Is there any possibility to do that with prometheus query?
cristi
27
votes
5 answers

How to rename label within a metric in Prometheus

I have a query: node_systemd_unit_state{instance="server-01",job="node-exporters",name="kubelet.service",state="active"} 1 I want the label name being renamed (or replaced) to unit_name ONLY within the node_systemd_unit_state metric. So, desired…
Konstantin Vustin
27
votes
5 answers

How to gracefully avoid divide by zero in Prometheus

There are times when you need to divide one metric by another metric. For example, I'd like to calculate a mean latency like…
Yoory N.
26
votes
2 answers

How to add https url on target prometheus

When I try to add my HTTPS target URL to Prometheus, an error like this appears: "https://myDomain.dev" is not a valid hostname. My domain is accessible and runs through an Nginx proxy pass on port 9100 (basically I made a domain for node-exporter) my…
Inadrawiba
26
votes
3 answers

Filter prometheus results by metric value, not by label value

Because Prometheus topk returns more results than expected, and because https://github.com/prometheus/prometheus/issues/586 requires client-side processing that has not yet been made available via https://github.com/grafana/grafana/issues/7664, I'm…
Steve Dwire
26
votes
3 answers

Generating range vectors from return values in Prometheus queries

I have a metric varnish_main_client_req of type counter and I want to set up an alert that triggers if the rate of requests drops or rises by a certain amount in a given time (e.g. "Amount of requests deviated in the last 2 min!"). Using the deriv()…
Paul Voss
26
votes
6 answers

How to scrape all metrics from a federate endpoint?

We have a hierarchical Prometheus setup with some servers scraping others. We'd like to have some servers scrape all metrics from others. Currently we try to use match[]="{__name__=~".*"}" as a metric selector, but this gives the error parse error at…
tex
25
votes
1 answer

Prometheus - exclude 0 values from query result

I'm displaying Prometheus query on a Grafana table. That's the query (Counter metric): sum(increase(check_fail{app="monitor"}[20m])) by (reason) The result is a table of failure reason and its count. The problem is that the table is also showing…
nirsky
24
votes
2 answers

prometheus doesn't match regex query

I'm trying to write a prometheus query in grafana that will select visits_total{route!~"/api/docs/*"} What I'm trying to say is that it should select all the instances where the route doesn't match /api/docs/* (regex) but this isn't working. It's…
ninesalt
24
votes
2 answers

Prometheus - Aggregate and relabel by regex

I currently have the following PromQL query, which allows me to query the memory used by each of my K8S pods: sum(container_memory_working_set_bytes{image!="",name=~"^k8s_.*"}) by (pod_name) The pod's name is followed by a hash defined by…
Mornor