Questions tagged [prometheus]

The Prometheus monitoring system, including the server, Alertmanager, Pushgateway, exporters, client libraries, and other components.

Prometheus is an open-source monitoring system written in Go and inspired by Google's approach to monitoring.

Prometheus itself is a time-series storage server that periodically pulls (scrapes) metrics from monitored targets. These metrics can then be queried with its query language, PromQL, and alerted on with alerting rules.
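As a rough illustration of the pull model: a target exposes its current metric values as plain text, and the server fetches that text on every scrape. The sketch below renders one counter family in the text exposition format. The helper name `render_counter` and the metric and label names are hypothetical, and escaping of label values is deliberately left out.

```python
def render_counter(name, help_text, samples):
    """Render one counter family as Prometheus-style exposition text.

    samples is a list of (labels_dict, value) pairs.
    Assumption: label values contain no quotes, backslashes or newlines,
    so no escaping is performed here.
    """
    lines = [f"# HELP {name} {help_text}", f"# TYPE {name} counter"]
    for labels, value in samples:
        # Sort labels so the output is stable between scrapes.
        label_str = ",".join(f'{k}="{v}"' for k, v in sorted(labels.items()))
        selector = "{" + label_str + "}" if label_str else ""
        lines.append(f"{name}{selector} {value}")
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    text = render_counter(
        "http_requests_total",          # hypothetical metric name
        "Total HTTP requests.",
        [({"method": "get"}, 3), ({"method": "post"}, 1)],
    )
    print(text, end="")
```

A real target would serve this text over HTTP (conventionally at /metrics); in practice one of the official client libraries would normally generate it.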

Prometheus comes with a suite of supporting tools, such as Alertmanager, exporters, and client libraries, designed to work together to provide a highly scalable and complete monitoring solution. Metric visualization is usually done with Grafana, which supports Prometheus as a data source.

Prometheus's official site.

Read more about Prometheus's approach here:

  • This article explains Prometheus's metric-gathering approach for beginners.

  • This chapter from the online book "Site Reliability Engineering" describes Google's monitoring system Borgmon. Prometheus is mentioned in this chapter and was designed with Google's approach in mind.

6591 questions
20 votes, 1 answer

Forbidden resource in API group at the cluster scope

I am unable to identify what the exact issue is with the permissions in my setup, as shown below. I've looked into all the similar Q&As but am still unable to solve the issue. The aim is to deploy Prometheus and let it scrape /metrics endpoints that my…
BentCoder
20 votes, 2 answers

Configure basic_auth for Prometheus Target

One of the targets in static_configs in my prometheus.yml config file is secured with basic authentication. As a result, a "Connection refused" error is always displayed for that target on the Prometheus Targets page. I have…
Kacey Ezerioha
20 votes, 4 answers

Understanding histogram_quantile based on rate in Prometheus

According to the Prometheus documentation, in order to get the 95th percentile from a histogram metric I can use the following query: histogram_quantile(0.95, sum(rate(http_request_duration_seconds_bucket[5m])) by (le)) Source:…
evgeniy44
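
For readers puzzling over the query in the excerpt above, here is a simplified sketch of how a quantile can be estimated from cumulative `le` buckets by linear interpolation inside the bucket that contains the target rank. It illustrates the general idea only and is not the server's implementation. The function name mirrors the PromQL function for readability, and the bucket values are made up. In the real query the inputs are per-bucket rates rather than raw counts, but the interpolation works the same way.

```python
import math


def histogram_quantile(q, buckets):
    """Estimate the q-quantile from cumulative histogram buckets.

    buckets: list of (upper_bound, cumulative_count) pairs that must
    include an upper bound of math.inf (the "+Inf" bucket).
    Assumption: 0 < q <= 1; edge cases of the real function are omitted.
    """
    if not 0 < q <= 1:
        raise ValueError("q must be in (0, 1]")
    buckets = sorted(buckets)
    if not buckets or buckets[-1][0] != math.inf:
        raise ValueError("a +Inf bucket is required")
    total = buckets[-1][1]
    if total == 0:
        return math.nan
    rank = q * total
    # Find the first bucket whose cumulative count reaches the rank.
    index = next(i for i, (_, count) in enumerate(buckets) if count >= rank)
    upper, count = buckets[index]
    if upper == math.inf:
        # The quantile falls in the open-ended bucket: fall back to the
        # highest finite bound, if there is one.
        return buckets[-2][0] if len(buckets) > 1 else math.nan
    # The lowest bucket is assumed to start at 0.
    lower, prev_count = buckets[index - 1] if index > 0 else (0.0, 0.0)
    # Assume observations are spread evenly within the bucket.
    return lower + (upper - lower) * (rank - prev_count) / (count - prev_count)


if __name__ == "__main__":
    example = [(0.1, 50), (0.5, 90), (1.0, 100), (math.inf, 100)]
    print(histogram_quantile(0.95, example))
```

Because of the even-spread assumption, the estimate is only as precise as the bucket boundaries allow.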
20 votes, 3 answers

Prometheus export / import data for backup

How do you export and import data in Prometheus? How do you make sure the data is backed up if the instance goes down? It does not seem that there is such a feature yet, so how do you do it?
Arkon
20 votes, 2 answers

Graphing slow counters with prometheus and grafana

We graph fast counters with sum(rate(my_counter_total[1m])) or with sum(irate(my_counter_total[20s])), where the second one is preferable if you can always expect changes within the last couple of seconds. But how do you graph slow counters where…
James
20 votes, 3 answers

Prometheus how to handle counters on server

I have articles, and for each article I want to have a read count: # TYPE news_read_counter2 Counter news_read_counter2{id="2000"} 168 Now the counters on the servers are saved in Redis/Memcached, so they can get reset from time to time, so after a while…
Amir Bar
19 votes, 7 answers

Prometheus instant vector vs range vector

There's something I still don't understand about instant vectors and range vectors. Instant vector - a set of time series containing a single sample for each time series, all sharing the same timestamp. Range vector - a set of time series…
small
19 votes, 3 answers

Prometheus rate functions and interval selections

I am doing some monitoring with Prometheus and am trying to understand how to properly use the rate functions. The premise is this: I have a counter, and the configuration is set to ingest new values every 15s. Now I am trying to graph the per second…
Pelleplutt
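
To make the per-second rate idea in the excerpt above concrete, here is a minimal sketch that computes an average per-second increase from the counter samples that fall inside a range window, treating any drop in value as a counter reset. The function name `simple_rate` and the sample values are hypothetical. Note that the real rate() function additionally extrapolates towards the window boundaries, which this sketch does not attempt.

```python
def simple_rate(samples):
    """Average per-second increase of a counter over the given samples.

    samples: list of (timestamp_seconds, counter_value), sorted by time.
    Returns None when fewer than two samples are available, which is why
    the range window needs to span at least two scrapes.
    """
    if len(samples) < 2:
        return None
    elapsed = samples[-1][0] - samples[0][0]
    if elapsed <= 0:
        return None
    increase = 0.0
    previous = samples[0][1]
    for _, value in samples[1:]:
        if value < previous:
            # A decrease means the counter was reset; assume it restarted
            # from zero, so the new value is the increase since the reset.
            increase += value
        else:
            increase += value - previous
        previous = value
    return increase / elapsed


if __name__ == "__main__":
    # One sample every 15 s, as in the scrape interval from the question.
    window = [(0, 100), (15, 115), (30, 130), (45, 145), (60, 160)]
    print(simple_rate(window))
```

With a 15s scrape interval, a range shorter than about 30s can contain a single sample, in which case no rate can be computed at all.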
18 votes, 4 answers

prometheus operator - enable monitoring for everything in all namespaces

I want to monitor a couple of applications running on a Kubernetes cluster in namespaces named development and production through prometheus-operator. The installation command used (as per GitHub) is: helm install prometheus-operator…
18 votes, 2 answers

How do I get a pod's (milli)core CPU usage with Prometheus in Kubernetes?

I run a v1.9.2 custom setup of Kubernetes and scrape various metrics with Prometheus v2.1.0. Among others, I scrape the kubelet and cAdvisor metrics. I want to answer the question: "How much of the CPU resources defined by requests and limits in my…
Alex
18 votes, 2 answers

How do you add scrape targets to a Prometheus server that was installed with Kubernetes-Helm?

Background: I have installed Prometheus on my Kubernetes cluster (hosted on Google Container Engine) using the Helm chart for Prometheus. The problem: I cannot figure out how to add scrape targets to the Prometheus server. The prometheus.io site…
18 votes, 3 answers

How can I make a Grafana template with a variable reference another variable using Prometheus as a datasource?

I have a Grafana dashboard with template variables for services and instances. When I select a service, how can I make it filter the second template variable's list based on the first?
checketts
17 votes, 4 answers

Prometheus config doesn't work with Spring boot 2.3.0: ClassNotFoundException: io.micrometer.prometheus.HistogramFlavor

The application was working correctly with version 2.2.6, but after it was upgraded to the latest version of Spring Boot, 2.3.0, it stopped working and fails during startup. 2020-05-20T08:43:04.408+01:00 [APP/PROC/WEB/0] [OUT] 2020-05-20 07:43:04.407…
Krushnat Khavale
17 votes, 5 answers

Live reload Prometheus configuration in docker(-compose)

I have a new server running Prometheus in docker-compose. I want to be able to reload the configuration file (prometheus.yml) without having to stop and start the container. Of course, since I persist the storage of Prometheus in a volume, the stop…
goldfishalpha
17 votes, 2 answers

Prometheus how "up" metrics works

I'm looking for information on how the "up" metric is calculated by Prometheus. up{job="", instance=""}: 1 if the instance is healthy, i.e. reachable, or 0 if the scrape failed. How does Prometheus calculate when the instance is…
uszychaha