Questions tagged [prometheus]

The Prometheus monitoring system, including the server, alertmanager, push gateway, exporters, client libraries and other components.

Prometheus is a Go-based open-source monitoring system inspired by Google's approach to monitoring.

Prometheus itself is a time-series storage server that periodically pulls metrics from monitored entities. These metrics can then be queried and alerted on using simple query and alert languages.
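As a rough illustration of the pull model described above, the sketch below renders a single counter in the Prometheus text exposition format, which is what a monitored entity would serve over HTTP for the server to scrape. The helper name `render_counter` and the sample metric are hypothetical, and label-value escaping is left out for brevity.

```python
def render_counter(name, help_text, samples):
    """Render one counter in the Prometheus text exposition format.

    `samples` maps a tuple of (label_name, label_value) pairs to a value.
    Hypothetical helper for illustration; it does not escape label values.
    """
    lines = [f"# HELP {name} {help_text}", f"# TYPE {name} counter"]
    for labels, value in samples.items():
        label_str = ",".join(f'{k}="{v}"' for k, v in labels)
        # Only emit the braces when the sample actually has labels.
        selector = f"{{{label_str}}}" if label_str else ""
        lines.append(f"{name}{selector} {value}")
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Assumed example data, not taken from any real target.
    print(render_counter(
        "foo_requests_total",
        "Total foo requests.",
        {(("method", "GET"),): 3},
    ), end="")
```

In practice the official client libraries produce this output, so a hand-rolled renderer like this is mainly useful for understanding what a scrape returns.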

Prometheus comes with a suite of supporting tools such as Alertmanager, exporters and client libraries, designed to work together to provide a highly scalable and complete monitoring solution. Metric visualization is usually done with Grafana, which connects seamlessly to Prometheus.

Prometheus's official site.

Read more about Prometheus's approach here:

  • This article explains Prometheus's metric-gathering approach for beginners.

  • This chapter from the online book "Site Reliability Engineering" describes Google's monitoring system Borgmon. Prometheus is mentioned in this chapter and was designed with Google's approach in mind.

6591 questions
37
votes
4 answers

Different Prometheus scrape URL for every target

Every instance of my application has a different URL. How can I configure prometheus.yml so that it takes path of a target along with the host name? scrape_configs: - job_name: 'example-random' # Override the global default and scrape…
poojabh
  • 415
  • 1
  • 4
  • 9
36
votes
2 answers

Prometheus endpoint of all available metrics

I was curious concerning the workings of Prometheus. Using the Prometheus interface I am able to see a drop-down list which I assume contains all available metrics. However, I am not able to access the metrics endpoint which lists all of the scraped…
Tony.H
  • 631
  • 1
  • 6
  • 14
36
votes
8 answers

Getting error "Get http://localhost:9443/metrics: dial tcp 127.0.0.1:9443: connect: connection refused"

I'm trying to configure Prometheus and Grafana with my Hyperledger Fabric v1.4 network to analyze the peer and chaincode metrics. I've mapped the peer container's port 9443 to my host machine's port 9443 after following this documentation. I've also…
Kartik Chauhan
  • 2,779
  • 5
  • 28
  • 39
35
votes
3 answers

What's the difference between Prometheus and Zabbix?

What are the differences between Prometheus and Zabbix?
The One
  • 2,261
  • 6
  • 22
  • 38
34
votes
3 answers

How to use the selected period of time in a query?

I'm using Grafana with Prometheus and I'd like to build a query that depends on the period of time selected in the upper right corner of the screen. Is there any variable (or something like that) to use in the query field? In other words,…
Facundo Chambo
  • 3,088
  • 6
  • 20
  • 25
34
votes
4 answers

Monitoring log files using some metrics exporter + Prometheus + Grafana

I need to monitor very different log files for errors, success status, etc., grab the corresponding metrics using Prometheus, show them in Grafana, and set up some alerting on them. Prometheus and Grafana are fine; I already use them a lot with different…
JosMac
  • 2,164
  • 1
  • 17
  • 23
34
votes
4 answers

How can I visualize a histogram with Promdash or Grafana?

I'm attracted to Prometheus by its histogram (and summary) time series, but I've been unable to display a histogram in either PromDash or Grafana. What I expect is to be able to show: a histogram at a point in time, e.g. the buckets on the…
TvE
  • 1,016
  • 1
  • 11
  • 19
33
votes
10 answers

Relabel instance to hostname in Prometheus

I have Prometheus scraping metrics from node exporters on several machines with a config like this: scrape_configs: - job_name: node_exporter static_configs: - targets: - 1.2.3.4:9100 - 2.3.4.5:9100 -…
Norrius
  • 7,558
  • 5
  • 40
  • 49
33
votes
3 answers

Prometheus: grouping metrics by metric names

Is there a way to group all metrics of an app by metric names? A portion from a query listing all metrics for an app (i.e. {app="bar"}) : ch_qos_logback_core_Appender_all_total{affiliation="foo",app="bar",…
naimdjon
  • 3,162
  • 1
  • 20
  • 41
33
votes
7 answers

Is there a way to monitor kube cron jobs using prometheus

Is there a way to monitor a kube cronjob? I have a kube cronjob which runs every 10 minutes on my cluster. Is there a way to collect metrics every time my cronjob fails due to some error or notify when my cronjob has not been completed after a certain…
user3587892
  • 463
  • 1
  • 5
  • 11
33
votes
4 answers

How can I alert for container restarted?

I'd like to monitor containers using Prometheus and cAdvisor so that when a container restarts, I get an alert. I wonder if anyone has a sample Prometheus alert for this.
qingsong
  • 715
  • 3
  • 7
  • 13
33
votes
3 answers

Prometheus vs ElasticSearch. Which is better for container and server monitoring?

ElasticSearch is a document store and more of a search engine. I think ElasticSearch is not a good choice for monitoring high-dimensional data, as it consumes a lot of resources. On the other hand, Prometheus is a TSDB which is designed for capturing high…
Aditya C S
  • 653
  • 1
  • 8
  • 17
32
votes
2 answers

Monitor custom kubernetes pod metrics using Prometheus

I am using Prometheus to monitor my Kubernetes cluster. I have set up Prometheus in a separate namespace. I have multiple namespaces, with multiple pods running. Each pod container exposes custom metrics at this endpoint, :80/data/metrics. I…
32
votes
2 answers

Why does increase() return a value of 1.33 in prometheus?

We graph a timeseries with sum(increase(foo_requests_total[1m])) to show the number of foo requests per minute. Requests come in quite sporadically - just a couple of requests per day. The value that is shown in the graph is always 1.3333. Why is…
James
  • 11,654
  • 6
  • 52
  • 81
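One plausible reading of the 1.3333 value in the question above is that increase() extrapolates the observed change to the edges of the query window. The sketch below is a simplified approximation of that extrapolation, not the Prometheus source: the function name is hypothetical, counter resets are ignored, and the 15-second scrape interval in the example is an assumption that the excerpt does not state.

```python
def extrapolated_increase(samples, range_start, range_end):
    """Approximate increase() over [range_start, range_end] in seconds.

    `samples` is a sorted list of (timestamp, counter_value) pairs that fall
    inside the window. Simplified sketch: counter resets are not handled.
    """
    if len(samples) < 2:
        return None  # not enough points to compute a change
    (t_first, v_first), (t_last, v_last) = samples[0], samples[-1]
    raw_increase = v_last - v_first
    sampled = t_last - t_first
    avg_gap = sampled / (len(samples) - 1)
    threshold = avg_gap * 1.1

    # Extend to each window edge if the gap is small, else by half a gap.
    to_start = t_first - range_start
    to_end = range_end - t_last
    extended = sampled
    extended += to_start if to_start < threshold else avg_gap / 2
    extended += to_end if to_end < threshold else avg_gap / 2
    return raw_increase * extended / sampled


if __name__ == "__main__":
    # Assumed data: 15 s scrapes, one request arriving inside a 1m window.
    window = [(2, 0), (17, 0), (32, 1), (47, 1)]
    print(extrapolated_increase(window, 0, 60))
```

Under these assumptions the four samples cover only 45 of the 60 seconds, so a raw increase of 1 is scaled by 60/45, which is about 1.33.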
31
votes
3 answers

increase() in Prometheus sometimes doubles values: how to avoid?

I've found that for some graphs I get doubled values from Prometheus where there should be just ones. Query I use: increase(signups_count[4m]) Scrape interval is set to the recommended maximum of 2 minutes. If I query the actual data stored: curl -gs…
sanmai
  • 29,083
  • 12
  • 64
  • 76