
I am running a standalone Apache Spark cluster in a Kubernetes environment.

I need to export its metrics to Prometheus and then display them in Grafana.

I found that installing a Graphite exporter was the simplest way to do this, since I have trouble getting all Spark metrics when using only the JMX exporter.

The part I am having trouble with is creating mappings from the Graphite output to output that Prometheus templating can parse.

For instance, I want to be able to parse

app_20200120105608_0736_0_executor_threadpool_completeTasks

so that it matches something similar to this:

- match: '*.*.threadpool.*.*'
  name: app_data
  labels:
    application: $1 # app_20200120105608_0736
    executor_id: $2 # 0
    type: $3 # threadpool
    qty: $4 # completeTasks

I am not convinced that this is the best solution overall, so any other suggestions are welcome (for instance, how this could be done properly using only a JMX exporter while still getting Spark app data).

Toto
toerq

1 Answer

If I understand correctly, you are trying to build something like Spark -> Graphite -> Prometheus -> Grafana. Avoid doing that, since Graphite adds overhead to your monitoring system.

You have several options available:

  • Query Graphite directly from Grafana with the Graphite data source.
  • Set up the JMX Exporter properly. You can refer to the discussion to get an idea of how to do it with jmx-exporter. I can also help you with the errors you get if you share the problems you have with it.
  • Set up the Prometheus Pushgateway and the corresponding Spark Prometheus sink. Note that this solution is advised for short-running jobs. If you have long-running jobs, the JMX Exporter is preferable.
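If the Graphite exporter stays in the pipeline anyway, the mapping from the question could be sketched roughly as below. This is only a sketch under assumptions: it presumes that Spark's Graphite sink sends dot-separated names such as `app-20200120105608-0736.0.executor.threadpool.completeTasks` (the underscore form in the question is presumably what the exporter emits when no mapping matches), so check the raw names your setup actually produces. In a glob pattern only the `*` components are captured, so a literal component such as `executor` does not consume a `$n`.

```yaml
# Sketch of a graphite_exporter mapping file; the incoming name format is an assumption.
mappings:
- match: '*.*.executor.*.*'   # <app id>.<executor id>.executor.<type>.<qty>
  name: app_data
  labels:
    application: $1   # e.g. app-20200120105608-0736
    executor_id: $2   # e.g. 0
    type: $3          # e.g. threadpool
    qty: $4           # e.g. completeTasks
```

Driver metrics and sources with a different number of components would need their own `match` entries.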

Hope it helps.

  • I want to automatically get all data collected from all applications. With the JMX exporter, are you required to have a unique port for each "spark-submit"? That will not work for me. I also tried Graphite -> Grafana using both the Docker images graphiteapp/graphite-statsd and prom/graphite-exporter, but both of them give "HTTP Error Internal Server Error" when trying to connect from Grafana. The only somewhat working option so far has been Graphite -> Prometheus -> Grafana, unfortunately. Can you give any hints on whether I am doing anything wrong so far? Thanks! – toerq Jan 23 '20 at 08:49
  • I assume that you have a predefined range of ports available for the JMX exporter. So you can create a list of `Service`s, one per port, all with the same port name. Then create one PrometheusOperator `ServiceMonitor` targeting all of them by the same label. Prometheus will collect metrics from all the endpoints that are currently accessible. That way you'll monitor all the jobs. Or, again, you can still use the Pushgateway. Maybe you can check more closely what goes wrong when connecting Grafana to Graphite. One more option for you to try is native Spark on Kubernetes execution. – Aliaksandr Sasnouskikh Jan 23 '20 at 12:00
  • I am currently looking into the Pushgateway. What is the reasoning behind "Note that this solution is advised for short running jobs."? Why is it not well suited for long-running jobs? – toerq Jan 24 '20 at 07:45
  • It is well explained in the [Prometheus docs](https://prometheus.io/docs/practices/pushing/). Please refer to them. – Aliaksandr Sasnouskikh Jan 24 '20 at 08:33
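The `Service` per port plus a single `ServiceMonitor` setup suggested in the comments could look roughly like the sketch below. All names, labels and the port number here (`spark-jmx-8090`, `monitoring: spark-jmx`, `app: spark-driver`, `8090`) are hypothetical placeholders, not values from this thread. The `Service` would be repeated once for each port in the predefined range.

```yaml
# One Service per JMX exporter port; every Service uses the same port name and label.
apiVersion: v1
kind: Service
metadata:
  name: spark-jmx-8090          # hypothetical name, one Service per port
  labels:
    monitoring: spark-jmx       # hypothetical label shared by all these Services
spec:
  selector:
    app: spark-driver           # hypothetical pod label
  ports:
  - name: metrics               # same port name on every Service
    port: 8090
    targetPort: 8090
---
# A single ServiceMonitor selects all of the Services above by their shared label.
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  name: spark-jmx
spec:
  selector:
    matchLabels:
      monitoring: spark-jmx
  endpoints:
  - port: metrics               # refers to the port name, not the number
    interval: 30s
```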