When I use flink-metrics-prometheus_2.11-1.10.0.jar to report metrics to Prometheus, I get the following error. What causes it, and how can I fix it?

    2020-04-20 15:32:17.940 [Flink-MetricRegistry-thread-1] WARN  org.apache.flink.metrics.prometheus.PrometheusPushGatewayReporter  - Failed to push metrics to PushGateway with jobName oceanus, groupingKey {}.
    java.io.IOException: Response code from http://9.91.161.72:80/metrics/job/oceanus was 200
            at org.apache.flink.shaded.io.prometheus.client.exporter.PushGateway.doRequest(PushGateway.java:297)
            at org.apache.flink.shaded.io.prometheus.client.exporter.PushGateway.push(PushGateway.java:127)
            at org.apache.flink.metrics.prometheus.PrometheusPushGatewayReporter.report(PrometheusPushGatewayReporter.java:109)
            at org.apache.flink.runtime.metrics.MetricRegistryImpl$ReporterTask.run(MetricRegistryImpl.java:441)
            at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
            at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308)
            at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180)
            at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:294)
            at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
            at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
            at java.lang.Thread.run(Thread.java:748)
H.Die

2 Answers

It seems flink-metrics-prometheus-1.10.0.jar bundles a really old version of the Prometheus Java client. For comparison, Flink master uses 0.8.1 (https://github.com/apache/flink/blob/master/flink-metrics/flink-metrics-prometheus/pom.xml#L35), while release-1.9.1 uses 0.3.0 (https://github.com/apache/flink/blob/release-1.9.1/pom.xml#L125).

The annoying WARN on a "200 OK" response code was fixed in client version 0.8.0 (https://github.com/prometheus/client_java/blob/parent-0.8.0/simpleclient_pushgateway/src/main/java/io/prometheus/client/exporter/PushGateway.java#L316), and Flink has included that version since 1.11.0 (https://github.com/apache/flink/blob/release-1.11.0/pom.xml#L127).
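To make the difference concrete, here is a minimal sketch of the two response checks. It is not copied from either client version: the assumption is that the older client treats only 202 Accepted as success, while the fixed client accepts any 2xx code. The class and method names (`PushResponseCheck`, `strictAccepts`, `lenientAccepts`) are hypothetical; only the exception message format is taken from the log in the question.

```java
import java.io.IOException;
import java.net.HttpURLConnection;

public class PushResponseCheck {

    // Assumed behavior of the older client: only 202 Accepted is a success.
    static boolean strictAccepts(int responseCode) {
        return responseCode == HttpURLConnection.HTTP_ACCEPTED;
    }

    // Assumed behavior of the fixed client: any 2xx code is a success.
    static boolean lenientAccepts(int responseCode) {
        return responseCode / 100 == 2;
    }

    // Mirrors the message format seen in the stack trace from the question.
    static void check(String url, int responseCode, boolean strict) throws IOException {
        boolean ok = strict ? strictAccepts(responseCode) : lenientAccepts(responseCode);
        if (!ok) {
            throw new IOException("Response code from " + url + " was " + responseCode);
        }
    }

    public static void main(String[] args) {
        String url = "http://9.91.161.72:80/metrics/job/oceanus";
        try {
            check(url, 200, false); // lenient check: 200 passes silently
            check(url, 200, true);  // strict check: 200 is reported as a failure
        } catch (IOException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

Under these assumptions the push itself succeeds in both cases; the strict check merely reports a successful 200 as a failure, which would explain why the log shows only a WARN.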

So the solution is to put flink-metrics-prometheus-1.11.0.jar on your classpath in place of the 1.10.0 jar.

Do not worry about Flink version incompatibility here: I'm using Flink 1.9.1 with flink-metrics-prometheus-1.11.0.jar and everything works fine. I guess that part did not change much between Flink versions.

ivansjg

It seems there was a recent change in how responses are handled.

I suspect you need to match the Prometheus client jar to the server version, or vice versa.

Arvid Heise