Background:
I have a Spring Boot Kafka consumer that I am trying to monitor with Prometheus and Grafana, using Spring's built-in MeterRegistry. The metric I use to count the total number of events consumed is kafka_consumer_fetch_manager_records_consumed_total. The idea is to use a query like sum(increase(kafka_consumer_fetch_manager_records_consumed_total[$__range])) by (topic). The metrics are stored in VictoriaMetrics and queried by Grafana.
Question:
While doing this, I noticed something strange. After I restarted the consumer, the value of the metric was reset and then went up to 192. If I apply the increase() function in this case, I expect the final output to also be 192 (since the total number of events consumed since the restart is 192). However, with increase() I get 152. I don't understand why. Can someone please help?
Here are the screenshots from Grafana:
The raw values of kafka_consumer_fetch_manager_records_consumed_total:
The values of kafka_consumer_fetch_manager_records_consumed_total with increase():
PS: While writing this question, I noticed that when the consumer restarted, the metric didn't reset to 0 but started from 40. Could this be the issue? If so, how can I solve it?
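The numbers are at least consistent with that observation: 192 - 40 = 152. Below is a minimal sketch of the arithmetic, assuming increase() can be approximated by summing the deltas between consecutive samples inside the window and treating a drop as a counter reset. This is a simplification for illustration only; it ignores extrapolation and any engine-specific handling in Prometheus or VictoriaMetrics. The class name and the intermediate sample values (80, 120) are hypothetical; only 40 and 192 come from the observation above.

```java
// Simplified model of a counter "increase" over a window (an assumption for
// illustration, not how Prometheus or VictoriaMetrics is actually implemented).
class IncreaseSketch {
    static double increase(double[] samples) {
        double total = 0;
        for (int i = 1; i < samples.length; i++) {
            double prev = samples[i - 1];
            double cur = samples[i];
            // A drop is treated as a reset: the counter restarted from 0,
            // so the whole current value counts as new.
            total += (cur >= prev) ? cur - prev : cur;
        }
        return total;
    }

    public static void main(String[] args) {
        // Hypothetical samples after the restart: first visible value is 40, last is 192.
        double[] afterRestart = {40, 80, 120, 192};
        System.out.println(increase(afterRestart)); // prints 152.0
    }
}
```

Under this model, the 40 events counted before the first sample was scraped never show up as a delta, which would explain a result of 152 instead of 192.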
FYI, this is how I am registering the Kafka metrics in Spring Boot:
// meterRegistry is the @Autowired MeterRegistry bean
DefaultKafkaConsumerFactory<String, Map<String, String>> defaultKafkaConsumerFactory =
        new DefaultKafkaConsumerFactory<>(config, new StringDeserializer(),
                new ErrorHandlingDeserializer<>(new JsonDeserializer<>(Map.class)));
// Publish the Kafka client metrics through Micrometer
defaultKafkaConsumerFactory.addListener(new MicrometerConsumerListener<>(meterRegistry));
I also tried the query increase(sum(kafka_consumer_fetch_manager_records_consumed_total) by (namespace) [$__range]) and, surprisingly, it produced the correct result. But according to https://www.robustperception.io/rate-then-sum-never-sum-then-rate/ it is not an ideal query to use.
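As I understand the linked article, the concern is that summing first hides per-series counter resets. Below is a small sketch of that effect, assuming increase can be modelled as the sum of deltas between consecutive samples with a drop treated as a reset. This is a simplification, not the actual engine behaviour. The class name, method names and all sample values are hypothetical.

```java
// Compares "increase each series, then sum" with "sum the series, then increase"
// under a simplified increase model (illustrative assumption only).
class SumOrderSketch {
    static double increase(double[] samples) {
        double total = 0;
        for (int i = 1; i < samples.length; i++) {
            double prev = samples[i - 1];
            double cur = samples[i];
            // A drop is treated as a counter reset to 0.
            total += (cur >= prev) ? cur - prev : cur;
        }
        return total;
    }

    static double increaseThenSum(double[][] series) {
        double total = 0;
        for (double[] s : series) {
            total += increase(s);
        }
        return total;
    }

    static double sumThenIncrease(double[][] series) {
        // Assumes all series have the same number of samples.
        double[] summed = new double[series[0].length];
        for (double[] s : series) {
            for (int i = 0; i < s.length; i++) {
                summed[i] += s[i];
            }
        }
        return increase(summed);
    }

    public static void main(String[] args) {
        double[][] series = {
            {10, 20, 30, 40},   // hypothetical consumer A, no restart
            {100, 110, 5, 15},  // hypothetical consumer B, restarts after the second sample
        };
        System.out.println(increaseThenSum(series)); // prints 55.0
        System.out.println(sumThenIncrease(series)); // prints 75.0
    }
}
```

In this made-up data the per-series result (55) matches the events actually added, while the summed series drops from 130 to 35 and the reset handling then counts the other consumer's 30 as new, giving 75.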