2

I'm new to Kafka. While studying it, I realized that I need to monitor consumer lag. Searching Google and the docs, I found a few approaches:

  1. Kafka - Prometheus - Grafana
  2. Kafka - Burrow - some DB - Grafana
  3. Kafka - burrow_stat? (I can't understand what it is.)
  4. Kafka - Datadog

What I want to ask: the documentation says that Burrow is for monitoring. Can I visualize its data as graphs (a dashboard) without other tools like Grafana, Kibana, or Datadog?

I'm just trying to keep the pipeline as short as possible. What is the simplest way to visualize consumer lag?

BigJ
  • 21
  • 2

3 Answers

0

If you are doing the setup in an organisation, Datadog or Prometheus is probably the way to go. You can capture other Kafka-related metrics as well. Their agents also integrate with many other tools besides Kafka, which makes them a good common choice for monitoring.

If you are just doing it for a personal proof-of-concept project and only want to view the lag, I find CMAK very useful (https://github.com/yahoo/CMAK). It does not keep historical data, but it provides a good visual of the current state of the Kafka cluster, including lag.

Rishabh Sharma
  • 747
  • 5
  • 9
0

Burrow is extremely effective and specialises in monitoring consumer lag. It is good at tracking consumer offsets and, more importantly, at evaluating whether the lag is actually a problem or not. It integrates with PagerDuty so that alerts are pushed to the necessary parties.

https://community.cloudera.com/t5/Community-Articles/Monitoring-Kafka-with-Burrow-Part-1/ta-p/245987

What burrow has:

  • Non-threshold-based lag monitoring algorithm capable of evaluating potential slowdowns.
  • Integration with PagerDuty
  • Exporters for Prometheus, AppD etc. for historical metrics
  • Pluggable UI

If you are looking for a quick solution, you can deploy Burrow followed by the BurrowUI front end: https://github.com/GeneralMills/BurrowUI
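
Before adding a UI, you can also check what Burrow reports by calling its HTTP API directly. The sketch below is only an illustration, not part of Burrow itself: it assumes Burrow's v3 lag endpoint (`/v3/kafka/<cluster>/consumer/<group>/lag`) and a response with a `status` object containing `group`, `status` and `totallag`. The base URL, cluster name and group name in the usage note are hypothetical, so verify the path and field names against your Burrow version.

```python
import json
from urllib.request import urlopen


def lag_url(base_url, cluster, group):
    """Build the lag endpoint URL (path layout assumed from Burrow's v3 API)."""
    return f"{base_url.rstrip('/')}/v3/kafka/{cluster}/consumer/{group}/lag"


def summarize_lag(payload):
    """Pick the group name, evaluated status and total lag out of a decoded response.

    The field names ("status", "group", "totallag") are assumptions about the
    response shape; missing fields come back as None.
    """
    status = payload.get("status", {})
    return {
        "group": status.get("group"),
        "status": status.get("status"),
        "total_lag": status.get("totallag"),
    }


def fetch_lag(base_url, cluster, group, timeout=5.0):
    """Call Burrow over HTTP and return the summarized lag for one consumer group."""
    with urlopen(lag_url(base_url, cluster, group), timeout=timeout) as resp:
        return summarize_lag(json.load(resp))
```

For example, `fetch_lag("http://localhost:8000", "local", "my-group")` would return a small dict you can print or feed into whatever dashboard you end up choosing.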

Shirine
  • 101
  • 4
-1

For cluster-wide metrics you can use kafka_exporter (https://github.com/danielqsj/kafka_exporter), which exposes some very useful cluster metrics (including consumer lag) and is easy to integrate with Prometheus and visualize using Grafana.
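
To get a feel for what the exporter publishes before wiring up Prometheus and Grafana, you can read its metrics text and total the lag per consumer group. This is a rough sketch, assuming the exporter emits a `kafka_consumergroup_lag` series with a `consumergroup` label in the Prometheus text format; check the metric and label names against the output of your own exporter.

```python
import re
from collections import defaultdict

# Matches only the per-partition lag series, e.g.
# kafka_consumergroup_lag{consumergroup="g1",partition="0",topic="t"} 5
# The metric name is an assumption about kafka_exporter's output.
LAG_LINE = re.compile(r'^kafka_consumergroup_lag\{(?P<labels>[^}]*)\}\s+(?P<value>\S+)')
LABEL = re.compile(r'(\w+)="([^"]*)"')


def lag_by_group(metrics_text):
    """Sum per-partition lag for each consumer group found in the metrics text."""
    totals = defaultdict(float)
    for line in metrics_text.splitlines():
        match = LAG_LINE.match(line.strip())
        if not match:
            continue  # comments (# HELP / # TYPE) and other metrics are skipped
        labels = dict(LABEL.findall(match.group("labels")))
        totals[labels.get("consumergroup", "unknown")] += float(match.group("value"))
    return dict(totals)
```

Feed it the body of the exporter's `/metrics` endpoint (fetched with any HTTP client) to see the current lag per group at a glance.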

glitch99
  • 264
  • 2
  • 7