I'm using Kafka exporter to monitor the Kafka metrics which is then queried in prometheus. I have a Kafka topic with 3 consumer groups, these 3 consumer groups are used by 3 different services. I am trying to write a query to have an alert when either of these consumer group lag increases beyond the average lag.
The query I have so far:
kafka_consumer_group_lag{group_id=~"consumer_group.*"} > avg_over_time(kafka_consumer_group_lag{group_id=~"consumer_group.*"}[5m])
But this doesn't seem to work. I'm not sure how to form the calculation to get this.
Can someone help me in understanding how to form this query? The entire group_id
will not be known, the starting of the group_id
will be consumer_group
hence I'm using the wild card.