
I have a consumer C1 in consumer group G1 that is reading from topic T1 (calling poll). While it is polling, another consumer C2 joins the same group G1 but subscribes to a different topic, T2.
What I have observed is that the partitions of topic T1 are revoked and then re-assigned to the same consumer C1. The re-assignment is expected, because there is no other consumer for this topic. My question is: why does a revoke happen in the first place when the other consumer has subscribed to a different topic?

These are the log lines from consumer C1. At the same moment, consumer C2 is joining the group and subscribing to topic T2:

Revoking previously assigned partitions [T1-1, T1-0]
20/05/28 03:19:04 INFO internals.AbstractCoordinator: [Consumer clientId=consumer-171, groupId=G1] (Re-)joining group
20/05/28 03:19:04 INFO internals.AbstractCoordinator: [Consumer clientId=consumer-171, groupId=G1] Successfully joined group with generation 1117184
20/05/28 03:19:04 INFO internals.ConsumerCoordinator: [Consumer clientId=consumer-171, groupId=G1] Setting newly assigned partitions [T1-1, T1-0]
 
Nishu Tayal
ravi katiyar

1 Answer


When a new consumer (C2) joins a consumer group, the existing consumer (C1) cannot determine whether a revoke is required, because C1 does not know which topics C2 is subscribed to.

Hence, every existing consumer in the group revokes all of its partitions and rejoins the group. One of the brokers acts as the group coordinator: it manages group membership, a valid assignment is worked out behind the scenes (in the classic consumer protocol, by one consumer chosen as the group leader), and the coordinator then communicates that assignment to the consumers in the group. You can read more about it here: https://medium.com/streamthoughts/apache-kafka-rebalance-protocol-or-the-magic-behind-your-streams-applications-e94baf68e4f2
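
The revoke-then-reassign sequence described above can be illustrated with a small, self-contained simulation. This is only a sketch and not Kafka's actual code: the class `EagerRebalanceSketch`, the methods `assign` and `rebalance`, and the round-robin strategy are hypothetical stand-ins, and the partition count for T2 is an assumption (the question only shows T1 having two partitions).

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

public class EagerRebalanceSketch {

    /**
     * Hypothetical stand-in for the assignment step: for each topic, spread its
     * partitions round-robin over the members that subscribe to that topic.
     */
    static Map<String, List<String>> assign(Map<String, List<String>> subscriptions,
                                            Map<String, Integer> partitionCounts) {
        Map<String, List<String>> assignment = new TreeMap<>();
        for (String member : subscriptions.keySet()) {
            assignment.put(member, new ArrayList<>());
        }
        for (Map.Entry<String, Integer> topic : partitionCounts.entrySet()) {
            // Only members subscribed to this topic are candidates for its partitions.
            List<String> subscribers = new ArrayList<>();
            for (Map.Entry<String, List<String>> sub : subscriptions.entrySet()) {
                if (sub.getValue().contains(topic.getKey())) {
                    subscribers.add(sub.getKey());
                }
            }
            if (subscribers.isEmpty()) {
                continue;
            }
            for (int p = 0; p < topic.getValue(); p++) {
                String owner = subscribers.get(p % subscribers.size());
                assignment.get(owner).add(topic.getKey() + "-" + p);
            }
        }
        return assignment;
    }

    /**
     * One eager rebalance round: every current member gives up everything it owns
     * first, and only then is a fresh assignment computed from all subscriptions.
     */
    static Map<String, List<String>> rebalance(Map<String, List<String>> current,
                                               Map<String, List<String>> subscriptions,
                                               Map<String, Integer> partitionCounts,
                                               List<String> log) {
        for (Map.Entry<String, List<String>> owned : current.entrySet()) {
            log.add("revoke " + owned.getKey() + " " + owned.getValue());
        }
        Map<String, List<String>> next = assign(subscriptions, partitionCounts);
        for (Map.Entry<String, List<String>> owned : next.entrySet()) {
            log.add("assign " + owned.getKey() + " " + owned.getValue());
        }
        return next;
    }

    public static void main(String[] args) {
        Map<String, Integer> partitionCounts = new TreeMap<>();
        partitionCounts.put("T1", 2);
        partitionCounts.put("T2", 1); // assumed partition count for T2

        // Initially only C1 is in the group, subscribed to T1.
        Map<String, List<String>> subscriptions = new TreeMap<>();
        subscriptions.put("C1", Arrays.asList("T1"));
        Map<String, List<String>> current = assign(subscriptions, partitionCounts);

        // C2 joins the same group with a different topic, which triggers a rebalance.
        subscriptions.put("C2", Arrays.asList("T2"));
        List<String> log = new ArrayList<>();
        rebalance(current, subscriptions, partitionCounts, log);
        for (String line : log) {
            System.out.println(line);
        }
    }
}
```

In this model C1 is the only subscriber of T1, so it ends up with the same two partitions it gave up. That matches the pattern in the question's log: a revoke followed by an identical assignment.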

Rishabh Sharma
  • Thanks. Is there a way to avoid this revocation? I am doing manual commits for my use case, and the revocation is causing trouble. – ravi katiyar Jul 30 '20 at 04:55
  • I don't think it is possible to avoid it, as the handling is done by the coordinator. If you are sure that you won't have any overlap on topics, can you explain why you are using the same consumer group ID? – Rishabh Sharma Jul 30 '20 at 05:03
  • I am keeping the same consumer group ID because, from a business-logic point of view, all my consumers serve the same purpose. We are sure that we will never have overlap on topics. However, we have around 1000 topics, and I am not sure whether I should create 1000 consumer group IDs. – ravi katiyar Jul 30 '20 at 08:34
  • To me, having separate consumer group IDs makes complete sense. A consumer group contains dependent consumers, where the addition or removal of a consumer affects all the other consumers in the group (by design). What you are creating are 1000 consumers for 1000 topics that are **independent** of each other. It does not make sense to put them in the same consumer group. – Rishabh Sharma Jul 30 '20 at 09:56
  • @RishabhSharma - well, your comment is not fully accurate. It depends on the use case: if load balancing is needed, rather than having 1000 consumers each doing the same handling, then the same consumer group is the way to go (the most common use case). – Shlomi Cohen Feb 22 '22 at 22:38