
Recently I was given a situation:

  1. Events are published to one topic.
  2. There is one consumer in one consumer group.
  3. To keep consumption in pace with production, 10 instances of that consumer run on 10 different machines.

Rephrasing with the given data: say we have 1 producer publishing events at a rate of 10 thousand/second to a topic that has 1 partition. We have 1 consumer group with 1 consumer in it, but we run 10 instances of that same consumer on 10 machines in order to keep up with the load (one consumer can consume only 1 thousand/second) and to increase throughput on the consumer side.

I was told that we can't add more consumers to the consumer group [up to here it sounds sensible: since we have only one partition, there is no point in adding consumers to the group], so we run the 1 consumer as multiple instances.

Partition: P0, Consumer Group: G1, Consumer in Consumer Group: C1 G1, Instance Machine: I1, Consumer on instance: <C1 G1 I1>

Producer --> P0 --> G1[ {C1 G1 I1}, {C1 G1 I2}, ..., {C1 G1 I10} ]

Question 1: How do we ensure that each instance does not get the same records?

Question 2: How do we guarantee the ordering?

OneCricketeer
Sharad

1 Answer


In Kafka's topic architecture, message ordering is guaranteed at the partition level, not across the entire topic.

So if you have a multi-partition topic and a multi-threaded consumer group, ordering is guaranteed only per consumer thread, not across the whole group.
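To make the partition-level guarantee concrete, here is a small dependency-free simulation (not real Kafka client code). It assumes keyed records and a hypothetical `partition_for` helper that uses CRC32 as a stand-in for the murmur2 hash used by Kafka's default partitioner. Records with the same key land in the same partition and keep their relative order there, while nothing is promised about order between partitions.

```python
import zlib
from collections import defaultdict


def partition_for(key: str, num_partitions: int) -> int:
    # Hypothetical stand-in for a key-hash partitioner.
    # CRC32 is used only to keep this sketch free of dependencies.
    return zlib.crc32(key.encode("utf-8")) % num_partitions


def produce(events, num_partitions):
    # Append each record to the log of the partition chosen by its key,
    # preserving arrival order inside every partition.
    partitions = defaultdict(list)
    for key, value in events:
        partitions[partition_for(key, num_partitions)].append((key, value))
    return partitions


events = [("user-a", 1), ("user-b", 1), ("user-a", 2), ("user-b", 2), ("user-a", 3)]
log = produce(events, num_partitions=3)

for partition, records in sorted(log.items()):
    print(f"P{partition}: {records}")
```

Within each printed partition the values for a given key appear in the order they were produced; comparing positions across two different partitions tells you nothing about global order.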

Each thread is assigned one or more partitions (depending on the ratio of partitions to consumer threads), so each thread is aware only of the messages within its own partitions, nothing more.
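As a rough illustration of that assignment rule, the sketch below spreads partitions over the members of one group so that each partition has exactly one owner. It is a simplified model under my own assumptions, not Kafka's actual assignor: `assign_partitions` and the instance names (taken from the question's notation) are hypothetical.

```python
def assign_partitions(partitions, consumers):
    """Spread partitions round-robin over group members.

    Each partition gets exactly one owner within the group; members
    left without a partition stay idle.
    """
    assignment = {consumer: [] for consumer in consumers}
    for index, partition in enumerate(partitions):
        owner = consumers[index % len(consumers)]
        assignment[owner].append(partition)
    return assignment


# The setup from the question: 10 instances in group G1.
consumers = [f"C1-G1-I{n}" for n in range(1, 11)]

# One partition: only a single instance receives it.
one_partition = assign_partitions(["P0"], consumers)
idle = [c for c, parts in one_partition.items() if not parts]
print(one_partition)
print(len(idle), "instances idle")

# Ten partitions: every instance gets one and can work in parallel.
ten_partitions = assign_partitions([f"P{n}" for n in range(10)], consumers)
print(ten_partitions)
```

With a single partition, nine of the ten instances end up with an empty assignment, so no two instances can receive the same record, and ordering is simply the order of P0. Parallelism only appears once the topic has more partitions.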

I recommend going through the resources below for in-depth details about consumer groups and ordering guarantees.

Karim Tawfik
  • Rephrasing with the given data: say we have 1 producer publishing events at a rate of 10 thousand/second to a topic that has 1 partition. We have 1 consumer group with 1 consumer in it, but we run 10 instances of that same consumer on 10 machines in order to keep up with the load (one consumer can consume only 1 thousand/second) and to increase throughput on the consumer side. Question 1: How do we ensure that each instance does not get the same records? Question 2: How do we guarantee the ordering? – Sharad Feb 28 '23 at 19:03
  • @Sharad since the topic has 1 partition, there is no benefit in hooking a consumer group with more than 1 thread to it; the other threads will sit totally idle. If you mean 10 consumers, each in a separate consumer group, then every consumer will independently read the entire data from the beginning, with no parallelization. – Karim Tawfik Feb 28 '23 at 20:10
  • That's why it is important to understand the consumer group mode of operation along with partitioning. – Karim Tawfik Feb 28 '23 at 20:14
  • I was told that we can't add more consumers to the consumer group [up to here it sounds sensible: since we have only one partition, there is no point in adding consumers to the group], so we run the 1 consumer as multiple instances. Partition: P0, Consumer Group: G1, Consumer in Consumer Group: C1 G1, Instance Machine: I1, Consumer on instance: <C1 G1 I1>. Producer --> P0 --> G1[ {C1 G1 I1}, {C1 G1 I2}, ..., {C1 G1 I10} ] – Sharad Feb 28 '23 at 20:24
  • If I understand your setup correctly, only 1 thread (machine) from the entire group (G1) will consume from the topic (e.g. C1 G1 I1); all other threads will be idle. There is no room for the other 9 machines to parallelize: once one thread is attached to a partition, no other thread in the same group can compete with it. Hope this clarifies. – Karim Tawfik Feb 28 '23 at 20:44
  • Yes, thanks, this makes complete sense to me. I gave the same answer, but the person who asked me wasn't satisfied with it. Anyway, thanks again. – Sharad Mar 01 '23 at 03:27
  • @Sharad, good to know. If my reply answers your question, please mark it as accepted so that others can benefit from the Q/A. – Karim Tawfik Mar 01 '23 at 07:37