I am working on a Spring Boot application whose consumers are rebalanced across the partitions of many Kafka topics (e.g. 700 topics with 10 partitions each, i.e. 7000 partitions). I want to spin up multiple instances of the dockerized app. Every instance will be configured with all 700 topic names, but each one should pick up only the first 25 partitions and leave the others unsubscribed.

@KafkaListener( topics = "#{kafkaProperties.getTopics()}" )

kafkaProperties.getTopics() returns all 700 topic names

S. Arora
  • This is not clear; what does "the first 25 partitions" mean when each topic only has 10? Describe your requirement in much more detail. If you mean you want each instance to only get 25 partitions, e.g. `c1 - t1(10), t2(10), t3(5)`, `c2 - t3(5), t4(10), t5(10)`, etc, you will need a custom `ConsumerPartitionAssignor`. – Gary Russell Sep 08 '20 at 20:31
  • @GaryRussell Yes, that's exactly what I need, but right now there is one thread per topic partition. I referred to this [post](https://stackoverflow.com/questions/62091106/how-can-i-effectively-bind-my-kafkalistener-to-concurrentkafkalistenercontainer) to create one thread per topic-partition, but now I want to distribute the partitions across the multiple running instances, and if one instance goes down for any reason, the others should pick up the load until new instances come up again. – S. Arora Sep 09 '20 at 09:39
  • @GaryRussell Let me clarify further: there will be multiple instances of the application running (e.g. multiple jars or dockerized application instances), and each should have the same configuration, i.e. all topics configured. Each instance should pick a limited number of threads with one topic/partition each and leave the rest for other instances to pick up. The second instance should then pick some of the remaining topics, and so on for the other instances. – S. Arora Sep 09 '20 at 14:49

1 Answer

As I said in my initial comment asking for clarification, for such a partition distribution scheme, you will need to implement a custom ConsumerPartitionAssignor.

> each instance should pick a limited number of threads with one topic/partition each

Instances don't "pick" topics/partitions; one instance is selected and it decides which instance(s) get which topic/partition(s).

See its javadocs.

/**
 * This interface is used to define custom partition assignment for use in
 * {@link org.apache.kafka.clients.consumer.KafkaConsumer}. Members of the consumer group subscribe
 * to the topics they are interested in and forward their subscriptions to a Kafka broker serving
 * as the group coordinator. The coordinator selects one member to perform the group assignment and
 * propagates the subscriptions of all members to it. Then {@link #assign(Cluster, GroupSubscription)} is called
 * to perform the assignment and the results are forwarded back to each respective members
 *
 * In some cases, it is useful to forward additional metadata to the assignor in order to make
 * assignment decisions. For this, you can override {@link #subscriptionUserData(Set)} and provide custom
 * userData in the returned Subscription. For example, to have a rack-aware assignor, an implementation
 * can use this user data to forward the rackId belonging to each member.
 */
public interface ConsumerPartitionAssignor {
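The distribution described in the comments (`c1 - t1(10), t2(10), t3(5)`, `c2 - t3(5), t4(10), t5(10)`) can be sketched in plain Java. This is only an illustration of the logic that a custom assignor's `assign(Cluster, GroupSubscription)` would need to perform; it does not implement `ConsumerPartitionAssignor` itself. The class `CappedAssignmentSketch`, its `assign` helper, and the `"topic-N"` naming are hypothetical, and the sketch assumes every topic has the same number of partitions.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeSet;

public class CappedAssignmentSketch {

    // Hypothetical helper: hands out topic-partitions in contiguous chunks,
    // at most maxPerMember per consumer. Partitions are named "topic-N".
    static Map<String, List<String>> assign(List<String> memberIds,
                                            List<String> topics,
                                            int partitionsPerTopic,
                                            int maxPerMember) {
        // Flatten every topic into one ordered list of topic-partitions.
        List<String> all = new ArrayList<>();
        for (String topic : topics) {
            for (int p = 0; p < partitionsPerTopic; p++) {
                all.add(topic + "-" + p);
            }
        }
        // Iterate members in sorted order so the plan does not depend on
        // the order in which they joined the group.
        Map<String, List<String>> plan = new LinkedHashMap<>();
        int next = 0;
        for (String member : new TreeSet<>(memberIds)) {
            int end = Math.min(next + maxPerMember, all.size());
            plan.put(member, new ArrayList<>(all.subList(next, end)));
            next = end;
        }
        // Anything from index 'next' onward is left unassigned in this sketch.
        return plan;
    }

    public static void main(String[] args) {
        Map<String, List<String>> plan = assign(
                List.of("c2", "c1"),
                List.of("t1", "t2", "t3", "t4", "t5"),
                10, 25);
        plan.forEach((member, parts) ->
                System.out.println(member + " -> " + parts));
    }
}
```

Note that a hard cap leaves partitions unconsumed whenever there are fewer members than `total partitions / cap`; with 7000 partitions and a cap of 25 that means 280 consumers. If surviving instances should absorb the load when one goes down, the cap would have to be relaxed for the leftover partitions.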

To add more threads (consumers) to an instance, increase the container concurrency.

For a large number of instances, I would recommend a COOPERATIVE rebalance protocol.
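As a rough illustration of opting in to cooperative rebalancing, the consumer's `partition.assignment.strategy` property can name the built-in `CooperativeStickyAssignor` (available in kafka-clients 2.4 and later). The sketch below only builds the property set with `java.util.Properties`; the class and method names are hypothetical. The built-in assignor does not enforce a per-instance cap, so a custom assignor would instead have to declare the cooperative protocol itself via `supportedProtocols()`.

```java
import java.util.Properties;

public class CooperativeConsumerProps {

    // Hypothetical helper: builds consumer properties that select the
    // built-in cooperative assignor by its fully qualified class name.
    static Properties consumerProps(String groupId) {
        Properties props = new Properties();
        props.setProperty("group.id", groupId);
        props.setProperty("partition.assignment.strategy",
                "org.apache.kafka.clients.consumer.CooperativeStickyAssignor");
        return props;
    }

    public static void main(String[] args) {
        // Print the resulting key/value pairs.
        consumerProps("my-group").forEach((k, v) -> System.out.println(k + "=" + v));
    }
}
```

In a Spring Boot application the same key can typically be supplied under `spring.kafka.consumer.properties.`, for example `spring.kafka.consumer.properties.partition.assignment.strategy`.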

Gary Russell