0

We have a Kafka producer that writes messages to a topic consumed by a single consumer group. The consumers process messages at quite different speeds, and we always keep a certain amount of consumer lag as a buffer.

Is it possible for the producer to know the consumer group's lag on each partition and produce messages to the partition with the lowest lag first? If so, can this be done with the C/C++ or Java clients?

Thanks.

zwush

2 Answers

1

It's not possible.

You must ensure that the policy you use distributes the work to all partitions equally. You can provide a custom partitioner that suits your needs.

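For an example, see https://dzone.com/articles/custom-partitioner-in-kafka-lets-take-quick-tour.

The default partitioner hashes the record key when one is set. For records without a key it spreads messages across partitions round robin (clients from Kafka 2.4 on use sticky batching instead).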
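As a rough sketch of the selection logic an evenly distributing custom partitioner could use (plain Java with no Kafka dependency; the class name RoundRobinChooser is made up for illustration). In the Java client you would call something like this from the partition(...) method of a class implementing org.apache.kafka.clients.producer.Partitioner, take the partition count from cluster.partitionsForTopic(topic).size(), and register the class through the partitioner.class producer setting:

```java
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical helper: cycles through partition numbers so each gets an equal share. */
final class RoundRobinChooser {
    // Shared counter; AtomicInteger keeps this safe when several threads send concurrently.
    private final AtomicInteger counter = new AtomicInteger(0);

    /** Returns the next partition in the range [0, numPartitions). */
    int next(int numPartitions) {
        if (numPartitions <= 0) {
            throw new IllegalArgumentException("numPartitions must be positive");
        }
        // floorMod keeps the result non-negative even after the counter overflows.
        return Math.floorMod(counter.getAndIncrement(), numPartitions);
    }
}
```

With three partitions, successive calls to next(3) cycle through 0, 1, 2 and then start over.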

1

It is possible, given that you can look up any consumer group's lag from within the producer. However, those values will fluctuate while the consumers are running, and you would be coupling the producer to knowing which consumers are running, which is an anti-pattern in Kafka.
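To make the lookup concrete, here is a minimal sketch in plain Java of the arithmetic involved. It assumes you have already fetched two maps keyed by partition number: the log end offsets (in the Java client, for example via KafkaConsumer.endOffsets) and the group's committed offsets (for example via the admin client's listConsumerGroupOffsets). The class and method names below are hypothetical, and the result is only a snapshot that may be stale by the time the record is sent:

```java
import java.util.HashMap;
import java.util.Map;
import java.util.OptionalInt;

/** Hypothetical helper: finds the partition on which the consumer group is least behind. */
final class LowestLagChooser {

    /**
     * Lag per partition = log end offset minus the group's committed offset.
     * Assumption: a partition with no committed offset counts as committed at 0.
     */
    static Map<Integer, Long> computeLag(Map<Integer, Long> endOffsets,
                                         Map<Integer, Long> committedOffsets) {
        Map<Integer, Long> lag = new HashMap<>();
        for (Map.Entry<Integer, Long> e : endOffsets.entrySet()) {
            long committed = committedOffsets.getOrDefault(e.getKey(), 0L);
            // Clamp at 0 in case the two snapshots were taken at slightly different times.
            lag.put(e.getKey(), Math.max(0L, e.getValue() - committed));
        }
        return lag;
    }

    /** Returns the partition with the smallest lag; ties go to the lowest partition number. */
    static OptionalInt lowestLagPartition(Map<Integer, Long> lag) {
        int best = -1;
        long bestLag = Long.MAX_VALUE;
        for (Map.Entry<Integer, Long> e : lag.entrySet()) {
            int partition = e.getKey();
            long value = e.getValue();
            if (best == -1 || value < bestLag || (value == bestLag && partition < best)) {
                best = partition;
                bestLag = value;
            }
        }
        return best == -1 ? OptionalInt.empty() : OptionalInt.of(best);
    }
}
```

The producer could then pass the chosen partition explicitly when building the record, keeping in mind the coupling and staleness concerns above.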

I'd suggest tuning other consumer settings instead, such as poll size and poll frequency.
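For instance, a slow consumer could be given smaller batches. The sketch below only builds the configuration; the property keys are standard Java consumer settings, but the values are illustrative guesses rather than recommendations, and the class name SlowConsumerTuning is made up:

```java
import java.util.Properties;

/** Hypothetical helper: builds consumer settings aimed at a slow consumer. */
final class SlowConsumerTuning {

    static Properties consumerProps(String bootstrapServers, String groupId) {
        Properties props = new Properties();
        props.setProperty("bootstrap.servers", bootstrapServers);
        props.setProperty("group.id", groupId);
        // Fewer records per poll() (client default is 500) so each batch finishes sooner.
        props.setProperty("max.poll.records", "100");
        // Longest allowed gap between poll() calls before the group rebalances (5 minutes here).
        props.setProperty("max.poll.interval.ms", "300000");
        // Cap on the data fetched per partition per request (client default is 1 MiB).
        props.setProperty("max.partition.fetch.bytes", "262144");
        return props;
    }
}
```

The resulting Properties object can be passed to the KafkaConsumer constructor along with the usual deserializer settings.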

OneCricketeer
  • Thanks for your reply. In fact, I am looking for a way to let a consumer consume any pending messages, not only those in its assigned partitions. I found the Kafka REST Proxy, which seems to serve this purpose. However, consuming over HTTP performs much worse than the native client, and I'm not sure this is a typical use case for the Proxy. – zwush Jan 13 '20 at 12:53
  • The REST Proxy just wraps the normal native Java consumer. I don't imagine it would solve any issues you're experiencing. – OneCricketeer Jan 13 '20 at 20:49
  • If I create one consumer instance on the Proxy, then this proxy consumer listens to all the partitions. If I let all my consumer clients call the fetch data API on this instance, it looks like the clients can consume from any partition. Am I right? – zwush Jan 14 '20 at 08:02
  • The proxy manages multiple consumer instance threads on its own. You just use the API to create a consumer group, then pull the topic from one endpoint, which will not scale to multiple instances of your own app. – OneCricketeer Jan 17 '20 at 21:10
  • So it is not possible to use the API to pull from the same proxy consumer instance from different app instances, right? Thanks – zwush Jan 17 '20 at 22:23
  • In my opinion, the REST Proxy is really only meant to be used where you are unable to write a native consumer. – OneCricketeer Jan 18 '20 at 02:07