
I am writing a Kafka consumer and, for learning purposes, this time I thought of using the Spring-Kafka implementation. Until now, I have been using the Kafka Java API for writing consumers.

I want to manage the offsets manually, so I was searching for something similar to ConsumerRebalanceListener in the Spring-Kafka package. Fortunately, I came across ConsumerAwareRebalanceListener in Spring, which can be used instead of ConsumerRebalanceListener.

But when I looked at the ConsumerAwareRebalanceListener interface, I saw two methods - onPartitionsRevokedBeforeCommit and onPartitionsRevokedAfterCommit - which are not available in the Kafka Java API.

Can someone please explain how/where I can use these methods?

P.S. - I had a look at the Spring-Kafka implementation, but didn't quite understand where they would be useful.

Akhil

1 Answer


Spring Kafka has a message-driven consumer model; you provide a POJO message listener and the framework performs the poll and passes the messages to the listener, either one at a time or in a batch.
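For illustration, here is a minimal sketch of such a POJO listener (the class name, topic, and group id are placeholders, not taken from the question):

```java
import org.springframework.kafka.annotation.KafkaListener;
import org.springframework.stereotype.Component;

@Component
public class MyMessageListener {

    // The listener container performs the poll() and invokes this method
    // for each record; no consumer code is needed in the listener itself.
    @KafkaListener(topics = "my-topic", groupId = "my-group")
    public void listen(String message) {
        System.out.println("Received: " + message);
    }
}
```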

It has various modes for committing offsets (it prefers turning off enable.auto.commit in the client).

There are two modes for manual acks, AckMode.MANUAL and AckMode.MANUAL_IMMEDIATE; with these modes, the framework passes an Acknowledgment object to the listener bean and you call ack.acknowledge().
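A sketch of such a listener, assuming the container factory is configured with AckMode.MANUAL or MANUAL_IMMEDIATE (see the configuration sketch further down; names are placeholders):

```java
import org.springframework.kafka.annotation.KafkaListener;
import org.springframework.kafka.support.Acknowledgment;
import org.springframework.stereotype.Component;

@Component
public class ManualAckListener {

    @KafkaListener(topics = "my-topic", groupId = "my-group")
    public void listen(String message, Acknowledgment ack) {
        // process the record ...

        // signal that the offset for this record may be committed
        ack.acknowledge();
    }
}
```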

When the mode is MANUAL_IMMEDIATE, as long as you call acknowledge() on the consumer thread, the commit is performed directly on the consumer.

When the mode is MANUAL, the offset is added to an internal queue and the commits will be done at the end of processing the results of the poll.

Similarly, there are several "auto" ack modes, the main ones being RECORD and BATCH, where the container commits the offsets when the listener exits normally. In RECORD mode, a commit is sent after each record is processed; in BATCH mode, the commits are done after all the results of the poll have been handled.
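As a hedged sketch of how the ack mode is selected (assuming a Spring Kafka version where AckMode is nested in ContainerProperties; older versions expose it on AbstractMessageListenerContainer, and the bootstrap address is a placeholder):

```java
import java.util.HashMap;
import java.util.Map;

import org.apache.kafka.clients.consumer.ConsumerConfig;
import org.apache.kafka.common.serialization.StringDeserializer;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;
import org.springframework.kafka.annotation.EnableKafka;
import org.springframework.kafka.config.ConcurrentKafkaListenerContainerFactory;
import org.springframework.kafka.core.DefaultKafkaConsumerFactory;
import org.springframework.kafka.listener.ContainerProperties;

@Configuration
@EnableKafka
public class KafkaConsumerConfig {

    @Bean
    public ConcurrentKafkaListenerContainerFactory<String, String> kafkaListenerContainerFactory() {
        Map<String, Object> props = new HashMap<>();
        props.put(ConsumerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
        props.put(ConsumerConfig.KEY_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class);
        props.put(ConsumerConfig.VALUE_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class);
        // the container manages commits, so the client's auto-commit is turned off
        props.put(ConsumerConfig.ENABLE_AUTO_COMMIT_CONFIG, false);

        ConcurrentKafkaListenerContainerFactory<String, String> factory =
                new ConcurrentKafkaListenerContainerFactory<>();
        factory.setConsumerFactory(new DefaultKafkaConsumerFactory<>(props));
        // BATCH: commit after all records from a poll are processed;
        // use RECORD, MANUAL, or MANUAL_IMMEDIATE for the other modes discussed above
        factory.getContainerProperties().setAckMode(ContainerProperties.AckMode.BATCH);
        return factory;
    }
}
```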

Committing offsets in batches is more efficient, but increases the risk of duplicate deliveries.

We also commit any pending offsets when a rebalance occurs.

So, why the two onPartitionsRevoked* methods?

When using MANUAL, BATCH, or one of the other AckModes that might have pending offsets to commit, onPartitionsRevokedBeforeCommit() is called before those pending offsets are committed and onPartitionsRevokedAfterCommit() is called after those offsets are committed.

So, consumer.position() may return different results in each method.

Most people will be interested in onPartitionsRevokedAfterCommit() but we felt we should provide both options.

If you use AckMode.MANUAL_IMMEDIATE or AckMode.RECORD, there should be no difference since there will be no pending acks.

However, since the listener is called on the consuming thread, during a poll, there will really only be a difference when using one of the time-based or count-based AckModes; with the other AckModes we will already have committed the offsets.
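To illustrate, here is a minimal sketch of a ConsumerAwareRebalanceListener (the class name and the logging are illustrative only):

```java
import java.util.Collection;

import org.apache.kafka.clients.consumer.Consumer;
import org.apache.kafka.common.TopicPartition;
import org.springframework.kafka.listener.ConsumerAwareRebalanceListener;

public class MyRebalanceListener implements ConsumerAwareRebalanceListener {

    @Override
    public void onPartitionsRevokedBeforeCommit(Consumer<?, ?> consumer,
            Collection<TopicPartition> partitions) {
        // called before the container commits any pending offsets for the revoked partitions;
        // the consumer can be inspected (or offsets committed externally) here
        System.out.println("Revoked (before commit): " + partitions);
    }

    @Override
    public void onPartitionsRevokedAfterCommit(Consumer<?, ?> consumer,
            Collection<TopicPartition> partitions) {
        // called after any pending offsets have been committed
        System.out.println("Revoked (after commit): " + partitions);
    }

    @Override
    public void onPartitionsAssigned(Consumer<?, ?> consumer,
            Collection<TopicPartition> partitions) {
        // e.g. seek to externally stored offsets here
        System.out.println("Assigned: " + partitions);
    }
}
```

Such a listener would be registered on the container properties, e.g. `factory.getContainerProperties().setConsumerRebalanceListener(new MyRebalanceListener());`.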

Hope that's clear.

Gary Russell
  • Thanks for the detailed answer Gary! I am using BatchAcknowledgingConsumerAwareMessageListener for consuming records and the AckMode.MANUAL option for managing the offsets. Just to be clear, in MANUAL or batch processing mode, the programmer doesn't have to worry about committing pending offsets in the event of a rebalance? Will the framework handle committing pending offsets in the event of a rebalance? Please note, in my listener class, I am calling ack.acknowledge() after all the messages (which were received as part of the last poll) are successfully processed. – Akhil Apr 10 '19 at 20:27
  • I am maintaining a map of offsets for each (topic, partition) combination inside my listener, and the same is used in the onPartitionsRevokedBeforeCommit method to commit any processed offsets (using consumer.commitSync(map)). – Akhil Apr 10 '19 at 20:27
  • The answer for above comments can be found here - https://stackoverflow.com/questions/55627430/spring-kafka-concurrency-property/55633777#55633777 – Akhil Apr 11 '19 at 16:37