
I'm trying to use a transaction in a Kafka Processor to make sure I don't process the same message twice. Given a message (A), I need to create a list of messages that will be produced to another topic in a transaction, and I want to commit the original message (A) in the same transaction. From the documentation I found the Producer method `sendOffsetsToTransaction`, which seems to commit an offset within a transaction only if the transaction succeeds. This is the code inside the `process()` method of my Processor:

    // Produce the derived messages and commit the consumed offset atomically
    producer.beginTransaction()
    val topicPartition    = new TopicPartition(this.context().topic(), this.context().partition())
    val offsetAndMetadata = new OffsetAndMetadata(this.context().offset())
    val map               = Map(topicPartition -> offsetAndMetadata).asJava
    producer.sendOffsetsToTransaction(map, "consumer-group-id")
    items.foreach(x => producer.send(new ProducerRecord("items_topic", x.key, x.value)))
    producer.commitTransaction()
    // Deliberately fail after the commit to test reprocessing behaviour
    throw new RuntimeException("expected exception")

Unfortunately, with this code (which obviously fails on every execution) the processed message (A) is reprocessed every time I restart the application after the exception.

I managed to make it work by adding 1 to the offset returned by `this.context().offset()` and redefining `val offsetAndMetadata` this way:

val offsetAndMetadata = new OffsetAndMetadata(this.context().offset() + 1)

Is this the normal behaviour, or am I doing something wrong?

Thank you :)

2 Answers


Your code is correct.

The offsets you commit are the offsets of the messages you want to read next (not the offsets of the messages you did read last).

Compare: https://github.com/apache/kafka/blob/41e4e93b5ae8a7d221fce1733e050cb98ac9713c/streams/src/main/java/org/apache/kafka/streams/processor/internals/StreamTask.java#L346
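The off-by-one can be checked without a broker. The sketch below is plain Scala with no Kafka dependency; `commitOffsetFor` and `replayedFrom` are hypothetical helpers (not Kafka API) that simulate which records a restarted consumer sees for a given committed offset:

    // The offset you commit is the NEXT offset to read, not the last one read.
    def commitOffsetFor(lastConsumedOffset: Long): Long = lastConsumedOffset + 1

    // Which records a restarted consumer replays, given the committed offset.
    def replayedFrom(committed: Long, log: Seq[Long]): Seq[Long] =
      log.filter(_ >= committed)

    val log = Seq(40L, 41L, 42L)
    // Committing context.offset() itself (41) replays record 41 after a restart:
    //   replayedFrom(41L, log) == Seq(41L, 42L)
    // Committing context.offset() + 1 (42) resumes past it:
    //   replayedFrom(commitOffsetFor(41L), log) == Seq(42L)

This is exactly why the `+ 1` in the question is required: the committed offset marks the resume point, not the last processed record.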

Dmitry Minkovsky
Matthias J. Sax
  • Thank you Matthias :) it seems you are right! However, if this is the intended behaviour of `sendOffsetsToTransaction`, I think the documentation is a bit misleading. – Simone Esposito Sep 14 '17 at 07:26
  • I just double checked the JavaDoc and also compared with `KafkaConsumer#commit` docs -- I agree and will raise a PR to fix it for next release. Thx! – Matthias J. Sax Sep 14 '17 at 08:11
  • @MatthiasJ.Sax: Did you mean "messages that you want to *write* next"? – miguno Sep 14 '17 at 10:37
  • @MichaelG.Noll It's about the "consumed offsets" part -- the offsets you need to commit are the offsets you want to read next, i.e., committed-offset == lastConsumedOffset + 1. KafkaConsumer does document this nicely in the JavaDocs of commitSync()/commitAsync(). – Matthias J. Sax Sep 14 '17 at 15:20

Instead of adding 1 to the offset you can use:

    val newOffset = consumer.position(topicPartition)

This returns the offset of the next record that will be fetched, which is one larger than the highest offset the consumer has seen in that partition.
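To illustrate why `position` already gives the right value to commit, here is a tiny broker-free sketch. `SimulatedConsumer` is a hypothetical stand-in (not the Kafka API) that mimics the documented semantics of `KafkaConsumer#position`:

    // Mimics KafkaConsumer#position: the next offset to fetch,
    // i.e. one past the highest offset seen so far.
    final class SimulatedConsumer {
      private var highestSeen: Long = -1L
      def record(offset: Long): Unit =       // "consume" a record
        highestSeen = math.max(highestSeen, offset)
      def position: Long = highestSeen + 1   // what position() would return
    }

    val c = new SimulatedConsumer
    c.record(41L)
    // c.position is 42L, the same value as context.offset() + 1

So committing `consumer.position(topicPartition)` and committing `context.offset() + 1` are equivalent; the former just avoids the hand-written arithmetic.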

Pranavi Chandramohan