
What is the behaviour of Kafka Streams (using the low-level API) when auto commit is turned off and the application doesn't commit explicitly?

If the application is restarted (auto commit off, and the application itself never commits offsets explicitly), will it always read from the beginning? What will the behaviour of the application be?

CuriousMind

2 Answers


Kafka Streams automatically sets `enable.auto.commit` to `false` for the consumers it uses internally.
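To make the effect of that override concrete, here is a minimal sketch in plain Java. It is a simplified model, not the actual `StreamsConfig` code: the class and method names are hypothetical, and how Streams reacts to a user-supplied value (for example, whether it logs a warning) is not modeled.

```java
import java.util.HashMap;
import java.util.Map;

// Simplified model (hypothetical names): whatever the user supplies for
// enable.auto.commit, the config handed to the internal consumer says "false".
final class AutoCommitOverrideSketch {

    static Map<String, Object> consumerConfigs(Map<String, Object> userConfigs) {
        // Start from a copy of the user's settings so the input is not mutated.
        Map<String, Object> effective = new HashMap<>(userConfigs);
        // The override always wins over the user's value.
        effective.put("enable.auto.commit", "false");
        return effective;
    }

    public static void main(String[] args) {
        Map<String, Object> user = new HashMap<>();
        user.put("enable.auto.commit", "true");
        System.out.println(consumerConfigs(user).get("enable.auto.commit"));
    }
}
```

So setting the flag yourself makes no difference to the value the consumer ends up with.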

What is your `auto.offset.reset`? If it is `latest`, the application will always start at the latest offset when no consumer group exists yet for the `application.id`. Streams isn't special here; this is the same logic as for any consumer group.
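The rule can be sketched as a small function. This is an illustrative model in plain Java, not Kafka client code; the class, method, and parameter names are all hypothetical.

```java
import java.util.OptionalLong;

// Illustrative model (hypothetical names) of how a starting offset is chosen
// for one partition: a committed offset wins, otherwise auto.offset.reset decides.
final class StartPositionSketch {

    static long startOffset(OptionalLong committedOffset, String autoOffsetReset,
                            long logStartOffset, long logEndOffset) {
        if (committedOffset.isPresent()) {
            // The group already has a committed offset: resume from it.
            return committedOffset.getAsLong();
        }
        switch (autoOffsetReset) {
            case "earliest":
                return logStartOffset;
            case "latest":
                return logEndOffset;
            default:
                // Models any other setting as "no fallback available".
                throw new IllegalStateException(
                    "No committed offset and auto.offset.reset=" + autoOffsetReset);
        }
    }

    public static void main(String[] args) {
        System.out.println(startOffset(OptionalLong.of(42), "latest", 0, 100));   // 42
        System.out.println(startOffset(OptionalLong.empty(), "latest", 0, 100));  // 100
        System.out.println(startOffset(OptionalLong.empty(), "earliest", 0, 100)); // 0
    }
}
```

In other words, `auto.offset.reset` only matters when there is nothing committed for the group.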

If there is a group and you start a topology that ends in a terminal operation (foreach, print, to, etc.), then the offsets will be committed. You can also enable transactional processing to get exactly-once semantics.


Most of this is covered in the docs.

OneCricketeer
  • Thanks for your reply. I didn't get this part: "Kafka Streams automatically sets auto commits to disabled." Does every Kafka Streams application turn it off, irrespective of what has been specified for `enable.auto.commit`? – CuriousMind May 02 '20 at 20:28
  • Yes. [Docs say so, at least](https://kafka.apache.org/25/documentation/streams/developer-guide/config-streams#enable-auto-commit) – OneCricketeer May 03 '20 at 03:36

Kafka Streams commits offsets based on the `commit.interval.ms` config (default is 30 seconds). Thus, even if you never request a commit, commits happen regularly. In general, it's sufficient to rely on Kafka Streams' implicit commits (requesting commits explicitly is not necessary for most applications).
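As a rough illustration of where that setting lives, here is a minimal sketch that builds the configuration with plain `java.util.Properties` and string keys. The application id, broker address, and the 10-second interval are placeholder assumptions, and the class name is hypothetical; in a real application these properties would be handed to Kafka Streams at startup.

```java
import java.util.Properties;

// Sketch (hypothetical class name): configuration for a Streams application
// that relies on periodic implicit commits rather than explicit ones.
final class StreamsCommitConfigSketch {

    static Properties build() {
        Properties props = new Properties();
        // Placeholder values, not taken from the question.
        props.setProperty("application.id", "my-streams-app");
        props.setProperty("bootstrap.servers", "localhost:9092");
        // Commit roughly every 10 seconds instead of the 30-second default.
        props.setProperty("commit.interval.ms", "10000");
        // Note: enable.auto.commit is deliberately not set here; per the
        // discussion, Streams controls it for its internal consumers.
        return props;
    }

    public static void main(String[] args) {
        System.out.println(build().getProperty("commit.interval.ms"));
    }
}
```

Lowering the interval trades more frequent commits for less reprocessing after a restart.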

Matthias J. Sax
  • Thanks for your answer. I am asking what the behaviour of the application would be when the parameter `enable.auto.commit` is set to false and the application isn't doing any commits. Will the committed offset be undefined? – CuriousMind May 02 '20 at 22:05
  • `enable.auto.commit` is a consumer config and has no impact on Kafka Streams, which will always commit based on `commit.interval.ms`. (In fact, to give Kafka Streams full control over when commits happen, `enable.auto.commit` is always set to `false` for the internal consumers Kafka Streams uses, and you cannot even enable it.) – Matthias J. Sax May 03 '20 at 01:33
  • Thanks for your answer. So the flag `enable.auto.commit` isn't applicable to Kafka Streams. Could you please point me to the source code where it overrides this flag for the internal client? – CuriousMind May 03 '20 at 16:29
  • Correct (as documented in the JavaDocs https://github.com/apache/kafka/blob/trunk/streams/src/main/java/org/apache/kafka/streams/StreamsConfig.java#L118). The overwrite happens here: https://github.com/apache/kafka/blob/trunk/streams/src/main/java/org/apache/kafka/streams/StreamsConfig.java#L1003 – Matthias J. Sax May 03 '20 at 22:10
  • I have posted a new question; if you could answer it, that would be a great help. https://stackoverflow.com/questions/61622414/kafka-stateful-stream-processor-with-statestore-behind-the-scenes – CuriousMind May 05 '20 at 20:35