
I want to know how Kafka would handle this situation. A consumer has come across a poison pill message and is not committing past it. No one notices for a long time (15 days). The retention period on the topic is 7 days. Let's say that this poison pill is in a log segment file that has satisfied the requirements to be deleted by the retention period.

What happens?

  1. Does Kafka allow this log segment file to be deleted while a consumer is actively trying to read from it?
  2. Does Kafka delete the log segment file and leave the consumer scrambling to figure out where to start reading from, based on the auto.offset.reset setting?
twindham
    It is option 2. This is why monitoring consumer lag in your stream processing pipeline is really important. Otherwise you may _lose data_, in the sense that the data was never processed by your consuming application. – user152468 Nov 05 '20 at 08:04
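To make the comment's advice concrete: consumer lag per partition is the gap between the partition's log end offset and the group's last committed offset. A lag that keeps growing on one partition is exactly the signature of a consumer stuck on a poison pill. This is a minimal illustrative sketch (plain Python with made-up offsets, not the Kafka client API — in practice you would read these values from `kafka-consumer-groups.sh` or a metrics exporter):

```python
def consumer_lag(log_end_offsets, committed_offsets):
    """Per-partition lag: log end offset minus last committed offset.

    A lag that only grows on one partition suggests the consumer is
    stuck (e.g. on a poison pill) and is not committing past it.
    """
    return {
        partition: log_end_offsets[partition] - committed_offsets.get(partition, 0)
        for partition in log_end_offsets
    }

# Hypothetical numbers: partition 0 is stuck at offset 100 while the
# log keeps growing; partition 1 is healthy and nearly caught up.
log_end = {0: 5000, 1: 5200}
committed = {0: 100, 1: 5195}
print(consumer_lag(log_end, committed))  # {0: 4900, 1: 5}
```

Alerting on a lag threshold (or on lag that never decreases) would have surfaced the stuck consumer well inside the 7-day retention window.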

1 Answer


It'll be option 2. You can find logs on the consumer instances indicating that it is seeking to the beginning/end, or, if auto.offset.reset=none, the consumer will instead fail with an error saying the offset is out of range.
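The behavior the answer describes can be sketched as a toy model (plain Python, not the real client; the function name and offsets are made up for illustration). When the consumer's committed offset falls outside the range of offsets the broker still retains, the auto.offset.reset policy decides what happens next:

```python
def resolve_fetch_offset(requested, log_start_offset, log_end_offset,
                         auto_offset_reset):
    """Toy model of the consumer's reaction to an out-of-range offset.

    If the requested offset still exists, it is used as-is. Otherwise
    the auto.offset.reset policy applies: 'earliest' seeks to the oldest
    retained offset, 'latest' skips to the end, and 'none' fails.
    """
    if log_start_offset <= requested <= log_end_offset:
        return requested                 # offset still retained: read normally
    if auto_offset_reset == "earliest":
        return log_start_offset          # seek to oldest retained record
    if auto_offset_reset == "latest":
        return log_end_offset            # skip ahead; unread records are lost
    raise ValueError(f"offset {requested} out of range")  # policy 'none'

# Committed offset 100 was in a segment retention deleted;
# the log now starts at 4000 and ends at 5200.
print(resolve_fetch_offset(100, 4000, 5200, "earliest"))  # 4000
print(resolve_fetch_offset(100, 4000, 5200, "latest"))    # 5200
```

With either reset policy, the records between offset 100 and 4000 are gone as far as this consumer group is concerned, which is why the comment above stresses lag monitoring. (In the real Java client, the `none` case surfaces as an offset-out-of-range exception rather than a plain `ValueError`.)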

OneCricketeer