21

I am new to Kafka and trying to understand whether there is a way to read messages from the last consumed offset rather than from the beginning.

Here is an example case to make my intention clear.

E.g.:
1) I produced 5 messages at 7:00 PM and the console consumer consumed them.
2) I stopped the consumer at 7:10 PM.
3) I produced 10 messages at 7:20 PM. No consumer read those messages.
4) I started the console consumer again at 7:30 PM, without --from-beginning.
5) It now reads only the messages produced after it started, not the earlier ones that were produced at 7:20 PM.

Is there a way to get the messages produced since the last consumed offset?

Srini

3 Answers

13

I am new to Kafka and trying to understand whether there is a way to read messages from the last consumed offset rather than from the beginning.

Yes, it is possible to use the console consumer to read from the last consumed offset. You have to add the --consumer.config flag when invoking kafka-console-consumer.

Example:

[root@sandbox bin]# ./kafka-console-consumer.sh --topic test1 --zookeeper localhost:2181 --consumer.config /home/mrnakumar/consumer.properties

Here /home/mrnakumar/consumer.properties is a file containing the group.id. This is what it looks like:

group.id=consoleGroup

Without consumer.config, it is only possible to read either from the beginning of the log (by using --from-beginning) or from the end of the log. Reading from the end of the log means receiving only the messages published after the consumer starts.

mrnakumar
  • Yes, if we give any group ID, the data is read from the last consumed point. If we run without a group ID, only the data produced after the consumer started is considered. Thank you. – Srini Nov 15 '15 at 09:30
  • Is it also possible to commit offsets via console consumer for a specific consumer group? – clausmc Sep 19 '16 at 13:00
  • This does not work unless `auto.offset.reset=earliest` is set in consumer.properties. – STaefi Jul 22 '19 at 12:03
10

Setting auto.offset.reset=earliest together with a fixed group.id=something in the consumer config will start the consumer at the last committed offset; auto.offset.reset itself only takes effect when the group has no committed offset yet. In your case it should start consuming at the first message produced at 7:20. If you want it to read only messages posted AFTER it starts, then auto.offset.reset=latest, used with a group that has no committed offset, will ignore the 10 messages sent at 7:20 and read only those that come in after it starts.

If you want it to start at the beginning, you must either call seekToBeginning after the first consumer.poll(), or change the consumer group ID to something unique.
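
The rule described above can be sketched as a small model. This is not Kafka's API: `resolve_start_offset` and its parameters are hypothetical names, and the sketch assumes a single partition whose log holds offsets `log_start` up to (but not including) `log_end`. It only illustrates how a committed offset and auto.offset.reset interact.

```python
from typing import Optional


def resolve_start_offset(
    committed: Optional[int],
    auto_offset_reset: str,
    log_start: int,
    log_end: int,
) -> int:
    """Return the offset a consumer would start reading from (simplified model)."""
    # A usable committed offset for this group.id wins over auto.offset.reset.
    if committed is not None and log_start <= committed <= log_end:
        return committed
    # No usable committed offset: fall back to the reset policy.
    if auto_offset_reset == "earliest":
        return log_start
    if auto_offset_reset == "latest":
        return log_end
    raise ValueError(f"unsupported auto.offset.reset: {auto_offset_reset!r}")


# Scenario from the question: 5 messages consumed and committed (offsets 0-4),
# then 10 more produced while the consumer was down (offsets 5-14).
LOG_START, LOG_END = 0, 15

# Same group.id as before: resumes at offset 5, the first 7:20 PM message.
resumed = resolve_start_offset(5, "latest", LOG_START, LOG_END)

# Brand-new group.id (nothing committed) with earliest: starts at offset 0.
fresh_earliest = resolve_start_offset(None, "earliest", LOG_START, LOG_END)

# Brand-new group.id with latest: skips everything already in the log.
fresh_latest = resolve_start_offset(None, "latest", LOG_START, LOG_END)

print(resumed, fresh_earliest, fresh_latest)  # prints: 5 0 15
```

In this model, changing the group ID to something unique is simply the case where `committed` is `None`, which is why the reset policy then decides where reading begins.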

Giorgos Myrianthous
George Smith
  • Just to be sure that I understand well. When you say: "If you want it to start at the beginning, change the consumer group ID to something unique" Does that mean that when we want to read the latest, we have to use a group ID that already exists ? – Özgün Jul 16 '20 at 15:52
  • for me, this starts from the beginning every time as it is supposed to do. – mariq vlahova Oct 07 '21 at 17:22
5

You should set the auto.offset.reset parameter in your consumer config to largest (the old consumer's name for this value; the newer consumer calls it latest), so it will read all messages after the last committed offset.

codejitsu
  • @Srini you don't have to set this property to a numerical value; its value should be 'largest' in order to consume from the end of your stream. – codejitsu Nov 13 '15 at 05:25
  • worked for me. Thanks! I just added to the consumer.properties file the following line: auto.offset.reset=largest – Ofer Eliassaf Mar 30 '16 at 12:27
  • Can someone explain why largest? earliest works fine: if the consumer is not running when those 10 messages are sent, then when it is turned on with auto.offset.reset=earliest it consumes all 10 messages that had not been consumed. – driven_spider Dec 20 '19 at 12:46
  • @codejitsu In the latest Kafka docs, there is no `largest` value in `auto.offset.reset` configuration parameter? Ref - https://kafka.apache.org/documentation/#consumerconfigs_auto.offset.reset – Ashik Mydeen Mar 31 '22 at 10:38