I am using Kafka Utils to test a Kafka-based messaging system. I want to find the number of messages in a particular topic without using the kafka-console-consumer.sh script. I can't find a KafkaTestUtils-based way, or any other way in Java, to do this. None of the answers to similar questions have helped so far.
1 Answer
The following should work:
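import java.time.Duration;
import java.util.Collections;
import java.util.Properties;
import java.util.concurrent.atomic.AtomicLong;
import org.apache.kafka.clients.consumer.KafkaConsumer;

Properties properties = ...
// omitted for the sake of brevity
KafkaConsumer<String, String> consumer = new KafkaConsumer<>(properties);
// subscribe() expects a collection of topic names, not a single String
consumer.subscribe(Collections.singletonList(topic));
// poll once so that partitions get assigned to this consumer
consumer.poll(Duration.ofSeconds(10L)); // or some time
AtomicLong count = new AtomicLong();
// add the end offset of every assigned partition
consumer.endOffsets(consumer.assignment()).forEach((topicPartition, endOffsetOfPartition) -> {
    count.addAndGet(endOffsetOfPartition);
});
// subtract the beginning offsets in case of retention, as pointed out by Mickael
consumer.beginningOffsets(consumer.assignment()).forEach((topicPartition, startOffsetOfPartition) -> {
    count.addAndGet(-startOffsetOfPartition);
});
System.out.println(count.get());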
This adds up the end offsets of all partitions and then subtracts their beginning offsets, because the number of messages in a topic equals the sum of the number of messages in each of its partitions. Subtracting the beginning offsets accounts for records that have already been deleted by retention.
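To make the arithmetic concrete, here is a minimal sketch of the per-partition calculation in plain Java, with no Kafka dependency. The class `OffsetMath`, the method `countRecords`, and the use of integer partition ids as map keys are assumptions made for illustration; in the real code the maps would be the `Map<TopicPartition, Long>` values returned by `endOffsets()` and `beginningOffsets()`.

```java
import java.util.Map;

class OffsetMath {
    /**
     * Sums (endOffset - beginningOffset) over all partitions.
     * Keys are partition ids, a hypothetical stand-in for TopicPartition.
     * A partition missing from beginningOffsets is treated as starting at 0.
     */
    static long countRecords(Map<Integer, Long> beginningOffsets,
                             Map<Integer, Long> endOffsets) {
        long total = 0L;
        for (Map.Entry<Integer, Long> entry : endOffsets.entrySet()) {
            long begin = beginningOffsets.getOrDefault(entry.getKey(), 0L);
            total += entry.getValue() - begin;
        }
        return total;
    }

    public static void main(String[] args) {
        // Partition 0 has lost its first 5 records to retention; partition 1 has lost none.
        Map<Integer, Long> begin = Map.of(0, 5L, 1, 0L);
        Map<Integer, Long> end = Map.of(0, 12L, 1, 7L);
        System.out.println(countRecords(begin, end)); // (12 - 5) + (7 - 0) = 14
    }
}
```

Note that offsets count positions in the log, so on compacted topics or topics written with transactions the result may overstate the number of records a consumer would actually read.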

Mickael Maison

JavaTechnical
- Is it not necessary to add `config.put(ConsumerConfig.AUTO_OFFSET_RESET_CONFIG, "earliest");`? If the consumer starts after the senders, it might not see all messages in the topic and the count would be wrong. – grog Sep 04 '19 at 12:59
- @grog I omitted all the properties. But I think we are not consuming, only getting the end offsets. Is it really important? – JavaTechnical Sep 04 '19 at 13:00
- I am unsure, that's why I asked. I do a similar test, and if I do not set that property, the default value makes it so the consumer does not see any messages and the test fails. I do, however, also check the content of the message, so maybe in this case it's not needed. Just double-checking. – grog Sep 04 '19 at 13:04
- @grog It worked for me even if the config is not set to `earliest`, but `poll` seems to be important though. – JavaTechnical Sep 04 '19 at 13:33
- Once some records have been deleted due to retention limits, the count you'll get will be incorrect. You need to take `beginningOffsets()` into account. – Mickael Maison Sep 04 '19 at 14:08
- @MickaelMaison You mean to say we should subtract beginningOffsets() from endOffsets() for each partition? – JavaTechnical Sep 04 '19 at 14:16
- @JavaTechnical Yes, that will tell you how many records are currently in the topic. – Mickael Maison Sep 04 '19 at 17:27