I have a Kafka consumer group consuming several topics (each topic has more than one partition). Every partition of every topic holds a considerable number of records. I'm trying to make sense of the behavior when the consumer first starts consuming. In particular, I'd like to know how the broker decides which records reach the client first.
The following aspects are noteworthy:
- There are a lot more records than the consumer can process in a single round trip (i.e. more records than the consumer's `max.poll.records` configuration)
- There are records from several topics and several partitions that the consumer has to read
I naively assumed that the broker returns records for each topic on every poll, so that the consumer reads all the topics at a similar pace. This doesn't seem to be the case, though. Apparently it prioritizes records for a single topic at a time, switching topics without an obvious pattern (at least that's what I'm seeing in the metrics of my consumer).
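One way to make this observation concrete is to count, for each batch returned by a poll, how many records belong to each topic. The following is only a minimal, stdlib-only sketch of such a tally: `PollTally` and `tallyByTopic` are hypothetical names, the topic names in `main` are made up, and in a real consumer the topic of each record would presumably come from `ConsumerRecord.topic()`.

```java
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

public class PollTally {
    // Hypothetical helper: counts how many records of one poll batch belong to each topic.
    // The input is the topic name of every record in the batch, in the order received.
    static Map<String, Integer> tallyByTopic(List<String> topicOfEachRecord) {
        Map<String, Integer> counts = new TreeMap<>(); // sorted by topic name for stable output
        for (String topic : topicOfEachRecord) {
            counts.merge(topic, 1, Integer::sum);
        }
        return counts;
    }

    public static void main(String[] args) {
        // Made-up batch for illustration only: one topic dominates the batch.
        List<String> batch = List.of("orders", "orders", "orders", "payments");
        System.out.println(tallyByTopic(batch)); // prints {orders=3, payments=1}
    }
}
```

Logging this map once per poll would show whether a batch is dominated by one topic or spread evenly across all of them.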
I couldn't find anything in the consumer config parameters that allows me to change this behavior. It's not really a problem, because all messages get read eventually. But I would like to understand the behavior in more detail.
So my question is: how does the broker decide which records end up in the result of a consumer's poll?