
Our production Kafka cluster runs on AWS MSK with 3 brokers of type kafka.m5.large. Some of our topics have 36 partitions with a replication factor of 3.

We often notice that, for some unknown reason, the number of under-replicated partitions (URPs) grows and the lagging replicas never catch up until we do a manual rolling restart of the brokers. Sometimes even restarting the brokers does not solve the issue.

What could be the root cause of this issue and how can it be prevented? I assume this is not due to some faulty consumer group that reads from the topic?

OneCricketeer
aveek
  • Consumer groups don't affect broker replication. There could be many reasons, such as slow disks, slow network, or just high volume of data and low thresholds for reporting URPs – OneCricketeer Oct 10 '22 at 17:13
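For context on what the URP count measures: a partition is under-replicated when its in-sync replica set (ISR) is smaller than its assigned replica set, which is a broker-side condition independent of consumers. The sketch below is only an illustration of that definition, not MSK or Kafka client code. The `PartitionState` type, the topic name `orders`, and the sample broker ids are all hypothetical; in practice the replica and ISR lists would come from the cluster's own metadata.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, List, Tuple


@dataclass(frozen=True)
class PartitionState:
    """Hypothetical snapshot of one partition's replica assignment."""
    topic: str
    partition: int
    replicas: Tuple[int, ...]  # broker ids assigned to hold a copy
    isr: Tuple[int, ...]       # broker ids currently in sync with the leader


def under_replicated(partitions: Iterable[PartitionState]) -> List[PartitionState]:
    """Return partitions whose ISR is smaller than the assigned replica set."""
    return [p for p in partitions if len(set(p.isr)) < len(set(p.replicas))]


def lagging_by_broker(partitions: Iterable[PartitionState]) -> Dict[int, int]:
    """Count, per broker, the partitions for which it has fallen out of the ISR."""
    counts: Dict[int, int] = {}
    for p in under_replicated(partitions):
        for broker in set(p.replicas) - set(p.isr):
            counts[broker] = counts.get(broker, 0) + 1
    return counts


# Made-up sample: 3 brokers, replication factor 3, broker 3 lagging on two partitions.
sample = [
    PartitionState("orders", 0, (1, 2, 3), (1, 2, 3)),
    PartitionState("orders", 1, (2, 3, 1), (2, 1)),
    PartitionState("orders", 2, (3, 1, 2), (1, 2)),
]
```

If the lagging replicas cluster on one broker, as in the made-up sample, that points toward a problem local to that broker (disk or network); if they are spread evenly, overall load is a more likely suspect. That reading is a heuristic, not a diagnosis.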

0 Answers