I am using Kafka 2.0.0.

In the __consumer_offsets topic, most partitions are about 30 MB, but some are very large. For example, one partition is 15 GB and another is 250 GB.

What could be the problem?

xRobot

1 Answer

The topic __consumer_offsets stores the latest committed offset for each subscribed TopicPartition of a Kafka consumer group. In this topic, the consumer group ID serves as the partitioning key, so all offset commits from one group land in the same partition.

Apparently, the consumer groups that fall into the large partitions (by the hash(key) % #partitions logic) are much more active (consuming more messages and committing offsets more frequently) than your other consumer groups.
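To see which groups share a partition, the hash(key) % #partitions step can be sketched as below. This is a rough illustration, not the broker's code: it assumes the broker computes the partition as the absolute value of Java's `String.hashCode()` of the group ID, modulo `offsets.topic.num.partitions` (default 50). The helper names and the sample group names are hypothetical.

```python
from collections import defaultdict

INT_MIN = -2**31


def java_string_hash(s: str) -> int:
    """Emulate Java's String.hashCode(): 32-bit signed math over UTF-16 code units."""
    h = 0
    data = s.encode("utf-16-be")
    for i in range(0, len(data), 2):
        unit = (data[i] << 8) | data[i + 1]
        h = (31 * h + unit) & 0xFFFFFFFF
    # Reinterpret the unsigned 32-bit value as a signed int, as Java would.
    return h - 0x100000000 if h >= 0x80000000 else h


def offsets_partition_for(group_id: str, num_partitions: int = 50) -> int:
    """Assumed mapping: abs(hashCode(group_id)) % num_partitions.

    Integer.MIN_VALUE has no positive counterpart in 32 bits, so it is
    mapped to 0 here (assumption about how the broker handles that case).
    """
    h = java_string_hash(group_id)
    positive = 0 if h == INT_MIN else abs(h)
    return positive % num_partitions


def groups_by_partition(group_ids, num_partitions: int = 50):
    """Group the given consumer group IDs by their assumed target partition."""
    result = defaultdict(list)
    for gid in group_ids:
        result[offsets_partition_for(gid, num_partitions)].append(gid)
    return dict(result)


if __name__ == "__main__":
    # Hypothetical group names; replace with the groups in your cluster.
    sample = ["billing-service", "audit-service", "metrics-collector"]
    for partition, gids in sorted(groups_by_partition(sample).items()):
        print(partition, gids)
```

Running this against your real group IDs shows which groups map to the oversized partitions. Those are the ones whose commit frequency is worth checking.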

Michael Heil
  • Is there a way to have more balanced partitions for that topic? Thanks :) – xRobot Apr 23 '21 at 08:34
  • The only thing I can imagine would be to use more diverse consumer group names. Otherwise, I do not see any other option here. If the size becomes too big, have a look at [kafka __consumer_offsets topic logs rapidly growing in size reducing disk space](https://stackoverflow.com/questions/61956217/kafka-consumer-offsets-topic-logs-rapidly-growing-in-size-reducing-disk-space/61957100#61957100) – Michael Heil Apr 23 '21 at 08:39
  • Another option is to delete idle consumer groups; see [Kafka: Delete idle consumer group id](https://stackoverflow.com/questions/64137375/kafka-delete-idle-consumer-group-id) – Michael Heil Apr 23 '21 at 08:40
  • Could it also be a problem with Kafka 2.0.0? Would upgrading the cluster to a newer version of Kafka be a solution? Thanks :) – xRobot Apr 23 '21 at 08:45
  • I do not think so. The logic to distribute messages across partitions within a topic did not change over time. – Michael Heil Apr 23 '21 at 08:47
  • I also noticed that some partitions of that topic contain hundreds of .log and .index files instead of only 2. Is that normal? – xRobot Apr 23 '21 at 08:50
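  • Yes, this is normal because the partitions differ in size. Kafka splits each partition into segments, each with its own .log and .index file, so a larger partition has more of them. – Michael Heil Apr 23 '21 at 08:58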