
I have a Kafka cluster with 3 brokers and 3 ZooKeeper nodes running. We recently added a 4th broker. When we brought it up as a new cluster, a few partitions were stored on the 4th broker as expected. The replication factor for all topics is 3, and each topic has 10 partitions.

Later, whenever we bring the whole Kafka cluster down for maintenance and bring it back up, all topic partitions end up on the first 3 brokers and no partitions are stored on the 4th broker. (Note: due to a bug, we had to use a new log directory every time Kafka was brought up, so it is pretty much like a new cluster each time.)

I can see that all 4 brokers are registered in ZooKeeper (when I do ls /brokers/ids I can see 4 broker ids), but partitions are not distributed to the 4th broker.

But when I triggered a partition reassignment to move a few partitions to the 4th broker, it worked fine and the 4th broker started storing the given partitions. Both the producer and the consumer were able to send and fetch data from the 4th broker. I can't find the reason why this storage imbalance is happening among the Kafka brokers. Please share your suggestions.


1 Answer


When we brought it up as a new cluster, a few partitions were stored on the 4th broker as expected.

This should only be expected when you create new topics or expand partitions of existing ones. Existing topics do not automatically relocate to new brokers.
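For example, a topic created after the 4th broker has joined the cluster will have its replicas spread over all live brokers. A minimal sketch (the topic name, broker address, and counts below are placeholders):

    # Create a new topic; Kafka assigns its replicas across the brokers
    # that are registered at creation time, including the new one.
    bin/kafka-topics.sh --bootstrap-server localhost:9092 \
      --create --topic test-topic --partitions 10 --replication-factor 3

    # Inspect which brokers actually hold the replicas.
    bin/kafka-topics.sh --bootstrap-server localhost:9092 \
      --describe --topic test-topic

Topics that already existed before the broker was added keep their old assignment until you reassign them explicitly.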

had to use a new log directory every time Kafka is brought up

That might explain why data is missing. It's unclear what bug you're running into, but this step shouldn't be necessary.
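For reference, each Kafka log directory contains a meta.properties file that records the broker and cluster ids, roughly like this (the ids below are made up):

    # <log.dirs>/meta.properties
    version=0
    broker.id=4
    cluster.id=K3kl9QpLQxyMvN1example

If you point a broker at a fresh, empty log directory, a new meta.properties is written, but any partition data in the old directory is no longer served by that broker, so the cluster effectively starts over.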

when I triggered a partition reassignment to move a few partitions to the 4th broker, it worked fine and the 4th broker started storing the given partitions. Both the producer and the consumer were able to send and fetch data from the 4th broker

This is the correct way to expand a cluster, and it sounds like it's working as expected.
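A sketch of that reassignment flow with the standard tool (topic names, broker ids, and file names are placeholders; older releases use --zookeeper instead of --bootstrap-server):

    # topics.json lists the topics to spread over the new broker set, e.g.
    # {"version": 1, "topics": [{"topic": "test-topic"}]}

    # Generate a candidate assignment that includes broker 4.
    bin/kafka-reassign-partitions.sh --bootstrap-server localhost:9092 \
      --topics-to-move-json-file topics.json \
      --broker-list "1,2,3,4" --generate

    # Save the proposed assignment to expand.json, then apply it.
    bin/kafka-reassign-partitions.sh --bootstrap-server localhost:9092 \
      --reassignment-json-file expand.json --execute

    # Check whether the reassignment has completed.
    bin/kafka-reassign-partitions.sh --bootstrap-server localhost:9092 \
      --reassignment-json-file expand.json --verify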

OneCricketeer
  • I am using Kafka version 2.5.0 and running it in k8s. This is the error I get during a full Kafka cluster restart (Kafka + ZooKeeper): kafka.common.InconsistentClusterIdException. https://stackoverflow.com/questions/59592518/kafka-broker-doesnt-find-cluster-id-and-creates-new-one-after-docker-restart. Hence I rename the log directory path to recreate the meta.properties file with the latest ZooKeeper cluster id. – Rena76 May 25 '21 at 18:44
  • Doesn't this behave as a new cluster, since there won't be any topic data present in the log directory? – Rena76 May 25 '21 at 18:51
  • And when the producer sends messages to the new Kafka cluster, topics get created automatically (auto topic creation is enabled) and their partitions should be distributed across all available brokers. In my case, for some reason they go only to the first 3 brokers. – Rena76 May 25 '21 at 19:07
  • If you're using k8s, you need persistent volume claims and need to ensure the broker ids are consistent... Strimzi or Confluent's k8s resources should handle that already. If all topics only go to 3 brokers, then it sounds like the extra broker isn't actually healthy or part of the cluster. – OneCricketeer May 25 '21 at 21:10
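For illustration only, a minimal sketch of the kind of persistent storage that last comment refers to, assuming the brokers run as a Kubernetes StatefulSet (the claim name and size are made up):

    # volumeClaimTemplates section of a Kafka StatefulSet, so each broker
    # keeps the same log directory (and meta.properties) across restarts.
    volumeClaimTemplates:
      - metadata:
          name: kafka-logs
        spec:
          accessModes: ["ReadWriteOnce"]
          resources:
            requests:
              storage: 100Gi

With stable storage per broker, the log directory does not need to be recreated on restart, and the cluster id in meta.properties keeps matching ZooKeeper's.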