I have a Kafka cluster managed by Marathon/Mesos with 3 brokers running version 0.10.2.1. The Docker images are based on wurstmeister/kafka-docker. The brokers are configured with broker.id=-1, so ids are assigned automatically and sequentially at startup, and leaders are auto-rebalanced (auto.leader.rebalance.enable=true). Clients are on version 0.8.2.1.
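For reference, the relevant broker settings would look roughly like this in server.properties (a sketch; the two id/rebalance lines are confirmed by the setup above, and the ZooKeeper address is the one used throughout this post):

```properties
# Broker id is assigned automatically and sequentially at startup
broker.id=-1
# Periodically move leadership back to the preferred (first) replica
auto.leader.rebalance.enable=true
# ZooKeeper ensemble used in the commands below
zookeeper.connect=zookeeper.example.com:2181
```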
Broker registration in ZooKeeper:
➜ zkCli -server zookeeper.example.com:2181 ls /brokers/ids
[1106, 1105, 1104]
➜ zkCli -server zookeeper.example.com:2181 get /brokers/ids/1104
{"listener_security_protocol_map":{"PLAINTEXT":"PLAINTEXT"},
"endpoints":["PLAINTEXT://host1.mesos-slave.example.com:9092"],
"jmx_port":9999,"host":"host1.mesos-slave.example.com",
"timestamp":"1500987386409",
"port":9092,"version":4}
➜ zkCli -server zookeeper.example.com:2181 get /brokers/ids/1105
{"listener_security_protocol_map":{"PLAINTEXT":"PLAINTEXT"},
"endpoints":["PLAINTEXT://host2.mesos-slave.example.com:9092"],
"jmx_port":9999,"host":"host2.mesos-slave.example.com",
"timestamp":"1500987390304",
"port":9092,"version":4}
➜ zkCli -server zookeeper.example.com:2181 get /brokers/ids/1106
{"listener_security_protocol_map":{"PLAINTEXT":"PLAINTEXT"},
"endpoints":["PLAINTEXT://host3.mesos-slave.example.com:9092"],
"jmx_port":9999,"host":"host3.mesos-slave.example.com",
"timestamp":"1500987390447","port":9092,"version":4}
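Those registration znodes are plain JSON, so the advertised endpoints can be pulled out programmatically; a minimal Python sketch, using the broker 1106 payload copied from the output above:

```python
import json

# Broker registration payload as returned by `get /brokers/ids/1106` above
znode = '''{"listener_security_protocol_map":{"PLAINTEXT":"PLAINTEXT"},
"endpoints":["PLAINTEXT://host3.mesos-slave.example.com:9092"],
"jmx_port":9999,"host":"host3.mesos-slave.example.com",
"timestamp":"1500987390447","port":9092,"version":4}'''

info = json.loads(znode)
print(info["endpoints"])          # advertised listeners clients connect to
print(info["host"], info["port"])
```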
➜ bin/kafka-topics.sh --zookeeper zookeeper.example.com:2181 --create --topic test-topic --partitions 2 --replication-factor 2
Created topic "test-topic".
➜ bin/kafka-topics.sh --zookeeper zookeeper.example.com:2181 --describe --topic test-topic
Topic:test-topic PartitionCount:2 ReplicationFactor:2 Configs:
Topic: test-topic Partition: 0 Leader: 1106 Replicas: 1106,1104 Isr: 1106
Topic: test-topic Partition: 1 Leader: 1105 Replicas: 1104,1105 Isr: 1105
Consumers can consume everything the producers send:
➜ /opt/kafka_2.10-0.8.2.1/bin/kafka-console-producer.sh --broker-list 10.0.1.3:9092,10.0.1.1:9092 --topic test-topic
[2017-07-25 12:57:17,760] WARN Property topic is not valid (kafka.utils.VerifiableProperties)
hello 1
hello 2
hello 3
...
➜ /opt/kafka_2.10-0.8.2.1/bin/kafka-console-consumer.sh --zookeeper zookeeper.example.com:2181 --topic test-topic --from-beginning
hello 1
hello 2
hello 3
...
Then brokers 1104 and 1105 (host1 and host2) go down, and a new broker, 1107, is brought up on host1 manually through the Marathon interface (with broker.id=-1 it registers under a fresh id rather than reclaiming 1104).
➜ zkCli -server zookeeper.example.com:2181 ls /brokers/ids
[1107, 1106]
➜ zkCli -server zookeeper.example.com:2181 get /brokers/ids/1107
{"listener_security_protocol_map":{"PLAINTEXT":"PLAINTEXT"},
"endpoints":["PLAINTEXT://host1.mesos-slave.example.com:9092"],
"jmx_port":9999,"host":"host1.mesos-slave.example.com",
"timestamp":"1500991298225","port":9092,"version":4}
The consumer still receives messages from the producer, but the topic description looks out of date:
Topic:test-topic PartitionCount:2 ReplicationFactor:2 Configs:
Topic: test-topic Partition: 0 Leader: 1106 Replicas: 1106,1104 Isr: 1106
Topic: test-topic Partition: 1 Leader: 1105 Replicas: 1104,1105 Isr: 1105
I tried rebalancing with kafka-reassign-partitions.sh and kafka-preferred-replica-election.sh.
➜ cat all_partitions.json
{
"version":1,
"partitions":[
{"topic":"test-topic","partition":0,"replicas":[1106,1107]},
{"topic":"test-topic","partition":1,"replicas":[1107,1106]}
]
}
➜ bin/kafka-reassign-partitions.sh --zookeeper zookeeper.example.com:2181 --reassignment-json-file all_partitions.json --execute
➜ bin/kafka-reassign-partitions.sh --zookeeper zookeeper.example.com:2181 --reassignment-json-file all_partitions.json --verify
Status of partition reassignment:
Reassignment of partition [test-topic,0] completed successfully
Reassignment of partition [test-topic,1] is still in progress
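For clusters with more partitions, a plan file like all_partitions.json is easier to generate than to hand-write; a minimal Python sketch (the topic name and the surviving broker ids are the ones from this cluster):

```python
import json

# Surviving brokers after 1104/1105 went away
brokers = [1106, 1107]
topic = "test-topic"
partitions = 2
replication_factor = 2

plan = {
    "version": 1,
    "partitions": [
        {
            "topic": topic,
            "partition": p,
            # Rotate through the broker list so leadership is spread evenly
            "replicas": [brokers[(p + i) % len(brokers)]
                         for i in range(replication_factor)],
        }
        for p in range(partitions)
    ],
}
print(json.dumps(plan, indent=2))  # feed this to kafka-reassign-partitions.sh
```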
➜ cat all_leaders.json
{
"partitions":[
{"topic": "test-topic", "partition": 0},
{"topic": "test-topic", "partition": 1}
]
}
➜ bin/kafka-preferred-replica-election.sh --zookeeper zookeeper.example.com:2181 --path-to-json-file all_leaders.json
Created preferred replica election path with {"version":1,"partitions":[{"topic":"test-topic","partition":0},{"topic":"test-topic","partition":1}]}
Successfully started preferred replica election for partitions Set([test-topic,0], [test-topic,1])
The leader for partition 1 is still 1105, which doesn't make any sense:
➜ bin/kafka-topics.sh --zookeeper zookeeper.example.com:2181 --describe --topic test-topic
Topic:test-topic PartitionCount:2 ReplicationFactor:2 Configs:
Topic: test-topic Partition: 0 Leader: 1106 Replicas: 1106,1107 Isr: 1106,1107
Topic: test-topic Partition: 1 Leader: 1105 Replicas: 1107,1106,1104,1105 Isr: 1105
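The stale leader can also be seen directly in ZooKeeper: each partition has a state znode at /brokers/topics/&lt;topic&gt;/partitions/&lt;n&gt;/state recording the leader and ISR. A Python sketch of parsing it; the payload below is hypothetical, filled in to match the --describe output above (the epoch values are illustrative placeholders):

```python
import json

# Hypothetical contents of /brokers/topics/test-topic/partitions/1/state,
# consistent with the --describe output (leader 1105, ISR [1105]);
# controller_epoch and leader_epoch are made-up placeholders.
state = '{"controller_epoch":1,"leader":1105,"version":1,"leader_epoch":1,"isr":[1105]}'

info = json.loads(state)
print(info["leader"], info["isr"])  # still the dead broker
```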
Why does partition 1 think the leader is still 1105 although host2 is not alive?