We have an HDP 2.6.4 cluster managed by Ambari 2.6.1.
We have 3 Kafka brokers (version 0.10.1) and 3 ZooKeeper servers.
In /var/log/kafka/server.log we see many error messages like the one shown below.
In this case there are 6601 error lines containing:
This server is not the leader for that topic-partition
Example:
[2019-01-06 14:56:53,312] ERROR [ReplicaFetcherThread-0-1011], Error for partition [topic1-example,34] to broker 1011:org.apache.kafka.common.errors.NotLeaderForPartitionException: This server is not the leader for that topic-partition. (kafka.server.ReplicaFetcherThread)
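To see whether the 6601 errors are spread evenly or concentrated on a few partitions or brokers, the log lines can be grouped by topic, partition and target broker. The following is only a minimal sketch: it assumes every error line has the same layout as the example above, and the function name `count_not_leader_errors` is made up for illustration.

```python
import re
from collections import Counter

# Matches the "Error for partition [topic,partition] to broker N:" part of a
# ReplicaFetcherThread line that reports NotLeaderForPartitionException.
# Assumption: all error lines follow the layout of the example line above.
PATTERN = re.compile(
    r"Error for partition \[(?P<topic>[^,\]]+),(?P<partition>\d+)\] "
    r"to broker (?P<broker>\d+):\S*NotLeaderForPartitionException"
)


def count_not_leader_errors(lines):
    """Count errors per (topic, partition, broker) over an iterable of log lines."""
    counts = Counter()
    for line in lines:
        match = PATTERN.search(line)
        if match:
            key = (
                match.group("topic"),
                int(match.group("partition")),
                int(match.group("broker")),
            )
            counts[key] += 1
    return counts
```

It could be used as `with open("/var/log/kafka/server.log", errors="replace") as f: print(count_not_leader_errors(f).most_common(10))`. If most errors point at a single broker id, that broker is the first place to look.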
We checked the connectivity between the Kafka brokers and it seems to be fine (we also reviewed /var/log/messages and dmesg on the Kafka Linux machines).
We also suspect the connections between the ZooKeeper client on the Kafka brokers and the ZooKeeper servers,
but we do not know how to check the relationship between the client on the Kafka brokers and the ZooKeeper servers.
We also know that Kafka sends heartbeats to the ZooKeeper servers (I think the heartbeat interval is 2 seconds), but we are not sure whether this is the right direction to investigate what causes the leader to disappear.
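One way the ZooKeeper side could be checked from each broker host is with ZooKeeper's four-letter-word commands on the client port: `ruok` should be answered with `imok`, and `cons` lists the client connections and sessions the server currently holds, which should include the Kafka brokers. The sketch below is a rough illustration only. It assumes the default client port 2181 and that four-letter words are enabled on the servers (newer ZooKeeper releases require them to be whitelisted); the function names and the host names in the usage note are hypothetical.

```python
import socket


def zk_four_letter(host, port=2181, cmd="ruok", timeout=5.0):
    """Send a ZooKeeper four-letter command and return the raw text reply."""
    with socket.create_connection((host, port), timeout=timeout) as sock:
        sock.sendall(cmd.encode("ascii"))
        chunks = []
        # The server writes its reply and then closes the connection,
        # so read until recv() returns an empty byte string.
        while True:
            data = sock.recv(4096)
            if not data:
                break
            chunks.append(data)
    return b"".join(chunks).decode("utf-8", errors="replace")


def check_ensemble(hosts, port=2181, timeout=5.0):
    """Return {host: True/False} depending on whether each server answers 'imok'."""
    result = {}
    for host in hosts:
        try:
            result[host] = zk_four_letter(host, port, "ruok", timeout).strip() == "imok"
        except OSError:
            # Connection refused, timeout, DNS failure, etc.
            result[host] = False
    return result
```

Run from a broker machine, `check_ensemble(["zk1.example.com", "zk2.example.com", "zk3.example.com"])` shows which servers are reachable from that broker, and `print(zk_four_letter("zk1.example.com", cmd="cons"))` shows whether the broker's IP address appears among that server's sessions.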
Any ideas about why a Kafka broker would not be the leader for a topic partition?
Other related links:
kafka : one broker keeping print INFO log : "NOT_LEADER_FOR_PARTITION"