When and how often does the Kafka high-level producer elect a leader? Does it do so before sending each message, or only once when the connection is created?
-
Producers don't elect any leader -- there is no leader concept for them. Can you elaborate on what you mean? Brokers and consumers do have a leader concept, though. – Matthias J. Sax Mar 31 '17 at 22:48
-
A producer has to publish to a broker in the cluster. So how does the producer decide which Kafka instance it will push data to? – Marut Singh Apr 01 '17 at 02:06
-
It depends on what topic and partition you write data to. Each topic-partition has one broker that is the leader for it -- and writes happen only to the leader. Thus, potentially, a producer might write to all brokers in your cluster if all partitions it writes to are hosted on different brokers. – Matthias J. Sax Apr 01 '17 at 23:10
-
Thanks, Matthias. That brings back my original question: when is it decided that a certain broker is the leader for a topic/partition? Is it the responsibility of the producer? And does it get decided before each message is published, or at the time of creating a connection? – Marut Singh Apr 02 '17 at 02:55
-
On topic creation, brokers decide which broker will be the leader for each partition (it is completely unrelated to the producer -- you might want to update your question accordingly). I am not familiar with the details though. --- "does it get decided before each message is published" -> this does not make sense -- the leader for a partition is fixed and does not change (if you want to change the leader, you need to issue a manual admin command to tell Kafka to move a partition from one broker to another) – Matthias J. Sax Apr 02 '17 at 03:31
1 Answer
Every broker has information about the list of topics (and partitions) and their leaders, which ZooKeeper keeps up to date whenever a new leader is elected or the number of partitions changes.
Thus, when the producer makes a call to one of the brokers, the broker responds with this list. The producer caches it and uses it to connect to the leader. The next time it wants to send a message to that particular topic (and partition), it uses the cached information.
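To make the caching step concrete, here is a minimal toy model in plain Python. It is not the real Kafka client: `ToyBroker`, `ToyProducer`, `fetch_metadata` and `leader_for` are hypothetical names invented for this sketch, and the topic `orders` and the broker ids are made up. It only illustrates that one metadata response can serve many later leader lookups.

```python
# Toy model of producer-side metadata caching; NOT the real Kafka client.
# All class and method names here are hypothetical.

class ToyBroker:
    """Stands in for any broker that can answer a metadata request."""

    def __init__(self, leaders):
        # leaders maps (topic, partition) -> id of the leader broker
        self.leaders = dict(leaders)
        self.metadata_requests = 0

    def fetch_metadata(self):
        # Count requests so we can see how often the producer asks.
        self.metadata_requests += 1
        return dict(self.leaders)


class ToyProducer:
    def __init__(self, bootstrap_broker):
        self.bootstrap_broker = bootstrap_broker
        self.leader_cache = {}

    def leader_for(self, topic, partition):
        key = (topic, partition)
        if key not in self.leader_cache:
            # Cache miss: ask a broker for the full leader list and keep it.
            self.leader_cache = self.bootstrap_broker.fetch_metadata()
        return self.leader_cache[key]


broker = ToyBroker({("orders", 0): 1, ("orders", 1): 2})
producer = ToyProducer(broker)

first = producer.leader_for("orders", 0)    # triggers the metadata request
second = producer.leader_for("orders", 1)   # served from the cache
third = producer.leader_for("orders", 0)    # served from the cache

print(first, second, third, broker.metadata_requests)  # prints: 1 2 1 1
```

Three lookups cost a single metadata request here, which matches the point above: the leader is looked up from cached metadata, not chosen per message.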
Let's assume there is only one leader and no replicas for that topic/partition pair, and that leader has crashed. In this case the producer will try to connect to that leader and fail. It will then ask the other brokers in its cached list whether there is any leader for this topic. As it does not find one, it will try the same (dead) leader again, and after reaching the maximum number of retries it will throw an exception.
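The failure scenario above can be sketched as a bounded retry loop. Again this is a toy in plain Python under stated assumptions, not the real client: `send_with_retries`, `LeaderUnavailable`, `send_to_dead_broker` and `refresh_leader` are hypothetical names, and the retry limit of 3 is arbitrary (in the real producer the limit is configurable, for example through the `retries` setting).

```python
# Toy retry loop for a partition whose only leader is down.
# All names are hypothetical; this is not Kafka client code.

class LeaderUnavailable(Exception):
    pass


def send_with_retries(send, refresh_leader, leader, max_retries=3):
    """Call send(leader); on failure, refresh the leader and retry."""
    attempts = 0
    while True:
        try:
            return send(leader)
        except ConnectionError:
            if attempts == max_retries:
                raise LeaderUnavailable(f"gave up after {attempts} retries")
            attempts += 1
            # Ask the cluster again who the leader is before retrying.
            leader = refresh_leader()


attempted = []

def send_to_dead_broker(leader):
    attempted.append(leader)
    raise ConnectionError(f"broker {leader} is down")

def refresh_leader():
    # With no replicas, the metadata still names the dead broker.
    return 1

try:
    send_with_retries(send_to_dead_broker, refresh_leader, leader=1, max_retries=3)
    outcome = "sent"
except LeaderUnavailable as exc:
    outcome = str(exc)

print(attempted, outcome)  # prints: [1, 1, 1, 1] gave up after 3 retries
```

With no replica to take over, every refresh returns the same dead broker, so the loop makes one initial attempt plus three retries and then surfaces an exception to the caller.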
