I am learning the Kafka Producer API, and the tutorials say that "bootstrap.servers" is a mandatory property that specifies the currently running brokers (as a comma-separated list). Why does the producer have to provide the list of brokers? Why can't it provide the ZooKeeper address and port instead, and let ZooKeeper identify the brokers?
3 Answers
1
Below are some reasons:
- Increased latency: If clients went through ZooKeeper for read/write operations, ZooKeeper would sit between clients and brokers as a mediator, which adds latency.
- Increased resource requirements: In Kafka, the brokers do the heavy work of managing topic data. If ZooKeeper acted as a mediator, it would have to handle the heavy inflow and outflow of data, which would increase its CPU and memory requirements.

ankit deora
-
I'm not sure about either of these reasons. For example, can you prove Pulsar is slower than Kafka by nature of its tiered architecture? – OneCricketeer Jan 23 '20 at 14:48
0
As of Kafka 0.9, Zookeeper is no longer required for client connections. The property used to be called zookeeper.connect, and it did serve the purpose you describe: finding the brokers.
The list of brokers is stored and returned by the Kafka server designated as the Controller; it is not returned to clients by Zookeeper.
The eventual goal is to remove Zookeeper from the architecture.
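To make the configuration side concrete, here is a minimal sketch of a client-side setting that contains only broker addresses and no ZooKeeper address. The host names and the helper `parse_bootstrap_servers` are hypothetical and only for illustration; `bootstrap.servers` is the property name from the question.

```python
# Hypothetical client settings; the host names are placeholders.
producer_config = {
    "bootstrap.servers": "broker1.example.com:9092,broker2.example.com:9092",
}


def parse_bootstrap_servers(value):
    """Split a comma-separated host:port list into (host, port) tuples."""
    servers = []
    for entry in value.split(","):
        entry = entry.strip()
        if not entry:
            continue  # tolerate stray commas
        host, sep, port = entry.rpartition(":")
        if not sep or not host or not port.isdigit():
            raise ValueError(f"expected host:port, got {entry!r}")
        servers.append((host, int(port)))
    if not servers:
        raise ValueError("bootstrap.servers must list at least one broker")
    return servers


servers = parse_bootstrap_servers(producer_config["bootstrap.servers"])
print(servers)
```

Note that nothing in this configuration refers to ZooKeeper; the client only ever needs to know how to reach at least one broker.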

OneCricketeer
-
Are you sure about the part "list is not stored in Zookeeper"? When you browse ZooKeeper using the CLI, you can see a list of brokers there (Kafka ~2.3). What is more, ZooKeeper helps choose the Controller, so it needs to know the brokers. – mmatloka Jan 23 '20 at 14:06
-
/brokers/ids is stored in Zookeeper, yes. The actual leader election RPC calls go through the brokers after quorum is reached by Zookeeper on a controller – OneCricketeer Jan 23 '20 at 14:46
0
The producer does not need to list all of the brokers. It only needs to provide a subset; the client then gets the whole cluster topology from one of the servers (the active controller).
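The idea can be sketched with a toy model. This is not the Kafka wire protocol or any real client library: `ALL_BROKERS`, `fetch_metadata`, and `discover_cluster` are hypothetical names, and the sketch simply assumes that any reachable bootstrap server can answer with the full broker list.

```python
# Toy model of bootstrap discovery; all names and addresses are hypothetical.
ALL_BROKERS = ["broker1:9092", "broker2:9092", "broker3:9092"]


def fetch_metadata(address, down=()):
    """Pretend metadata request: a reachable broker returns every broker."""
    if address not in ALL_BROKERS or address in down:
        raise ConnectionError(f"cannot reach {address}")
    return list(ALL_BROKERS)


def discover_cluster(bootstrap_servers, down=()):
    """Try each bootstrap address in order until one answers."""
    for address in bootstrap_servers:
        try:
            return fetch_metadata(address, down)
        except ConnectionError:
            continue  # fall through to the next bootstrap server
    raise ConnectionError("no bootstrap server reachable")


# Only two of three brokers are configured, and the first one is down.
brokers = discover_cluster(
    ["broker1:9092", "broker2:9092"], down={"broker1:9092"}
)
print(brokers)
```

In this model the client still learns about broker3 even though it was never configured, and listing more than one bootstrap address gives it a fallback when the first one is unreachable.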

OneCricketeer

mmatloka