I am trying to implement the GitHub project (https://github.com/tomatoTomahto/CDH-Sensor-Analytics) on our internal Hadoop cluster via Cloudera Data Science Workbench.
When I run the project in Cloudera Data Science Workbench, I get a "NoBrokersAvailable" error while connecting to Kafka through the kafka-python API, KafkaProducer(bootstrap_servers='broker1:9092'). The code is in https://github.com/tomatoTomahto/CDH-Sensor-Analytics/blob/master/datagenerator/KafkaConnection.py.
I have authenticated using Kerberos. I have also tried specifying the broker without the port number, and passing the brokers as a list, but nothing has worked so far.
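One thing that may be worth checking here: kafka-python's KafkaProducer defaults to security_protocol='PLAINTEXT', so a Kerberos ticket on its own is not picked up by the call above. The sketch below shows how the SASL/GSSAPI settings could be passed. It is only a sketch under assumptions not stated in the post: that the brokers expose a SASL_PLAINTEXT listener, that the Kerberos service name is "kafka", and that the Python gssapi package is installed. The helper names kerberos_producer_config and make_producer are hypothetical.

```python
def kerberos_producer_config(brokers, service_name="kafka"):
    """Build kafka-python keyword arguments for a Kerberos-secured cluster.

    `brokers` may be a comma-separated string or a list of host[:port] entries.
    """
    if isinstance(brokers, str):
        brokers = [b.strip() for b in brokers.split(",") if b.strip()]
    return {
        "bootstrap_servers": list(brokers),
        # Assumption: the brokers listen on SASL_PLAINTEXT; use "SASL_SSL" if TLS is enabled.
        "security_protocol": "SASL_PLAINTEXT",
        "sasl_mechanism": "GSSAPI",
        # Assumption: the broker principal uses the common service name "kafka".
        "sasl_kerberos_service_name": service_name,
    }


def make_producer(brokers):
    # Imported lazily so the config helper can be inspected without kafka-python installed.
    from kafka import KafkaProducer

    return KafkaProducer(**kerberos_producer_config(brokers))
```

Calling make_producer("broker1:9092") would then attempt the connection with these settings; the listener port for SASL may differ from 9092 on a given cluster.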
Below is the stack trace.
NoBrokersAvailable: NoBrokersAvailable
NoBrokersAvailable                        Traceback (most recent call last)
in engine
----> 1 dgen = DataGenerator(config)
/home/cdsw/datagenerator/DataGenerator.py in __init__(self, config)
     39
     40         self._kudu = KuduConnection(self._config['kudu_master'], self._config['kudu_port'], spark)
---> 41         self._kafka = KafkaConnection(self._config['kafka_brokers'], self._config['kafka_topic'])
     42
     43         #self._kafka
/home/cdsw/datagenerator/KafkaConnection.py in __init__(self, brokers, topic)
      4 class KafkaConnection():
      5     def __init__(self, brokers, topic):
----> 6         self._kafka_producer = KafkaProducer(bootstrap_servers=brokers)
      7         self._topic = topic
      8
/home/cdsw/.local/lib/python3.6/site-packages/kafka/producer/kafka.py in __init__(self, **configs)
    333
    334         client = KafkaClient(metrics=self._metrics, metric_group_prefix='producer',
--> 335                              **self.config)
    336
    337         # Get auto-discovered version from client if necessary
/home/cdsw/.local/lib/python3.6/site-packages/kafka/client_async.py in __init__(self, **configs)
    208         if self.config['api_version'] is None:
    209             check_timeout = self.config['api_version_auto_timeout_ms'] / 1000
--> 210             self.config['api_version'] = self.check_version(timeout=check_timeout)
    211
    212     def _bootstrap(self, hosts):
/home/cdsw/.local/lib/python3.6/site-packages/kafka/client_async.py in check_version(self, node_id, timeout, strict)
    806             try_node = node_id or self.least_loaded_node()
    807             if try_node is None:
--> 808                 raise Errors.NoBrokersAvailable()
    809             self._maybe_connect(try_node)
    810             conn = self._conns[try_node]
NoBrokersAvailable: NoBrokersAvailable
I also tried connecting from outside the workbench, through the CLI over a VPN connection, and got the same error.
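Since the error is the same from both places, a plain TCP check may help separate a network or DNS problem from a client configuration problem. The snippet below is a minimal sketch using only the standard library; the function name can_reach is hypothetical, and 'broker1' and 9092 are the placeholder values from the KafkaProducer call above.

```python
import socket


def can_reach(host, port, timeout=5.0):
    """Return True if a TCP connection to host:port opens within `timeout` seconds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers DNS failures, refused connections, and timeouts.
        return False


# Example call (placeholder host and port from the post):
# can_reach("broker1", 9092)
```

If this returns False, the broker address or port is probably not reachable from that machine. If it returns True, the socket is open and the problem is more likely in the client settings, such as the security protocol.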
Any pointers on what I am missing? Thanks in advance!