
I am trying to run the GitHub project (https://github.com/tomatoTomahto/CDH-Sensor-Analytics) on our internal Hadoop cluster via Cloudera Data Science Workbench.

When I run the project in Cloudera Data Science Workbench, I get the error "No Brokers available" (NoBrokersAvailable) when connecting to Kafka through the Python API call KafkaProducer(bootstrap_servers='broker1:9092'). The code can be found at https://github.com/tomatoTomahto/CDH-Sensor-Analytics/blob/master/datagenerator/KafkaConnection.py.

I have authenticated using Kerberos. I have also tried passing the broker host without a port number, and passing the brokers as a list, but nothing has worked so far.

Below is the stack trace.

NoBrokersAvailable: NoBrokersAvailable
NoBrokersAvailable                        Traceback (most recent call last)
in engine
----> 1 dgen = DataGenerator(config)

/home/cdsw/datagenerator/DataGenerator.py in __init__(self, config)
     39
     40         self._kudu = KuduConnection(self._config['kudu_master'], self._config['kudu_port'], spark)
---> 41         self._kafka = KafkaConnection(self._config['kafka_brokers'], self._config['kafka_topic'])
     42
     43         #self._kafka

/home/cdsw/datagenerator/KafkaConnection.py in __init__(self, brokers, topic)
      4 class KafkaConnection():
      5   def __init__(self, brokers, topic):
----> 6     self._kafka_producer = KafkaProducer(bootstrap_servers=brokers)
      7     self._topic = topic
      8

/home/cdsw/.local/lib/python3.6/site-packages/kafka/producer/kafka.py in __init__(self, **configs)
    333
    334         client = KafkaClient(metrics=self._metrics, metric_group_prefix='producer',
--> 335                              **self.config)
    336
    337         # Get auto-discovered version from client if necessary

/home/cdsw/.local/lib/python3.6/site-packages/kafka/client_async.py in __init__(self, **configs)
    208         if self.config['api_version'] is None:
    209             check_timeout = self.config['api_version_auto_timeout_ms'] / 1000
--> 210             self.config['api_version'] = self.check_version(timeout=check_timeout)
    211
    212     def _bootstrap(self, hosts):

/home/cdsw/.local/lib/python3.6/site-packages/kafka/client_async.py in check_version(self, node_id, timeout, strict)
    806             try_node = node_id or self.least_loaded_node()
    807             if try_node is None:
--> 808                 raise Errors.NoBrokersAvailable()
    809             self._maybe_connect(try_node)
    810             conn = self._conns[try_node]

NoBrokersAvailable: NoBrokersAvailable

I also tried connecting from outside the Workbench, using the CLI over a VPN connection, and got the same error.

Any pointers on what I am missing? Thanks in advance!

Robin Moffatt
Sameer

1 Answer


The first step is to establish whether the network route is open and the broker is up and listening on that port. Once that is confirmed, you can move on to checking authentication and so on.

Did you try telnet <broker host> 9092?
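If telnet is not installed in the Workbench session, the same reachability check can be sketched in Python with only the standard library. This is a minimal sketch, not part of the project: the function name broker_reachable is hypothetical, and the host and port in the usage comment are the ones from the question. It only tells you whether a TCP connection can be opened; it says nothing about whether Kafka will accept the client's protocol or credentials.

```python
import socket


def broker_reachable(host, port=9092, timeout=5.0):
    """Return True if a plain TCP connection to host:port can be opened.

    Hypothetical helper: it checks the network route and the listening
    socket only, not the Kafka protocol or authentication.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and DNS resolution failures.
        return False


# Example usage (host name taken from the question):
# print(broker_reachable("broker1", 9092))
```

If this returns False from inside the Workbench but True from a cluster node, the problem is most likely routing or a firewall rather than Kafka itself.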

You may need to explicitly set advertised.listeners in addition to listeners. I've occasionally seen a weird quirk with Java where it wasn't binding to the expected network interface (or at least the one I expected!) and I had to force it using advertised.listeners.
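For the authentication step mentioned above: by default kafka-python connects with security_protocol PLAINTEXT, so on a Kerberized cluster the client probably needs to be told to use SASL with GSSAPI. The sketch below only builds the keyword arguments. The helper name kerberos_producer_config is hypothetical, and the service name "kafka" and the choice between SASL_PLAINTEXT and SASL_SSL are assumptions that have to match your broker configuration. It also assumes the gssapi Python package is installed and that a valid Kerberos ticket exists in the session.

```python
def kerberos_producer_config(brokers, service_name="kafka", use_tls=False):
    """Build KafkaProducer keyword arguments for a Kerberos (GSSAPI) listener.

    Hypothetical helper; the values are assumptions and must match the
    listener configured on the brokers.
    """
    return {
        "bootstrap_servers": brokers,
        # SASL_SSL if the listener also uses TLS, otherwise SASL_PLAINTEXT.
        "security_protocol": "SASL_SSL" if use_tls else "SASL_PLAINTEXT",
        "sasl_mechanism": "GSSAPI",
        # Must equal the service part of the broker's Kerberos principal.
        "sasl_kerberos_service_name": service_name,
    }


# Example usage (requires kafka-python and a reachable broker):
# from kafka import KafkaProducer
# producer = KafkaProducer(**kerberos_producer_config(["broker1:9092"]))
```

Note that the port for the SASL listener may differ from 9092, so check the listeners setting on the broker before reusing the port from the question.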

Jeff Widman
  • I had the same error in a slightly different context, using the kafka-python library. Updating server.properties from listeners to advertised.listeners worked. This article gives details on the differences: https://rmoff.net/2018/08/02/kafka-listeners-explained/ – Mark Parris May 02 '20 at 22:38