I have a Cloudera cluster with a clustered Kafka service. There are two Kafka broker instances that can act as controller, let's say C1 and C2.
When C1 is the active controller, everything seems to work fine. When, for some reason, C2 becomes the active controller, some of the messages sent via kafka-console-producer are not received by kafka-console-consumer (exactly half of them, one out of every two). I'm not sure whether this is caused by the change of active controller and one of the partitions becoming unreadable.
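To check whether this is a partition-placement problem, here is the kind of inspection I understand should reveal it; this is a sketch, assuming the topic is named testD and ZooKeeper runs on a host called node1 (replace both with your actual values):

```shell
# Describe the topic to see, for each partition, its leader broker and replica list.
# With replication-factor 1 and 2 partitions, each partition lives on exactly one
# broker; if that broker is unreachable, its half of the messages is lost to consumers.
kafka-topics --describe --topic testD --zookeeper node1:2181
```

If the output shows `ReplicationFactor: 1` and the two partitions led by different brokers, losing one broker would explain exactly one message out of every two disappearing (with the default round-robin partitioner).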
Also, I have another issue if I shut down C1 and keep only C2 up. If I try to start a previously working StreamSets pipeline reading from Kafka, I get the error message "Cannot retrieve metadata for topic XXXX". It seems the topic metadata is only available on C1, which is offline in this scenario.
If I open a kafka-console-consumer after shutting down the first broker, I get the following exception:
WARN [console-consumer-16627_node10.agatha-cluster-1515508696963-2e45e6d8-leader-finder-thread]:
Failed to find leader for Set(testD-1, testD-0)
(kafka.consumer.ConsumerFetcherManager$LeaderFinderThread)
kafka.common.KafkaException: fetching topic metadata for topics [Set(testD)]
from broker [ArrayBuffer(BrokerEndPoint(183,110.250.17.242,9092))] failed
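The exception above shows the consumer fetching metadata from a single broker endpoint. For reference, here is a sketch of what I understand should make the setup survive one broker going down; the hostnames c1/c2 and the topic name testD are assumptions, not my real values:

```shell
# Create the topic with replication-factor 2 so every partition has a replica
# on each broker and keeps a leader when one broker dies.
kafka-topics --create --topic testD --partitions 2 --replication-factor 2 --zookeeper c1:2181

# List both brokers in the bootstrap list so topic metadata can still be
# fetched when either one of them is offline.
kafka-console-consumer --bootstrap-server c1:9092,c2:9092 --topic testD --from-beginning
```

For an existing topic, the replication factor can also be raised with kafka-reassign-partitions and a JSON reassignment plan instead of recreating the topic.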
What am I doing wrong when trying to use Kafka with several brokers?