
I am trying to migrate from spark-streaming-kafka-0.8 to spark-streaming-kafka-0.10 and have run into the following error: KafkaConsumer is not safe for multi-threaded access

We have multiple Kafka clusters in different DCs that I want to consume from simultaneously in a Scala Spark Streaming application. With version 0.8 this worked properly: we simply called createDirectStream multiple times, once for each cluster. After upgrading to 0.10 it stopped working.
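To make the setup concrete, here is a minimal sketch of the pattern described above: one createDirectStream call per cluster, each with its own kafkaParams. The `Cluster` case class, the `MultiClusterSketch` object, the host names, group ids, topic name and batch interval are all placeholders made up for illustration, not our real configuration, and the deserializers are assumed to be plain strings.

```scala
import org.apache.kafka.clients.consumer.ConsumerRecord
import org.apache.spark.SparkConf
import org.apache.spark.streaming.{Seconds, StreamingContext}
import org.apache.spark.streaming.dstream.InputDStream
import org.apache.spark.streaming.kafka010.{ConsumerStrategies, KafkaUtils, LocationStrategies}

// Hypothetical description of one Kafka cluster; every value is a placeholder.
final case class Cluster(name: String, bootstrapServers: String, groupId: String, topics: Seq[String])

object MultiClusterSketch {
  private val StringDeserializerClass =
    "org.apache.kafka.common.serialization.StringDeserializer"

  // Builds the consumer configuration for a single cluster.
  def kafkaParamsFor(c: Cluster): Map[String, Object] = Map[String, Object](
    "bootstrap.servers"  -> c.bootstrapServers,
    "group.id"           -> c.groupId,
    "key.deserializer"   -> StringDeserializerClass,
    "value.deserializer" -> StringDeserializerClass,
    "enable.auto.commit" -> java.lang.Boolean.FALSE
  )

  // One direct stream per cluster, as we did with the 0.8 integration.
  def streamFor(ssc: StreamingContext, c: Cluster): InputDStream[ConsumerRecord[String, String]] =
    KafkaUtils.createDirectStream[String, String](
      ssc,
      LocationStrategies.PreferConsistent,
      ConsumerStrategies.Subscribe[String, String](c.topics, kafkaParamsFor(c))
    )

  def main(args: Array[String]): Unit = {
    val conf = new SparkConf().setAppName("multi-cluster-sketch")
    val ssc  = new StreamingContext(conf, Seconds(10))

    // Placeholder clusters, one per DC.
    val clusters = Seq(
      Cluster("dc1", "dc1-kafka:9092", "app-dc1", Seq("events")),
      Cluster("dc2", "dc2-kafka:9092", "app-dc2", Seq("events"))
    )

    // createDirectStream is called once per cluster.
    val streams = clusters.map(c => streamFor(ssc, c))
    streams.foreach(stream => stream.map(record => record.value()).print())

    ssc.start()
    ssc.awaitTermination()
  }
}
```

This only shows the shape of the application. The sketch gives each cluster its own group id, which is an assumption for illustration and not a confirmed fix for the error.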

The only relevant answer I found is "KafkaConsumer is not safe for multi-threaded access from SparkStreaming", but it deals with consuming multiple topics from the same cluster. Currently it is impossible to specify multiple clusters in a single call to createDirectStream, and calling it multiple times leads to the error above.

My question is: is there any way to consume data with spark-streaming-kafka-0.10 from multiple clusters?

Ruslan Ostafiichuk
