Spark Streaming integration for Kafka. The Direct Stream approach provides simple parallelism, a 1:1 correspondence between Kafka partitions and Spark partitions, and access to offsets and metadata.
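As a quick orientation for the questions below, a minimal direct-stream sketch using the spark-streaming-kafka-0-10 integration; the broker address, group id, and topic name are placeholders:

```scala
import org.apache.kafka.common.serialization.StringDeserializer
import org.apache.spark.SparkConf
import org.apache.spark.streaming.{Seconds, StreamingContext}
import org.apache.spark.streaming.kafka010._
import org.apache.spark.streaming.kafka010.LocationStrategies.PreferConsistent
import org.apache.spark.streaming.kafka010.ConsumerStrategies.Subscribe

object DirectStreamSketch {
  def main(args: Array[String]): Unit = {
    val conf = new SparkConf().setAppName("direct-stream-sketch").setMaster("local[*]")
    val ssc  = new StreamingContext(conf, Seconds(10))

    // Consumer settings; bootstrap.servers and group.id are placeholders.
    val kafkaParams = Map[String, Object](
      "bootstrap.servers" -> "localhost:9092",
      "key.deserializer"   -> classOf[StringDeserializer],
      "value.deserializer" -> classOf[StringDeserializer],
      "group.id"           -> "example-group",
      "auto.offset.reset"  -> "latest",
      "enable.auto.commit" -> (false: java.lang.Boolean)
    )

    // One Spark partition per Kafka partition; per-batch offsets are exposed.
    val stream = KafkaUtils.createDirectStream[String, String](
      ssc, PreferConsistent, Subscribe[String, String](Seq("my-topic"), kafkaParams))

    stream.foreachRDD { rdd =>
      val offsetRanges = rdd.asInstanceOf[HasOffsetRanges].offsetRanges
      offsetRanges.foreach(o =>
        println(s"${o.topic} ${o.partition} ${o.fromOffset} -> ${o.untilOffset}"))
    }

    ssc.start()
    ssc.awaitTermination()
  }
}
```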
Questions tagged [spark-streaming-kafka]
250 questions
0 votes · 1 answer
Kafka Spark Streaming integration
I have set up Kafka and Spark Streaming using Maven on my system. I would like suggestions that could help me do broader operations beyond typing something in the producer and seeing it in the consumer.
How can I create a source that…
user3752667
0 votes · 1 answer
Spark Streaming + Kafka compatibility issue
Will Spark Streaming be compatible with Kafka versions above 0.8.2.1?
Is writing a custom receiver the only option to make Spark Streaming use a Kafka version above 0.9?

pavan k · 11 · 1
-1 votes · 1 answer
import org.apache.spark.streaming.kafka._ Cannot resolve symbol kafka
I have created a Spark application to integrate with Kafka and get a stream of data from it.
But when I try to import org.apache.spark.streaming.kafka._, an error occurs: Cannot resolve symbol kafka. What should I do to import this…

Vishal Barvaliya · 197 · 1 · 3 · 14
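The usual cause of "Cannot resolve symbol kafka" is that the Kafka integration artifact is missing from the build. A hedged build.sbt sketch — the artifact names are real, but the version numbers are examples and must match your Spark and Scala versions:

```scala
// build.sbt (sketch) — pin versions to your cluster
libraryDependencies ++= Seq(
  "org.apache.spark" %% "spark-streaming" % "2.4.8" % "provided",
  // Provides org.apache.spark.streaming.kafka010._ ; the older artifact
  // "spark-streaming-kafka-0-8" is what provides the un-suffixed
  // org.apache.spark.streaming.kafka._ package used in the question.
  "org.apache.spark" %% "spark-streaming-kafka-0-10" % "2.4.8"
)
```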
-1 votes · 1 answer
repartition in Spark Streaming is taking more time?
I am running a Spark application where data comes in every 1 minute. The number of repartitions I am doing is 48. It is running on 12 executors with 4 GB executor memory and executor-cores=4.
Below are the streaming batch processing times.
Here we can see…

Deepank Porwal · 73 · 1 · 8
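A note that may explain the timings: repartition shuffles every record of every batch over the network. A sketch of the trade-off, where rdd stands in for a batch RDD:

```scala
// repartition(48) forces a full shuffle each batch, so its cost grows with
// batch size. With the direct stream you already get one Spark partition per
// Kafka partition, so if steady 48-way parallelism is the goal, adding Kafka
// partitions is often cheaper than reshuffling in Spark.
val reshaped = rdd.repartition(48) // wide dependency: full shuffle
val narrowed = rdd.coalesce(12)    // narrow dependency: no shuffle (only reduces)
```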
-1 votes · 1 answer
overloaded method value createDirectStream with alternatives
My Spark version is 1.6.2 and my Kafka version is 0.10.1.0. I want to send a custom object as the Kafka value type, push this custom object into a Kafka topic, and use Spark Streaming to read the data. I'm using Direct…

ZHEN BIAN · 13 · 1 · 6
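With the Spark 1.6 (Kafka 0.8-style) API, the "overloaded method value createDirectStream with alternatives" error usually means the four type parameters and the value decoder don't line up. A hedged sketch, where Event, its wire format, and EventDecoder are hypothetical:

```scala
import kafka.serializer.{Decoder, StringDecoder}
import kafka.utils.VerifiableProperties

// Hypothetical payload type; substitute your own class and encoding.
case class Event(id: Int, body: String)

// The 0.8 direct stream expects a kafka.serializer.Decoder for non-String
// values; the single-VerifiableProperties constructor is what KafkaUtils
// instantiates via reflection. The parsing below is a placeholder.
class EventDecoder(props: VerifiableProperties = null) extends Decoder[Event] {
  override def fromBytes(bytes: Array[Byte]): Event = {
    val Array(id, body) = new String(bytes, "UTF-8").split(",", 2)
    Event(id.toInt, body)
  }
}

// The overload resolves only when all four type parameters agree:
// val stream = KafkaUtils.createDirectStream[String, Event, StringDecoder, EventDecoder](
//   ssc, kafkaParams, Set("my-topic"))
```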
-1 votes · 1 answer
Spark Structured Streaming - Streaming data joined with static data which will be refreshed every 5 mins
For a Spark Structured Streaming job, one input comes from a Kafka topic while the second input is a file (which will be refreshed every 5 mins by a Python API). I need to join these two inputs and write to a Kafka topic.
The issue I am facing is when…

pradeep reddy b · 3 · 1 · 2
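One common pattern when the static side changes on disk is to re-read it inside foreachBatch (available from Spark 2.4) so each micro-batch sees the refreshed file. A sketch with placeholder paths, topic names, and join key:

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}

val spark = SparkSession.builder.appName("stream-static-join").getOrCreate()

val streamDF = spark.readStream.format("kafka")
  .option("kafka.bootstrap.servers", "localhost:9092")
  .option("subscribe", "input-topic")
  .load()

streamDF.writeStream.foreachBatch { (batch: DataFrame, _: Long) =>
  // Re-reading per batch picks up the file the external job rewrites
  // every 5 minutes; a DataFrame loaded once at startup would not refresh.
  val staticDF = spark.read.json("/data/reference.json") // placeholder path
  batch.selectExpr("CAST(value AS STRING) AS key")       // placeholder key extraction
    .join(staticDF, "key")
    .selectExpr("key", "to_json(struct(*)) AS value")
    .write.format("kafka")
    .option("kafka.bootstrap.servers", "localhost:9092")
    .option("topic", "output-topic")
    .save()
}.start().awaitTermination()
```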
-1 votes · 1 answer
Failed to find leader for topics; java.lang.NullPointerException NullPointerException at org.apache.kafka.common.utils.Utils.formatAddress
When we try to stream data from an SSL-enabled Kafka topic, we face the error below. Can you please help us with this issue?
19/11/07 13:26:54 INFO ConsumerFetcherManager: [ConsumerFetcherManager-1573151189884] Added fetcher for partitions…

Karthikeyan Rasipalayam Durai · 109 · 4 · 10
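For reference, SSL-enabled consumers need roughly the properties sketched below (paths and passwords are placeholders). Note also that the ConsumerFetcherManager line in the log comes from the old Kafka 0.8 consumer, which predates SSL support (added in Kafka 0.9), so moving to the 0-10 integration is likely part of the fix:

```scala
// SSL-related consumer properties (sketch; all values are placeholders).
// The NullPointerException in Utils.formatAddress can also indicate a
// missing or malformed broker list, so double-check bootstrap.servers.
val kafkaParams = Map[String, Object](
  "bootstrap.servers"        -> "broker1:9093,broker2:9093",
  "security.protocol"        -> "SSL",
  "ssl.truststore.location"  -> "/etc/kafka/client.truststore.jks",
  "ssl.truststore.password"  -> "changeit",
  "ssl.keystore.location"    -> "/etc/kafka/client.keystore.jks",
  "ssl.keystore.password"    -> "changeit"
)
```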
-1 votes · 1 answer
DStreams: Variable created within foreachRDD and then modified inside foreachPartition is reset once outside of foreachPartition?
I have a bunch of messages in Kafka and am using Spark Streaming to process them.
I am trying to catch when my code fails to insert into my DB, and then take those messages and insert them back into Kafka so I can process them later.
To…

alex · 1,905 · 26 · 51
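The "reset" is expected: the closure passed to foreachPartition is serialized to the executors, so mutations happen on executor-side copies and the driver's variable never changes. One way to collect executor-side failures is an accumulator; a sketch where insertIntoDb and republishToKafka are hypothetical helpers:

```scala
import scala.collection.JavaConverters._

// Driver-side accumulator; executor tasks add to it and Spark merges the
// results back on task completion. (Task retries can add duplicates, so
// treat the republish as at-least-once.)
val failed = ssc.sparkContext.collectionAccumulator[String]("failedMessages")

stream.foreachRDD { rdd =>
  rdd.foreachPartition { records =>
    records.foreach { msg =>
      try insertIntoDb(msg)                         // hypothetical DB insert
      catch { case _: Exception => failed.add(msg) }
    }
  }
  // Back on the driver, after the action above, the merged failures
  // are visible and can be pushed back into Kafka.
  failed.value.asScala.foreach(republishToKafka)    // hypothetical producer call
  failed.reset()
}
```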
-2 votes · 2 answers
Kafka Spark Streaming
I was trying to build a Kafka and Spark Streaming use case, in which Spark Streaming consumes a stream from Kafka, and we enhance the stream and store the enhanced stream in some target system.
My question here is: does it make sense to run…

Ankit Tripathi · 325 · 2 · 12
-3 votes · 1 answer
Creating an RDD from ConsumerRecord Value in Spark Streaming
I am trying to create an XmlRelation based on a ConsumerRecord value.
val value = record.value()
logger.info(".processRecord() : Value = {}", value)
if (value != null) {
  val rdd = spark.sparkContext.parallelize(List(new…

Sateesh K · 1,071 · 3 · 19 · 45
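The snippet in the question can be completed roughly as below (record, spark, and logger are the question's own placeholders). One caveat worth flagging: parallelize runs on the driver, so creating an RDD per ConsumerRecord is expensive, and mapping over the stream's batch RDDs is usually the cheaper shape:

```scala
// Sketch: a one-element RDD from a single record value.
val value: String = record.value()
logger.info(".processRecord() : Value = {}", value)
if (value != null) {
  val rdd = spark.sparkContext.parallelize(List(value))
  // Hand rdd to the XML machinery, e.g. spark-xml, to build the
  // XmlRelation / DataFrame (the exact entry point depends on the
  // spark-xml version in use).
}
```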