
I have a list of strings, but I can't find a way to convert the list to a DStream in Spark Streaming. I tried this:

val tmpList = List("hi", "hello")    
val rdd = sqlContext.sparkContext.parallelize(Seq(tmpList))   
val rowRdd = rdd.map(v => Row(v: _*))

But Eclipse says sparkContext is not a member of sqlContext. How can I do this? I'd appreciate your help.

Shaido
pauly

1 Answer


A DStream is a sequence of RDDs, and it is normally created when you register a receiver for a streaming source such as Kafka. For testing, if you want to create a DStream from a list of RDDs, you can do so as follows:

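import scala.collection.mutable
// Pass each list directly (not wrapped in Seq) so every RDD is an RDD[String]; tmpList1 is a second List[String]
val rdd1 = sqlContext.sparkContext.parallelize(tmpList)
val rdd2 = sqlContext.sparkContext.parallelize(tmpList1)
ssc.queueStream[String](mutable.Queue(rdd1, rdd2))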
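
For local testing, the same idea can be written as a small standalone program. This is only a sketch under a few assumptions: the object name `ListToDStream`, the `local[2]` master, the one-second batch interval, and the contents of `tmpList1` are all made up for illustration. It takes the SparkContext from `ssc.sparkContext` instead of constructing a second one, since a StreamingContext built from a SparkConf already creates its own SparkContext.

```scala
import scala.collection.mutable

import org.apache.spark.SparkConf
import org.apache.spark.rdd.RDD
import org.apache.spark.streaming.{Seconds, StreamingContext}

object ListToDStream {
  def main(args: Array[String]): Unit = {
    // Assumption: local mode with two threads, suitable for testing only.
    val conf = new SparkConf().setMaster("local[2]").setAppName("ListToDStream")
    val ssc = new StreamingContext(conf, Seconds(1))

    val tmpList = List("hi", "hello")
    val tmpList1 = List("foo", "bar") // hypothetical second list

    // Reuse the SparkContext that the StreamingContext already owns.
    // Passing the list itself (not Seq(list)) yields an RDD[String].
    val rdd1: RDD[String] = ssc.sparkContext.parallelize(tmpList)
    val rdd2: RDD[String] = ssc.sparkContext.parallelize(tmpList1)

    // With the default oneAtATime = true, each batch interval
    // dequeues one RDD and emits it as that batch of the DStream.
    val stream = ssc.queueStream(mutable.Queue(rdd1, rdd2))
    stream.print()

    ssc.start()
    // Run for a few seconds, then shut down so the test terminates.
    ssc.awaitTerminationOrTimeout(5000)
    ssc.stop(stopSparkContext = true, stopGracefully = false)
  }
}
```

If a SparkContext already exists elsewhere in the application, the StreamingContext can instead be built from it with `new StreamingContext(sc, Seconds(1))`, which likewise avoids having two SparkContexts in one JVM.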

Hope it answers your question.

Shaido
Sachin Janani
  • Thanks for your answer. I am pretty new to Spark and do not really understand it. You said to create a DStream from a list of RDDs, but how can I get a list of RDDs from a list of strings? I am not sure the code I have written in the question is right. – pauly Oct 27 '16 at 06:11
  • Thank you, I rewrote the code: val sparkContext = new SparkContext(sparkConf); val rdd = sparkContext.parallelize(coutList); val resultInputStream = ssc.queueStream(scala.collection.mutable.Queue(rdd)); val results = resultInputStream.map(x => x). Is sqlContext an object of class org.apache.spark.sql.SQLContext? And is the code I have written right? – pauly Oct 28 '16 at 02:50
  • Hi, ssc is a StreamingContext: val ssc = new StreamingContext(sparkConf, Seconds(10)). When I add val sparkContext = new SparkContext(sparkConf), a SparkException says that only one SparkContext may be running in this JVM, so there may be a conflict between sparkContext and ssc. Do you know why? – pauly Oct 28 '16 at 12:23