
I'm trying to print a simple DStream, without success. See the code below. I'm using a Databricks notebook on Azure.

import org.apache.spark.streaming.{ StreamingContext, Seconds }
val ssc = new StreamingContext(sc, batchDuration = Seconds(5))

ssc.checkpoint(".")

val rdd = sc.parallelize(0 to 3)
import org.apache.spark.streaming.dstream.ConstantInputDStream
val stream = new ConstantInputDStream(ssc, rdd)

println("start")

stream.print()

ssc.start()

The output is:

start

warning: there was one feature warning; re-run with -feature for details
import org.apache.spark.streaming.{StreamingContext, Seconds}
ssc: org.apache.spark.streaming.StreamingContext = org.apache.spark.streaming.StreamingContext@4d01c7b1
rdd: org.apache.spark.rdd.RDD[Int] = MapPartitionsRDD[1] at map at command-3696830887613521:7
import org.apache.spark.streaming.dstream.ConstantInputDStream
stream: org.apache.spark.streaming.dstream.ConstantInputDStream[Int] = org.apache.spark.streaming.dstream.ConstantInputDStream@12b9db22

I'm expecting to see 0, 1, 2, 3 printed in one form or another.

I've also tried adding

ssc.awaitTermination()

but it never finishes. (Screenshot of the notebook omitted.)

Koenig Lear
  • It should work. Can you include a screenshot of the notebook? Is it possible that the output of `print` goes to standard output, which may not be shown in the notebook? Mostly guessing. – Jacek Laskowski Jul 14 '20 at 17:15
  • Out of curiosity, is there any reason to use the DStream API ([Spark Streaming](http://spark.apache.org/docs/latest/streaming-programming-guide.html)) and not the DataFrame API ([Spark Structured Streaming](http://spark.apache.org/docs/latest/structured-streaming-programming-guide.html))? – Jacek Laskowski Jul 14 '20 at 17:17
  • @JacekLaskowski Yes, the idea that the output goes outside of the notebook makes sense, but I'd love to know how to get it back into the notebook. – Koenig Lear Jul 14 '20 at 17:36
  • @JacekLaskowski I'm using DStream because I need mapWithState, which is not available in Structured Streaming (mapGroupsWithState is not useful). – Koenig Lear Jul 14 '20 at 17:37
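
Following up on the guess in the comments that `print()` may write to the driver's standard output rather than to the cell: a minimal sketch of one possible workaround is to collect each batch into a driver-side buffer with `foreachRDD`, wait a bounded time with `awaitTerminationOrTimeout` instead of `awaitTermination` (which blocks until the context is stopped), and then print the buffer from the cell itself. This assumes the notebook-provided `sc`. The buffer name `collected` and the 12-second timeout are illustrative choices, and checkpointing is left out to keep the sketch small. It has not been verified on Databricks.

```scala
import java.util.concurrent.ConcurrentLinkedQueue
import org.apache.spark.streaming.{Seconds, StreamingContext}
import org.apache.spark.streaming.dstream.ConstantInputDStream

// Only one StreamingContext can be active per JVM, so stop any earlier one
// while keeping the shared SparkContext `sc` (assumed to be provided by the notebook).
StreamingContext.getActive().foreach(_.stop(stopSparkContext = false, stopGracefully = false))

val ssc = new StreamingContext(sc, Seconds(5))
val stream = new ConstantInputDStream(ssc, sc.parallelize(0 to 3))

// Hypothetical driver-side buffer. The body of foreachRDD runs on the driver,
// so the collected elements can be appended to it directly.
val collected = new ConcurrentLinkedQueue[Int]()
stream.foreachRDD { rdd =>
  rdd.collect().foreach(x => collected.add(x))
}

ssc.start()
// Block for at most 12 seconds (roughly two 5-second batches) instead of forever.
ssc.awaitTerminationOrTimeout(12000)
// Stop streaming but keep the SparkContext alive for the rest of the notebook.
ssc.stop(stopSparkContext = false, stopGracefully = true)

// This println runs in the cell's own thread, so its output belongs to the cell.
println(collected.toArray.mkString(", "))
```

Each completed batch should contribute the elements 0 to 3 to the buffer. How many batches complete before the timeout depends on timing, so the exact length of the printed line will vary.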
