I'm writing a program for data ingestion. It reads from Kafka into a DStream, splits the DStream into 3 streams, and executes actions on each one:

val stream = createSparkStream(Globals.configs, ssc)
val s1 = stream.filter(<predicate1>)
val s2 = stream.filter(<predicate2>)
val s3 = stream.filter(<predicate3>)

//I'm looking for something like:
s1.forEachRddAsync(...
s2.forEachRddAsync(...
s3.forEachRddAsync(... 

Is it possible to trigger an asynchronous submit on a whole DStream rather than on each RDD?

Alex

1 Answer

DStream action methods (output operations such as foreachRDD), while indeed blocking calls, don't process any data themselves. They only register the DStream as an output stream.

Once the StreamingContext is started, processing is scheduled according to the available resources and, if those allow, the streams are processed without limiting each other.
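
To make this concrete, here is a minimal sketch of the pattern, not a definitive implementation. It assumes a socket text source as a stand-in for the question's createSparkStream(Globals.configs, ssc), and the three predicates, the host, the port, and the batch interval are all made up for illustration. Each foreachRDD call returns immediately after registering its output operation; nothing runs until ssc.start().

```scala
import org.apache.spark.SparkConf
import org.apache.spark.streaming.{Seconds, StreamingContext}
import org.apache.spark.streaming.dstream.DStream

object SplitStreamSketch {
  def main(args: Array[String]): Unit = {
    // Assumption: local mode with 4 threads and a 5 second batch interval.
    val conf = new SparkConf().setAppName("split-stream-sketch").setMaster("local[4]")
    val ssc = new StreamingContext(conf, Seconds(5))

    // Assumption: a socket source replaces createSparkStream(Globals.configs, ssc).
    val stream: DStream[String] = ssc.socketTextStream("localhost", 9999)

    // Hypothetical predicates standing in for the three filters in the question.
    val s1 = stream.filter(_.startsWith("a"))
    val s2 = stream.filter(_.startsWith("b"))
    val s3 = stream.filter(_.startsWith("c"))

    // Each call only registers an output operation; no data is processed here.
    s1.foreachRDD { rdd => println(s"s1 batch size: ${rdd.count()}") }
    s2.foreachRDD { rdd => println(s"s2 batch size: ${rdd.count()}") }
    s3.foreachRDD { rdd => println(s"s3 batch size: ${rdd.count()}") }

    // Processing of all three registered outputs begins only now.
    ssc.start()
    ssc.awaitTermination()
  }
}
```

Whether the three outputs actually overlap within a batch depends on scheduler settings. As far as I know, the undocumented spark.streaming.concurrentJobs property (default 1) controls how many streaming jobs may run at once, so treat any change to it as experimental and verify it against your Spark version.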