I'm trying to read data from a Kafka stream, process it, and save the result to a report, and I'd like to run this job once a day. I'm using DStreams. Is there an equivalent of trigger(Trigger.Once) from Structured Streaming that I could use with DStreams for this scenario? (See the Structured Streaming sketch after my code below for the pattern I mean.) Any suggestions and help are appreciated.
import org.apache.spark.streaming.{Seconds, StreamingContext}
import org.apache.spark.streaming.kafka010.{ConsumerStrategies, KafkaUtils, LocationStrategies}

def main(args: Array[String]): Unit = {
  val spark = ...  // SparkSession, created elsewhere

  // jobconfig and kafkaParams are defined elsewhere in my code
  val ssc = new StreamingContext(spark.sparkContext, Seconds(jobconfig.getLong("batchInterval")))

  val kafkaStream = KafkaUtils.createDirectStream[String, String](
    ssc,
    LocationStrategies.PreferConsistent,
    ConsumerStrategies.Subscribe[String, String](Array(jobconfig.getString("topic")), kafkaParams))

  kafkaStream.foreachRDD { rdd =>
    // ... process the RDD and save the report ...
  }

  spark.sqlContext.clearCache()

  ssc.start()
  ssc.awaitTermination()
}
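For reference, this is the Structured Streaming behaviour I'm talking about: trigger(Trigger.Once) reads whatever is currently on the topic, processes it as a single batch, and stops. This is only a minimal sketch of that pattern; the broker address, topic name, sink format, and output paths below are placeholders, not my real job.

import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.streaming.Trigger

val spark = SparkSession.builder().appName("daily-report").getOrCreate()

// One-shot read of everything currently on the topic (broker and topic are placeholders).
val df = spark.readStream
  .format("kafka")
  .option("kafka.bootstrap.servers", "broker:9092")
  .option("subscribe", "reports-topic")
  .load()

// Process the data as a single batch, write the report, and stop when the batch is done.
val query = df.writeStream
  .trigger(Trigger.Once())
  .format("parquet")
  .option("path", "/reports/daily")
  .option("checkpointLocation", "/reports/daily_checkpoint")
  .start()

query.awaitTermination()

Is there a way to get this "run once and stop" behaviour with the DStream/foreachRDD setup above?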