I am currently exploring Spark's speculative task execution option.
Below is the configuration I am planning to use.
I am reading data from Kafka and using repartition();
my streaming code creates around 200+ tasks per batch.
```
.set("spark.speculation", "true")
.set("spark.speculation.interval", "1000")
.set("spark.speculation.multiplier", "2")
.set("spark.speculation.quantile", "0.75")
```
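For context, here is a sketch of how these settings could be wired into a full SparkConf (the app name is a placeholder, not from my actual job); the comments summarize my understanding of what each knob controls, based on the Spark configuration docs:

```scala
import org.apache.spark.SparkConf

val conf = new SparkConf()
  .setAppName("kafka-streaming-job") // placeholder name
  // Re-launch slow-running tasks speculatively on another executor.
  .set("spark.speculation", "true")
  // How often Spark checks for tasks to speculate (milliseconds).
  .set("spark.speculation.interval", "1000")
  // A running task becomes a speculation candidate when it has run
  // longer than 2x the median duration of the tasks that have
  // already finished successfully in the same stage.
  .set("spark.speculation.multiplier", "2")
  // Speculation kicks in only after 75% of the stage's tasks
  // have completed.
  .set("spark.speculation.quantile", "0.75")
```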
Will the above speculative-task configuration have any impact on the overall performance of my streaming job? If so, are there any best practices for using Spark's speculation option?