In Spark Streaming we have set these parameters as below:

    spark.worker.cleanup.enabled true
    spark.worker.cleanup.interval 60
    spark.worker.cleanup.appDataTtl 90
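(For reference, in a standalone deployment these worker-side cleanup properties are normally passed to each worker through SPARK_WORKER_OPTS in conf/spark-env.sh; the snippet below simply mirrors the values above, and the file location is the usual default rather than anything specific to our cluster.)

    # conf/spark-env.sh on each worker node
    export SPARK_WORKER_OPTS="-Dspark.worker.cleanup.enabled=true -Dspark.worker.cleanup.interval=60 -Dspark.worker.cleanup.appDataTtl=90"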
This clears out the data of already killed Spark batch/streaming jobs in the work/app-2016*/(1,2,3,4,5,6,...) folders. But for a running Spark Streaming job, the accumulated data in the current app-* directory is not deleted. Since we are using the Kafka-Spark connector jar, for every micro batch this jar gets copied along with the application jar, plus the stderr/stdout output, into each of those folders (work/app-2016*/(1,2,3,4,5,6,...)). This by itself eats up a lot of disk space, as the Kafka-Spark connector is an uber jar of around 15 MB, and in a day it adds up to about 100 GB.
Is there a way to delete data from the currently running Spark Streaming job, or should we do some scripting for that...?
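If scripting turns out to be the only option, a minimal sketch of what a per-worker cleanup job could look like is below (Python, purely illustrative). The work directory path and retention window are assumptions, and note that files still held open by a live executor JVM are only unlinked by such a script, so their space is not reclaimed until the process closes them.

    #!/usr/bin/env python
    # Hypothetical cleanup sketch: delete files older than MAX_AGE_SECONDS from the
    # app-* executor directories under the standalone worker's work dir.
    # WORK_DIR and MAX_AGE_SECONDS are assumptions; adjust for your deployment.
    import os
    import time

    WORK_DIR = "/opt/spark/work"   # assumed worker work directory
    MAX_AGE_SECONDS = 6 * 3600     # assumed retention window

    def cleanup(work_dir, max_age):
        now = time.time()
        for app in os.listdir(work_dir):
            if not app.startswith("app-"):
                continue
            for root, _dirs, files in os.walk(os.path.join(work_dir, app)):
                for name in files:
                    path = os.path.join(root, name)
                    try:
                        if now - os.path.getmtime(path) > max_age:
                            os.remove(path)
                    except OSError:
                        # File vanished or is inaccessible; skip it.
                        pass

    if __name__ == "__main__":
        cleanup(WORK_DIR, MAX_AGE_SECONDS)

Scheduled via cron on every worker node (say every 10-15 minutes), something like this would keep only recent executor output, but I would prefer a supported Spark setting if one exists.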