I want to add a safety measure to my Spark jobs: if they don't finish after X hours, they should kill themselves. (I'm using Spark 2.4.3 in cluster mode on YARN.)
I didn't find any Spark configuration that helps with this.
I tried to do it this way:
import java.util.{Timer, TimerTask}

val timeoutProcess = new Timer("spark-job-timeout")
val task = new TimerTask {
  def run(): Unit = {
    // relies on the yarn CLI being on the driver's PATH; this code can only run on the cluster
    val p = Runtime.getRuntime.exec(Array[String]("/bin/sh", "-c", s"yarn application -kill ${sc.applicationId}"))
    p.waitFor()
    sc.stop()
  }
}
timeoutProcess.schedule(task, X) // X is the delay in milliseconds; 10000 for 10 s while testing
But this doesn't seem to kill the application. I'd appreciate any ideas or thoughts on this; I've looked around and haven't found a good approach.
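
One direction I've been considering instead of shelling out to the CLI is killing the application through the YARN client API, so the kill doesn't depend on the yarn binary being on the driver's PATH. This is only an untested sketch, assuming the Hadoop YARN client classes are on the driver classpath (they normally are when running under YARN) and that ConverterUtils.toApplicationId is available (Hadoop 2.x; newer versions have ApplicationId.fromString):

import org.apache.hadoop.yarn.client.api.YarnClient
import org.apache.hadoop.yarn.conf.YarnConfiguration
import org.apache.hadoop.yarn.util.ConverterUtils

val yarn = YarnClient.createYarnClient()
yarn.init(new YarnConfiguration(sc.hadoopConfiguration)) // reuse the job's Hadoop config
yarn.start()
try {
  // parse the "application_..." id string back into an ApplicationId and kill it
  yarn.killApplication(ConverterUtils.toApplicationId(sc.applicationId))
} finally {
  yarn.stop()
}

Would this be a reasonable replacement for the Runtime.exec call inside the TimerTask?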