
Since Spark provides job failure detection and a retry mechanism, I want to use this feature in my code: for some scenarios I want to mark a Spark job as failed so that Spark will retry that job. While trying to find out the job/stage status, I found this and implemented:

    import org.apache.spark.SparkJobInfo;
    import org.apache.spark.SparkStageInfo;
    import org.apache.spark.api.java.JavaSparkContext;
    import org.apache.spark.api.java.JavaSparkStatusTracker;

    JavaSparkContext jsc = JavaSparkContext.fromSparkContext(this.spark.sparkContext());
    JavaSparkStatusTracker statusTracker = jsc.statusTracker();
    for (int jobId : statusTracker.getActiveJobIds()) {
        SparkJobInfo jobInfo = statusTracker.getJobInfo(jobId);
        for (int stageId : jobInfo.stageIds()) {
            SparkStageInfo stageInfo = statusTracker.getStageInfo(stageId);
            LOGGER.warn("Stage id=" + stageId + "; name = " + stageInfo.name()
                + "; completed tasks: " + stageInfo.numCompletedTasks()
                + "; active tasks: " + stageInfo.numActiveTasks()
                + "; all tasks: " + stageInfo.numTasks()
                + "; submission time: " + stageInfo.submissionTime());
        }
    }

Here I am able to find out the job/task/stage status, but is there any way (API) to mark a Spark job as failed? Also, is it a good approach to implement retries this way instead of writing custom retry code?

Ketan Kumbhar

1 Answer


In other words, you want to re-schedule an already successfully completed job, right?

While SparkContext provides an API to cancel a job by ID (cancelJob), there is no similar API to re-run it.
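
For completeness, a minimal sketch of cancelling a job by ID from Java, reusing the jsc and status tracker from the question (cancelling makes the action that triggered the job fail with a SparkException; whether that counts as "marking the job as failed" for your use case is up to you):

    // Cancel every currently active job; the cancel call goes through the
    // underlying Scala SparkContext exposed by JavaSparkContext.sc().
    JavaSparkStatusTracker statusTracker = jsc.statusTracker();
    for (int jobId : statusTracker.getActiveJobIds()) {
        jsc.sc().cancelJob(jobId);   // SparkContext.cancelJob(int jobId)
    }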

However, it provides two low-level APIs to execute a custom job:

- SparkContext.runJob, which runs the job and blocks until it finishes;
- SparkContext.submitJob, which submits the job for asynchronous execution and returns a FutureAction you can wait on or cancel.

For both you need to provide the RDD you want to re-process and a function to process a partition.
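
Calling runJob/submitJob directly from Java is awkward because of the Scala function and ClassTag parameters, so here is a rough sketch of an alternative route (an assumption, not the only way): the asynchronous actions on JavaRDD, which likewise submit a custom job over an RDD and hand back a JavaFutureAction that can be waited on or cancelled. The RDD contents and the per-record work are placeholders:

    import org.apache.spark.api.java.JavaFutureAction;
    import org.apache.spark.api.java.JavaRDD;
    import java.util.Arrays;

    // Hypothetical RDD standing in for the data you want to re-process.
    JavaRDD<String> rdd = jsc.parallelize(Arrays.asList("a", "b", "c"), 2);

    // Submit the partition-processing function as its own Spark job;
    // the JavaFutureAction lets you block on it, poll it, or cancel it.
    JavaFutureAction<Void> job = rdd.foreachPartitionAsync(partition -> {
        while (partition.hasNext()) {
            System.out.println(partition.next());   // placeholder per-record work
        }
    });

    job.get();   // waits for the job; throws ExecutionException if it failed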

But the easier option would be simply to re-run the action that started the job, if you know which one you need.
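
For example, a hypothetical driver-side retry wrapper around an action (the count() action, the attempt limit, and the LOGGER carried over from the question are all assumptions):

    // Hypothetical retry wrapper: simply re-run the action that started the job
    // whenever it fails, up to a fixed number of attempts.
    static long countWithRetry(JavaRDD<String> rdd, int maxAttempts) {
        Exception last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                return rdd.count();      // the action that triggers the Spark job
            } catch (Exception e) {      // job failures surface as exceptions on the driver
                last = e;
                LOGGER.warn("Attempt " + attempt + " of " + maxAttempts + " failed", e);
            }
        }
        throw new RuntimeException("Job still failing after " + maxAttempts + " attempts", last);
    }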

Yauhen