
I'm new to Spring Batch. I have a scheduled job which needs to run every 2 hours. This job has several multi-threaded steps which should run independently of each other. The job is currently launched using a JobLauncher, as shown below.

@Component
@EnableScheduling
public class JobScheduler {

    private static final Logger logger = LoggerFactory.getLogger(JobScheduler.class);

    @Autowired
    private JobLauncher jobLauncher;

    @Autowired
    private Job job;

    @Scheduled(cron = "0 0 */2 * * ?")
    @Retryable(maxAttempts = 3, backoff = @Backoff(delay = 60000),
            include = {SQLException.class, RuntimeException.class})
    public void automatedTask() {

        JobParameters jobParameters = new JobParametersBuilder().addLong("time", System.currentTimeMillis()).toJobParameters();

        try {

            JobExecution jobExecution = jobLauncher.run(job, jobParameters);

        } catch (JobInstanceAlreadyCompleteException | JobRestartException | JobParametersInvalidException |
                 JobExecutionAlreadyRunningException ex) {
            logger.error("Error occurred when executing job scheduler", ex);
        }

    }

}

Mentioned below is my BatchConfig class.

@Configuration
@EnableBatchProcessing
@EnableRetry
public class BatchConfig {

    @Autowired
    private JobBuilderFactory jobBuilderFactory;

    @Autowired
    private StepBuilderFactory stepBuilderFactory;

    @Autowired
    private DataSource dataSource;

    @Bean
    @StepScope
    public JdbcPagingItemReader<Model> reader1() {
        StringBuffer selectClause = new StringBuffer();
        selectClause.append("SELECT ");
        selectClause.append("* ");

        StringBuffer fromClause = new StringBuffer();
        fromClause.append("FROM ");
        fromClause.append("TABLENAME");

        OraclePagingQueryProvider oraclePagingQueryProvider = new OraclePagingQueryProvider();
        oraclePagingQueryProvider.setSelectClause(selectClause.toString());
        oraclePagingQueryProvider.setFromClause(fromClause.toString());
        Map<String, Order> orderByKeys = new HashMap<>();
        orderByKeys.put("id", Order.ASCENDING);
        oraclePagingQueryProvider.setSortKeys(orderByKeys);

        JdbcPagingItemReader<Model> jdbcPagingItemReader = new JdbcPagingItemReader<>();
        jdbcPagingItemReader.setSaveState(false);
        jdbcPagingItemReader.setDataSource(dataSource);
        jdbcPagingItemReader.setQueryProvider(oraclePagingQueryProvider);
        jdbcPagingItemReader.setRowMapper(BeanPropertyRowMapper.newInstance(Model.class));
        return jdbcPagingItemReader;
    }

    @Bean
    @StepScope
    public JdbcPagingItemReader<Model> reader2() {
        // configured like reader1(), omitted for brevity
    }

    @Bean
    @StepScope
    public JdbcPagingItemReader<Model> reader3() {
        // configured like reader1(), omitted for brevity
    }

    @Bean
    @StepScope
    public ItemWriter<Model> writer1() {
        return new CustomItemWriter1();
    }

    @Bean
    @StepScope
    public ItemWriter<Model> writer2() {
        return new CustomItemWriter2();
    }

    @Bean
    @StepScope
    public ItemWriter<Model> writer3() {
        return new CustomItemWriter3();
    }

    @Bean
    public Step step1() {
        ThreadPoolTaskExecutor taskExecutor = new ThreadPoolTaskExecutor();
        taskExecutor.setCorePoolSize(4);
        taskExecutor.setMaxPoolSize(4);
        taskExecutor.afterPropertiesSet();

        return stepBuilderFactory.get("step1")
                .<Model, Model>chunk(1000)
                .reader(reader1())
                .writer(writer1())
                .faultTolerant()
                .skipPolicy(new AlwaysSkipItemSkipPolicy())
                .skip(Exception.class)
                .listener(new CustomSkipListener())
                .taskExecutor(taskExecutor)
                .build();
    }

    @Bean
    public Step step2() {
        // configured like step1(), omitted for brevity
    }

    @Bean
    public Step step3() {
        // configured like step1(), omitted for brevity
    }

    @Bean
    public Job myJob() {
        return jobBuilderFactory.get("myJob").incrementer(new RunIdIncrementer())
//                .listener(new CustomJobExecutionListener())
                .start(step1()).on("*").to(step2())
                .from(step1()).on(ExitStatus.FAILED.getExitCode()).to(step2())
                .from(step2()).on("*").to(step3())
                .from(step2()).on(ExitStatus.FAILED.getExitCode()).to(step3())
                .end().build();
    }

}

I've added a conditional flow to the job so that each subsequent step runs regardless of a failure in the previous step. Everything works fine for the initial steps, but if an exception is thrown in the last step, the exit status of the whole job becomes FAILED. To solve this, and to handle any other failures in the job, I tried to implement restart functionality. Please note that I'm not saving state in the readers due to multi-threading, and I'm not sure whether this affects restarting.

I have referred to the accepted answer of the question below,

https://stackoverflow.com/questions/38846457/how-can-you-restart-a-failed-spring-batch-job-and-let-it-pick-up-where-it-left-o

but I don't quite understand how or where to call the jobOperator.restart method.

I tried it as shown below, expecting the job to restart after launching if it failed, but it didn't work at all. Also, this implementation would break the @Retryable annotation, because the try-catch block catches Exception.

@Component
@EnableScheduling
public class JobScheduler {

    private static final Logger logger = LoggerFactory.getLogger(JobScheduler.class);

    @Autowired
    private JobLauncher jobLauncher;

    @Autowired
    private Job job;

    @Autowired
    private JobRepository jobRepository;

    @Autowired
    private JobRegistry jobRegistry;

    @Autowired
    private DataSource dataSource;

    @Scheduled(cron = "0 0 */2 * * ?")
    @Retryable(maxAttempts = 3, backoff = @Backoff(delay = 60000),
            include = {SQLException.class, RuntimeException.class})
    public void automatedTask() {

        JobParameters jobParameters = new JobParametersBuilder().addLong("time", System.currentTimeMillis()).toJobParameters();

        try {

            JobExecution jobExecution = jobLauncher.run(job, jobParameters);

            JobExplorer jobExplorer = this.getJobExplorer(dataSource);
            JobOperator jobOperator = this.getJobOperator(jobLauncher, jobRepository, jobRegistry, jobExplorer);

            List<JobInstance> jobInstances = jobExplorer.getJobInstances("myJob",0,1);

            if(!jobInstances.isEmpty()){
                JobInstance jobInstance =  jobInstances.get(0);
                List<JobExecution> jobExecutions = jobExplorer.getJobExecutions(jobInstance);
                if(!jobExecutions.isEmpty()){
                    for(JobExecution execution: jobExecutions){
                        if(execution.getStatus().equals(BatchStatus.FAILED)){
                            jobOperator.restart(execution.getId());
                        }
                    }
                }
            }

        } catch (Exception  ex) {
            logger.error("Error occurred when executing job scheduler", ex);
        }

    }

    @Bean
    public JobOperator getJobOperator(final JobLauncher jobLauncher, final JobRepository jobRepository,
                                   final JobRegistry jobRegistry, final JobExplorer jobExplorer) {
        final SimpleJobOperator jobOperator = new SimpleJobOperator();
        jobOperator.setJobLauncher(jobLauncher);
        jobOperator.setJobRepository(jobRepository);
        jobOperator.setJobRegistry(jobRegistry);
        jobOperator.setJobExplorer(jobExplorer);
        return jobOperator;
    }

    @Bean
    public JobExplorer getJobExplorer(final DataSource dataSource) throws Exception {
        final JobExplorerFactoryBean bean = new JobExplorerFactoryBean();
        bean.setDataSource(dataSource);
        bean.setTablePrefix("BATCH_");
        bean.setJdbcOperations(new JdbcTemplate(dataSource));
        bean.afterPropertiesSet();
        return bean.getObject();
    }

}

I then tried adding a custom JobExecutionListener like the one below, expecting it to restart the job after it runs, if it failed. But it just fails because all the @Autowired beans are null.

public class CustomJobExecutionListener {

    private static final Logger logger = LoggerFactory.getLogger(CustomJobExecutionListener.class);

    @Autowired
    private JobLauncher jobLauncher;

    @Autowired
    private JobRepository jobRepository;

    @Autowired
    private JobRegistry jobRegistry;

    @Autowired
    private DataSource dataSource;

    @BeforeJob
    public void beforeJob(JobExecution jobExecution) {

    }

    @AfterJob
    public void afterJob(JobExecution jobExecution) {

        try {
            JobExplorer jobExplorer = this.getJobExplorer(dataSource);
            JobOperator jobOperator = this.getJobOperator(jobLauncher, jobRepository, jobRegistry, jobExplorer);

            if(jobExecution.getStatus().equals(BatchStatus.FAILED)){
                jobOperator.restart(jobExecution.getId());
            }
        } catch (Exception ex) {
            logger.error("Unknown error occurred when executing after job execution listener", ex);
        }

    }

    @Bean
    public JobRegistryBeanPostProcessor jobRegistryBeanPostProcessor(JobRegistry jobRegistry) {
        final JobRegistryBeanPostProcessor jobRegistryBeanPostProcessor = new JobRegistryBeanPostProcessor();
        jobRegistryBeanPostProcessor.setJobRegistry(jobRegistry);
        return jobRegistryBeanPostProcessor;
    }

    @Bean
    public JobOperator getJobOperator(final JobLauncher jobLauncher, final JobRepository jobRepository,
                                   final JobRegistry jobRegistry, final JobExplorer jobExplorer) {
        final SimpleJobOperator jobOperator = new SimpleJobOperator();
        jobOperator.setJobLauncher(jobLauncher);
        jobOperator.setJobRepository(jobRepository);
        jobOperator.setJobRegistry(jobRegistry);
        jobOperator.setJobExplorer(jobExplorer);
        return jobOperator;
    }

    @Bean
    public JobExplorer getJobExplorer(final DataSource dataSource) throws Exception {
        final JobExplorerFactoryBean bean = new JobExplorerFactoryBean();
        bean.setDataSource(dataSource);
        bean.setTablePrefix("BATCH_");
        bean.setJdbcOperations(new JdbcTemplate(dataSource));
        bean.afterPropertiesSet();
        return bean.getObject();
    }

}

What am I doing wrong? How should the restart functionality be implemented for this job?

Appreciate your kind help!

EGE

1 Answer


Please note that I'm not saving the state in the readers due to multi-threading and I'm not sure whether this could affect the restarting.

It certainly affects restartability. Multi-threading in steps is incompatible with restartability. From the javadoc of the JdbcPagingItemReader that you are using, you can read the following:

The implementation is thread-safe in between calls to open(ExecutionContext),
but remember to use saveState=false if used in a multi-threaded client
(no restart available).

Without restart data, Spring Batch cannot restart the step from where it left off. This is a trade-off that you have accepted by using a multi-threaded step.
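
If restartability is more important than throughput for a particular step, that step can simply stay single-threaded and keep the reader's state. As a minimal sketch, based on the configuration in the question (the bean names restartableReader1 and restartableStep1 are placeholders):

    @Bean
    @StepScope
    public JdbcPagingItemReader<Model> restartableReader1() {
        OraclePagingQueryProvider queryProvider = new OraclePagingQueryProvider();
        queryProvider.setSelectClause("SELECT *");
        queryProvider.setFromClause("FROM TABLENAME");
        Map<String, Order> sortKeys = new HashMap<>();
        sortKeys.put("id", Order.ASCENDING);
        queryProvider.setSortKeys(sortKeys);

        JdbcPagingItemReader<Model> reader = new JdbcPagingItemReader<>();
        // saveState is left at its default (true), so the reader stores its position
        // in the step execution context and can resume from it on a restart
        reader.setDataSource(dataSource);
        reader.setQueryProvider(queryProvider);
        reader.setRowMapper(BeanPropertyRowMapper.newInstance(Model.class));
        return reader;
    }

    @Bean
    public Step restartableStep1() {
        // no taskExecutor: the step runs single-threaded, so the saved state stays
        // consistent and Spring Batch can restart it from the last committed chunk
        return stepBuilderFactory.get("restartableStep1")
                .<Model, Model>chunk(1000)
                .reader(restartableReader1())
                .writer(writer1())
                .build();
    }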

but I don't quite understand how or where to call the jobOperator.restart method at.

Now with regard to restarting the failed job, a few notes:

  • Trying to restart a job in a JobExecutionListener is incorrect. This listener is called in the scope of the current job execution, while a restart will have its own, distinct job execution
  • JobOperator#restart should not be called inside the scheduled method, otherwise it will be called for every scheduled run. You can find an example here: https://stackoverflow.com/a/55137314/5019386. A minimal sketch of restarting failed executions from a separate routine is shown below.
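
For illustration only, here is a minimal sketch of such a separate restart routine. It assumes that JobOperator and JobExplorer are declared as beans in a @Configuration class (not inside the scheduler or listener as in the question); the class name FailedJobRestarter and the cron expression are placeholders:

@Component
public class FailedJobRestarter {

    private static final Logger logger = LoggerFactory.getLogger(FailedJobRestarter.class);

    @Autowired
    private JobExplorer jobExplorer;

    @Autowired
    private JobOperator jobOperator;

    // Hypothetical schedule, offset from the launch schedule, that only touches
    // executions the job repository has already marked as FAILED
    @Scheduled(cron = "0 30 */2 * * ?")
    public void restartFailedExecutions() {
        for (JobInstance instance : jobExplorer.getJobInstances("myJob", 0, 10)) {
            for (JobExecution execution : jobExplorer.getJobExecutions(instance)) {
                if (execution.getStatus() == BatchStatus.FAILED) {
                    try {
                        jobOperator.restart(execution.getId());
                    } catch (Exception ex) {
                        // e.g. JobInstanceAlreadyCompleteException if a later restart already completed
                        logger.error("Could not restart execution {}", execution.getId(), ex);
                    }
                }
            }
        }
    }
}

This keeps the launching schedule and the restart logic in separate methods, so @Retryable on the scheduled launch is not affected by the restart handling.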
Mahmoud Ben Hassine
  • I checked the link for the example. I have already used the `@Retryable` annotation in the `@Scheduled` method. What I understood is that it is like `RetryTemplate` without customization. The current implementation of `@Retryable` works properly when the configured exceptions are thrown in the beginning and the job fails to start at all. But it does not retry the failed job whenever an exception is thrown in the **last step** of the job, which was my original issue. Adding the same `@Retryable` to the last step specifically didn't work either. Could this have something to do with the conditional flow? – EGE Nov 11 '22 at 10:16
  • I didn't understand the above issue yet. Could you please help? – EGE Nov 14 '22 at 03:31
  • What is the difference between a failure in the first step or the last step? I'm not sure this is the cause of your issue. As far as the scheduled method is concerned, the job has either succeeded or failed, no matter in which step the failure happens. As mentioned in the answer, `JobOperator#restart` should not be called inside the scheduled method. You should restart failed jobs in a separate process. – Mahmoud Ben Hassine Nov 15 '22 at 08:37
  • Ohh, okay. I'll try on this and update if I face any issues. Thank you so much for the explanation so far! – EGE Nov 15 '22 at 12:07