
I have a Spring Batch job which reads, transforms and writes to an Oracle database. I am running the job via the CommandLineJobRunner utility (using a fat jar plus dependencies generated with the Maven Shade plugin). The job fails halfway through because the Java heap limit is reached, and it is not marked as FAILED but still shows status STARTED.

I tried to re-run the job using the same job parameters (as the docs suggest), but this gives the following error:

5:24:34.147 [main] ERROR o.s.b.c.l.s.CommandLineJobRunner - Job Terminated in error: A job execution for this job is already running: JobInstance: id=1, version=0, Job=[maskTableJob]

org.springframework.batch.core.repository.JobExecutionAlreadyRunningException: A job execution for this job is already running: JobInstance: id=1, version=0, Job=[maskTableJob] at org.springframework.batch.core.repository.support.SimpleJobRepository.createJobExecution(SimpleJobRepository.java:120) ~[maskng-batch-1.0-SNAPSHOT-executable.jar:1.0-SNAPSH

I have tried all sorts of things (like manually setting the status to FAILED and using the -restart argument) but to no avail. Is there something I am missing here? I thought one of the strong points of Spring Batch is its ability to restart jobs where they left off.

  • Rerunning the job won't work with the same parameters. Add the current time as a parameter to rerun (a minimal launch sketch follows these comments). See http://www.mkyong.com/spring-batch/spring-batch-a-job-instance-already-exists-and-is-complete-for-parameters/. Also check out the documentation: http://docs.spring.io/spring-batch/reference/html/configureJob.html#d5e1320 – Sander_M Aug 03 '16 at 14:49
  • Yeah I know that, BUT I need to re-run a particular job as I want it to carry on where it left off. My job processes terabytes of rows and can take days to run, so I don't want to restart the damn thing from the start every time :-) – Christopher Richard Dobbs Aug 03 '16 at 15:17
  • I thought the chunk processing in steps is supposed to take care of such things – Christopher Richard Dobbs Aug 03 '16 at 15:17
  • Aha... I found the problem! I had switched the jobRepository to use another DB, but this seems to have caused Spring some confusion, so I dropped everything and started from scratch and this now works as expected. – Christopher Richard Dobbs Aug 04 '16 at 07:58
  • @ChristopherRichardDobbs Read my answer, let me know if you want anything else. – DevG Aug 04 '16 at 08:52
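
As a side note to the comment thread above, here is a minimal sketch of what launching with a fresh timestamp parameter might look like, so a brand-new JobInstance is created (this does not resume a failed run). The parameter name run.time, the class name and the method are illustrative assumptions, not part of the original question:

    import org.springframework.batch.core.Job;
    import org.springframework.batch.core.JobParameters;
    import org.springframework.batch.core.JobParametersBuilder;
    import org.springframework.batch.core.launch.JobLauncher;

    public class LaunchWithTimestamp {
        // Launch the job with a unique timestamp parameter so a new
        // JobInstance is created on every run (no restart semantics).
        public static void runOnce(JobLauncher jobLauncher, Job maskTableJob) throws Exception {
            JobParameters params = new JobParametersBuilder()
                    .addLong("run.time", System.currentTimeMillis())
                    .toJobParameters();
            jobLauncher.run(maskTableJob, params);
        }
    }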

1 Answer


The first thing you should know is that JobLauncher cannot be used to restart a job which has already run. You are getting JobExecutionAlreadyRunningException because the job parameters you are passing are already present in the DB and the corresponding execution is still marked as running.

In Spring Batch, a job can be restarted only if it has completed with FAILED or STOPPED status.

JobOperator has a restart method which can be used to restart a failed job by passing the ID of a job execution that completed with FAILED or STOPPED status.

Please note that a job cannot be restarted if it has completed successfully (status COMPLETED). In this case you will have to submit a new job with new job parameters.
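
A minimal sketch of such a restart, assuming a JobOperator bean is available from the application context (the class and method names here are illustrative):

    import org.springframework.batch.core.launch.JobOperator;

    public class RestartFailedJob {
        // Restart the execution that ended with FAILED or STOPPED status.
        // Returns the id of the new JobExecution created for the restart.
        public static Long restart(JobOperator jobOperator, long failedExecutionId) throws Exception {
            return jobOperator.restart(failedExecutionId);
        }
    }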

If you want to manually set the status of the job to FAILED, then run the query below and restart the job using the JobOperator.restart() method.

update batch_job_execution set status='FAILED', version=version+1 where job_instance_id=jobId;

Improper handling of transaction management could be one possible reason why your job status is not getting updated to FAILED. Please make sure your transaction completes even if the job has encountered a runtime exception.

DevG
  • Many thanks for that reply. I notice that if I have a job which finished with status=FAILED and exitStatus=UNKNOWN, I can re-run the job with the same job params, but it seems not to continue where it left off, rather starting a whole new job run. This was using the CommandLineJobRunner, so I will investigate using JobOperator from a Java main and manually set the exit status as you say! – Christopher Richard Dobbs Aug 04 '16 at 12:55
  • You are welcome, if you think my answer has solved your problem then please accept it as the correct answer by clicking on the tick mark :) – DevG Aug 04 '16 at 12:58
  • I knocked up some Java to restart the job with the same job params via JobOperator and it appears to restart OK. I even see "resuming job" in the log and thought everything was working great... however it looks like I have a new execution id and the job step has started from the beginning. My target database contains more rows than the source, which proves it did not continue. – Christopher Richard Dobbs Aug 04 '16 at 13:30
  • Another indicator is that the beforeJob() listener I have configured fires even when the job is restarted - is this correct behaviour, as I thought that should only fire on a brand new job!? – Christopher Richard Dobbs Aug 04 '16 at 13:33
  • I am such an idiot... RTFM as usual... I had set saveState=false on my JDBC ItemReader, so it was brain dead for a restart (a small reader config sketch follows these comments) - many thanks for your comments as they homed me in on the problem at last! – Christopher Richard Dobbs Aug 04 '16 at 13:38
  • One slight annoyance is that even though it is resuming, the beforeJob() listener fires... is this correct behaviour, as I thought that was just for new jobs? I am using this to clear out the target database but don't want to do that if it has been restarted. – Christopher Richard Dobbs Aug 04 '16 at 13:46
  • beforeJob and afterJob listeners will always be invoked, even if the job has failed in between steps. – DevG Aug 04 '16 at 13:50
  • I am glad that my answer was able to clear your doubt. – DevG Aug 04 '16 at 13:52
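
Regarding the saveState issue mentioned in the comments, here is a minimal sketch of a restartable reader configuration, assuming a cursor-based JDBC reader; the reader name, SQL, table/column names and row type are illustrative placeholders, not taken from the original job:

    import java.util.Map;

    import javax.sql.DataSource;

    import org.springframework.batch.item.database.JdbcCursorItemReader;
    import org.springframework.jdbc.core.ColumnMapRowMapper;

    public class ReaderConfig {
        // saveState(true) lets the reader store its position in the step
        // ExecutionContext so a restart resumes from the last committed chunk.
        public static JdbcCursorItemReader<Map<String, Object>> reader(DataSource dataSource) {
            JdbcCursorItemReader<Map<String, Object>> reader = new JdbcCursorItemReader<>();
            reader.setName("maskTableReader");   // key under which state is stored
            reader.setDataSource(dataSource);
            reader.setSql("SELECT id, payload FROM source_table ORDER BY id");
            reader.setRowMapper(new ColumnMapRowMapper());
            reader.setSaveState(true);           // the default, but the restart relies on it
            return reader;
        }
    }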