When I run Spring Batch jobs in parallel using a split flow, I see different behavior in Spring Batch Admin and on the command line.

My job flow is as follows:

Job1 -> Job2 -> Job3  
     -> Job4

When I run these jobs from Spring Batch Admin, 'Job1' launches both 'Job2' and 'Job4', and 'Job2' then launches 'Job3'.

'Job1' is marked completed as soon as 'step1' finishes, while 'Job2' and 'Job4' continue running in parallel.
'Job1' does not wait for 'Job2' and 'Job4' to finish.

The app-context.xml and job configuration for Spring Batch Admin are as follows:
Job1.xml

<import resource="classpath*:META-INF/spring/batch/dependencies/parallel2.xml"/>
<import resource="classpath*:META-INF/spring/batch/dependencies/parallel3.xml"/>

<bean id="taskExecutor" class="org.springframework.core.task.SimpleAsyncTaskExecutor"/>

<bean id="simpleJobLauncher" class="org.springframework.batch.core.launch.support.SimpleJobLauncher">
    <property name="jobRepository" ref="jobRepository" />
    <property name="taskExecutor" ref="taskExecutor" />
</bean>

<bean id="job1.stp01" class="com.jobs.Job1Step1" />

<batch:job id="Job1" restartable="true" >
    <batch:step id="step1" next="split">
        <batch:tasklet ref="job1.stp01" />
    </batch:step>

    <batch:split id="split" next="step3">
        <batch:flow>
            <batch:step id="flow1" >
                <batch:job ref="Job2" job-launcher="simpleJobLauncher"/>
            </batch:step>
        </batch:flow>

        <batch:flow>
            <batch:step id="flow2">
                <batch:job ref="Job3" job-launcher="simpleJobLauncher"/>
            </batch:step>
        </batch:flow>
    </batch:split>          
</batch:job>

app-context.xml

<batch:job-repository id="jobRepository" />

<task:executor id="jobLauncherTaskExecutor" pool-size="10" rejection-policy="ABORT"/>

<bean id="jobRegistry" class="org.springframework.batch.core.configuration.support.MapJobRegistry" />

<bean class="org.springframework.batch.core.configuration.support.JobRegistryBeanPostProcessor">
    <property name="jobRegistry" ref="jobRegistry"/>
</bean>

On the command line, the behavior differs from Spring Batch Admin.
The app-context.xml and job configuration for the command line are as follows:
Job1.xml

<import resource="app-context.xml" />
<import resource="parallel2.xml"/>
<import resource="parallel3.xml"/>

<bean id="job1.stp01" class="com.jobs.Job1Step1" />

<batch:job id="Job1" restartable="true" >
    <batch:step id="step1" next="split">
        <batch:tasklet ref="job1.stp01" />
    </batch:step>

    <batch:split id="split" task-executor="jobLauncherTaskExecutor" next="step3">
        <batch:flow>
            <batch:step id="flow1" >
                <batch:job ref="Job2" job-launcher="simpleJobLauncher"/>
            </batch:step>
        </batch:flow>

        <batch:flow>
            <batch:step id="flow2">
                <batch:job ref="Job3" job-launcher="simpleJobLauncher"/>
            </batch:step>
        </batch:flow>
    </batch:split>      
</batch:job>

app-context.xml

<bean id="springBatchDataSource" class="org.springframework.jdbc.datasource.DriverManagerDataSource">
    .......
</bean>

<bean id="transactionManager" class="org.springframework.batch.support.transaction.ResourcelessTransactionManager" />

<bean id="jobRepository" class="org.springframework.batch.core.repository.support.JobRepositoryFactoryBean">
    <property name="databaseType" value="POSTGRES" />
    <property name="dataSource" ref="springBatchDataSource" />
    <property name="transactionManager" ref="transactionManager" />
</bean>

<bean id="simpleJobLauncher" class="org.springframework.batch.core.launch.support.SimpleJobLauncher">
    <property name="jobRepository" ref="jobRepository" />
</bean>

<bean id="jobLauncherTaskExecutor" class="org.springframework.core.task.SimpleAsyncTaskExecutor" />

When I run the jobs from the command line, 'Job1' launches both 'Job2' and 'Job4', and 'Job2' then launches 'Job3'.

The problem is this:

Although 'step1' has finished, the status of 'Job1' remains 'Unknown'. It changes to 'Completed' only after 'Job2' and 'Job4' have finished.
In other words, 'Job1' waits for 'Job2' and 'Job4'.

In Spring Batch Admin, however, 'Job1' does not wait for 'Job2' and 'Job4'. Its status changes to 'Completed' as soon as 'step1' finishes.

I do not want 'Job1' to wait for 'Job2' and 'Job4' on the command line.
Is there a way to do that?

P.S. Sorry for the long question, and thanks for your help.

Jeff Cook
nnt

1 Answer

The correct behavior in both cases is that Job1 is not marked COMPLETED until Job2 and Job4 are complete. If that isn't happening in one of the scenarios, it is a bug and should be logged in Jira (https://jira.spring.io). Are you sure that the other jobs are not already complete in Spring Batch Admin when Job1 is flagged as COMPLETED?

Michael Minella
  • I think I found the reason: the Spring Batch Admin and command-line configurations above are not the same. In Spring Batch Admin, I set `taskExecutor` on `simpleJobLauncher` to make 'Job2' and 'Job4' run in parallel. On the command line, even when I set `taskExecutor` on `simpleJobLauncher`, the jobs do not run in parallel; they execute sequentially. So I used a `taskExecutor` on the split flow instead. The jobs then run in parallel, but the behavior I described in my question occurs. – nnt Aug 22 '14 at 03:30
  • (Continued from the comment above.) When I tested Spring Batch Admin again with a `taskExecutor` on the split flow, it behaved the same as the command line: Job1 waits for the other jobs to finish. So my original question was wrong, and I apologize. The question I should have asked is how to make jobs run concurrently on the command line. Is there a way to run jobs in parallel on the command line? – nnt Aug 22 '14 at 03:31
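
The configuration difference described in these comments can be sketched as follows. This is an illustration under stated assumptions, not a confirmed fix. In the question, the Spring Batch Admin `simpleJobLauncher` has a `taskExecutor`, while the command-line one does not. `SimpleJobLauncher` runs a job synchronously unless a task executor is set, so a step that launches a child job through it blocks until the child finishes. The fragment adds the executor to the command-line launcher and reuses only bean ids that already appear in the question. The first comment reports that setting the executor on the launcher did not give parallel runs on the command line, so treat this as a starting point for comparison. The thread also does not say how the jobs are started. If the JVM exits when Job1 completes, child jobs still running on executor threads would be cut off.

```xml
<!-- Sketch only: command-line app-context.xml with an asynchronous launcher.
     All bean ids are taken from the configurations in the question. -->
<bean id="jobLauncherTaskExecutor"
      class="org.springframework.core.task.SimpleAsyncTaskExecutor" />

<bean id="simpleJobLauncher"
      class="org.springframework.batch.core.launch.support.SimpleJobLauncher">
    <property name="jobRepository" ref="jobRepository" />
    <!-- With a task executor set, the launcher hands the job to another
         thread and returns before the job has finished. Without it, the
         launcher runs the job on the calling thread. -->
    <property name="taskExecutor" ref="jobLauncherTaskExecutor" />
</bean>
```

Note that every job started through this launcher becomes asynchronous, including Job1 itself if the same bean is used to start it. The `task-executor` attribute on `<batch:split>` is a separate setting: it runs the flows of the split in parallel, and the parent job still waits for all of them, which is the behavior the answer describes as correct.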