I am having issues with the restart of a locally partitioned batch job. I throw a RuntimeException on the 101st processed item. The job fails, but something is going wrong: on restart, the job continues from the 150th item instead of the 100th item it should resume from.

Here is the xml-conf:

<bean id="taskExecutor" class="org.springframework.scheduling.commonj.WorkManagerTaskExecutor" >
    <property name="workManagerName" value="springWorkManagers" />
</bean>

<bean id="transactionManager" class="org.springframework.transaction.jta.WebSphereUowTransactionManager"/>

<batch:job id="LocalPartitioningJob">
    <batch:step id="masterStep">
        <batch:partition step="slaveStep" partitioner="splitPartitioner">
            <batch:handler grid-size="5" task-executor="taskExecutor"  />
        </batch:partition>
    </batch:step>
</batch:job>

<batch:step id="slaveStep">
    <batch:tasklet transaction-manager="transactionManager">
        <batch:chunk reader="partitionReader" processor="compositeItemProcessor" writer="sqlWriter" commit-interval="50" />
        <batch:transaction-attributes isolation="SERIALIZABLE" propagation="REQUIRED" timeout="600" />
        <batch:listeners>
            <batch:listener ref="Processor1" /> 
            <batch:listener ref="Processor2" /> 
            <batch:listener ref="Processor3" />
        </batch:listeners>
    </batch:tasklet>
</batch:step>

<bean id="jobRepository" class="org.springframework.batch.core.repository.support.JobRepositoryFactoryBean">
    <property name="transactionManager" ref="transactionManager" />
    <property name="tablePrefix" value="${sb.db.tableprefix}" />
    <property name="dataSource" ref="ds" />
    <property name="maxVarCharLength" value="1000"/>
</bean>

<bean id="transactionManager" class="org.springframework.transaction.jta.WebSphereUowTransactionManager"/>

<jee:jndi-lookup id="ds" jndi-name="${sb.db.jndi}" cache="true" expected-type="javax.sql.DataSource" />

The splitPartitioner implements Partitioner; it splits the initial data and saves it into the execution contexts as lists (a sketch of such a partitioner is included after the reader code below). The processors call remote EJBs to fetch additional data, and the sqlWriter is just an org.spring...JdbcBatchItemWriter. PartitionReader code below:

public class PartitionReader implements ItemStreamReader<TransferObjectTO> {
    private List<TransferObjectTO> partitionItems;

    public PartitionReader() {
    }

    public synchronized TransferObjectTO read() {
        // Destructive read: each read() removes the item from the in-memory list.
        if(partitionItems.size() > 0) {
            return partitionItems.remove(0);
        } else {
            return null;
        }
    }

    @SuppressWarnings("unchecked")
    @Override
    public void open(ExecutionContext executionContext) throws ItemStreamException {
        partitionItems = (List<TransferObjectTO>) executionContext.get("partitionItems");
    }

    @Override
    public void update(ExecutionContext executionContext) throws ItemStreamException {
        // Persists whatever is left of the mutated list at commit time.
        executionContext.put("partitionItems", partitionItems);
    }

    @Override
    public void close() throws ItemStreamException {
    }
}
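
The splitPartitioner itself is not shown here. As referenced above, this is a minimal sketch of what such a partitioner might look like; the slicing logic and the allItems field are assumptions, and TransferObjectTO is assumed to be Serializable so the contexts can be persisted:

import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

import org.springframework.batch.core.partition.support.Partitioner;
import org.springframework.batch.item.ExecutionContext;

// Hedged sketch: where the items come from and how they are sliced are
// assumptions; only the "partitionItems" key is taken from the code above.
public class SplitPartitioner implements Partitioner {

    private List<TransferObjectTO> allItems; // assumed to be injected or loaded elsewhere

    @Override
    public Map<String, ExecutionContext> partition(int gridSize) {
        Map<String, ExecutionContext> contexts = new HashMap<String, ExecutionContext>();
        int sliceSize = (allItems.size() + gridSize - 1) / gridSize; // ceiling division
        for (int i = 0; i < gridSize; i++) {
            int from = Math.min(i * sliceSize, allItems.size());
            int to = Math.min(from + sliceSize, allItems.size());
            ExecutionContext context = new ExecutionContext();
            // Same key that PartitionReader.open() reads back.
            context.put("partitionItems",
                    new ArrayList<TransferObjectTO>(allItems.subList(from, to)));
            contexts.put("partition" + i, context);
        }
        return contexts;
    }

    public void setAllItems(List<TransferObjectTO> allItems) {
        this.allItems = allItems;
    }
}

Each slave step execution then gets its own ExecutionContext, which is where PartitionReader.open() picks up its "partitionItems" list.
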
  • What does your `JobRepository` look like? What does the database look like? Is the db updated as you'd expect? – Michael Minella Aug 11 '14 at 21:41
  • I've added some more configuration details above. The stepContext after failure looks like it has not been rolled back, so it is missing 50 items (per work manager). The same goes for the jobRepository: it has recorded a readCount of 150, which should have been 100, as the exception was thrown on the 101st read on each thread. I've also run the same slave step in single-threaded mode (without partitioning), and in that case the restart works as expected and no items are skipped. – user313 Aug 12 '14 at 05:52
  • Perhaps I should also mention that I am using Spring Batch 2.1.7, Spring 3.2.0 and WebSphere AS 8.5. – user313 Aug 12 '14 at 11:29
  • Can you post the code for the reader? – Michael Minella Aug 12 '14 at 19:19
  • Added PartitionReader java-code above. – user313 Aug 13 '14 at 05:32
  • We were able to solve the problem, so I wrote answer where I tried to clarify what I think went wrong in the first place. Thank you Michael for your effort trying to help. If you have the time, maybe you could evaluate my statements in the answer, so I could get confirmation am I right or wrong about those. – user313 Aug 13 '14 at 11:35

1 Answer

It seems that I had a few misunderstandings of Spring Batch, plus a bug in my own code. The first misunderstanding was that I thought the readCount would be rolled back on a RuntimeException. Now I see that this is not the case: Spring Batch increments this value and, upon step failure, the incremented value is still committed.

Related to the above, I thought that the update method on ItemStreamReader would always be called, and that only the executionContext update to the database would be committed or rolled back with the chunk. But it seems that update is called only if no errors occur, and the executionContext update is then always committed.

The third misunderstanding was that the partitioning "master step" would not be re-executed on restart, and that only the slave steps would be re-executed. But actually the "master step" is re-executed if one of its slave steps fails. So I guess master and slave steps are actually handled somewhat like a single step.
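
To make those points concrete, here is a toy re-enactment in plain Java (no Spring Batch involved; all names are invented, and the numbers mirror the question: commit interval 50, failure on the 101st item). It shows that progress is persisted only when a chunk commits, and that a failure rolls back the database work but not the in-memory reader state:

import java.util.HashMap;
import java.util.Map;

// Toy illustration, not Spring Batch code: a map stands in for the
// persisted step execution context, an int for the reader's in-memory state.
public class ChunkLifecycleDemo {

    public static void main(String[] args) {
        Map<String, Integer> persistedContext = new HashMap<String, Integer>();
        int index = 0;                // in-memory reader cursor, never "rolled back"
        int totalItems = 200;
        int commitInterval = 50;

        try {
            while (index < totalItems) {
                // --- one chunk "transaction" ---
                for (int i = 0; i < commitInterval && index < totalItems; i++) {
                    if (index == 100) {
                        throw new RuntimeException("boom on the 101st item");
                    }
                    index++;          // read() advances the in-memory cursor
                }
                // update() followed by the chunk commit: progress is persisted.
                persistedContext.put("partitionIndex", index);
            }
        } catch (RuntimeException e) {
            // The failed chunk's database work rolls back, but persistedContext
            // still holds the last committed value, and 'index' was never rewound.
            // A reader that had removed items from its list by now would have
            // lost them for good; an index-based reader resumes cleanly.
            System.out.println("in-memory index at failure: " + index);   // 100
            System.out.println("restart would resume from: "
                    + persistedContext.get("partitionIndex"));            // 100
        }
    }
}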

And then there was my buggy code in the PartitionReader, which was an attempt to save database disk space by shrinking the persisted list as items were read. Maybe partitionItems should not be mutated in read()? (Related to the statements above.) Anyhow, here is the code for the working PartitionReader:

public class PartitionReader implements ItemStreamReader<TransferObjectTO> {
    private List<TransferObjectTO> partitionItems;
    private int index;

    public PartitionReader() {
    }

    public synchronized TransferObjectTO read() {
        // Non-destructive read: the list is never modified, only the index advances.
        if(partitionItems.size() > index) {
            return partitionItems.get(index++);
        } else {
            return null;
        }
    }

    @SuppressWarnings("unchecked")
    @Override
    public void open(ExecutionContext executionContext) throws ItemStreamException {
        partitionItems = (List<TransferObjectTO>) executionContext.get("partitionItems");
        index = executionContext.getInt("partitionIndex", 0); // resume from the last committed cursor
    }

    @Override
    public void update(ExecutionContext executionContext) throws ItemStreamException {
        // Persist only the cursor; the list itself stays untouched in the context.
        executionContext.put("partitionIndex", index);
    }

    @Override
    public void close() throws ItemStreamException {
    }
}
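
As a quick sanity check, the fixed reader can be driven by hand through its ItemStream contract to simulate a failed run and a restart. The driver below is invented for illustration and assumes TransferObjectTO has a no-arg constructor:

import java.util.Arrays;

import org.springframework.batch.item.ExecutionContext;

// Invented driver, not part of the job: exercises the fixed PartitionReader
// to show that a restart resumes from the persisted cursor.
public class PartitionReaderRestartCheck {

    public static void main(String[] args) {
        ExecutionContext context = new ExecutionContext();
        context.put("partitionItems", Arrays.asList(
                new TransferObjectTO(), new TransferObjectTO(), new TransferObjectTO()));

        // First run: read two items, persist the cursor, then pretend to fail.
        PartitionReader reader = new PartitionReader();
        reader.open(context);
        reader.read();
        reader.read();
        reader.update(context);  // simulates the chunk commit
        reader.close();

        // Restart: a fresh instance resumes at the third item, and the list in
        // the context is still intact because read() never mutated it.
        PartitionReader restarted = new PartitionReader();
        restarted.open(context);
        System.out.println(restarted.read() != null); // true: third item returned
        System.out.println(restarted.read() == null); // true: partition exhausted
    }
}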