I'm trying to run a Java EE 7 batch job on multiple threads using partitioning.
My batch is simple: read a bunch of random numbers and write out their sum, using 3 threads.

My job XML:

<job id="partition" xmlns="http://xmlns.jcp.org/xml/ns/javaee"
    version="1.0">
    <step id="process" next="cleanup">
        <chunk item-count="3">
            <reader ref="partitionProcessIR">
                <properties>
                    <property name="start" value="#{partitionPlan['start']}" />
                    <property name="end" value="#{partitionPlan['end']}" />
                </properties>
            </reader>
            <processor ref="partitionProcessIP" />
            <writer ref="partitionProcessIW" />
        </chunk>
        <partition>
            <mapper ref="partitionMapperImpl" />
        </partition>
    </step>
    <step id="cleanup">
        <batchlet ref="partitionCleanupBatchlet"></batchlet>
    </step>
</job>

My PartitionMapperImpl:

@Override
 public PartitionPlan mapPartitions() throws Exception {
     // TODO Auto-generated method stub
     return new PartitionPlanImpl() {

         @Override
         public int getPartitions() {
             return 3;
         }

         @Override
         public int getThreads() {
             return 3;
         }

         @Override
         public Properties[] getPartitionProperties() {
             int totalRecords = getTotalRecords();
             int partItems = totalRecords / getPartitions();
             int remainItems = totalRecords % getPartitions();
             Properties[] props = new Properties[getPartitions()];

             for (int i = 0; i < getPartitions(); i++) {
                 props[i] = new Properties();
                 props[i].setProperty("start", String.valueOf(i * partItems));
                 // if this is the last partition, add remaining items
                 if (i == getPartitions() - 1) {
                     props[i].setProperty("end", String.valueOf((i + 1) * partItems + remainItems));
                 } else {
                     props[i].setProperty("end", String.valueOf((i + 1) * partItems));
                 }
             }
             return props;
         }
     };
 }

 private int getTotalRecords() {
     return 50;
 }
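
For reference, the range arithmetic used by the mapper can be checked outside the container. This is a minimal standalone sketch, not part of the job: the class name PartitionRangeSketch and the split helper are made up for illustration, and it only reproduces the start/end calculation for 50 records and 3 partitions.

```java
import java.util.Properties;

public class PartitionRangeSketch {

    // Splits totalRecords into `partitions` half-open ranges [start, end).
    // The last range absorbs the remainder, mirroring the mapper's logic.
    static Properties[] split(int totalRecords, int partitions) {
        int partItems = totalRecords / partitions;
        int remainItems = totalRecords % partitions;
        Properties[] props = new Properties[partitions];
        for (int i = 0; i < partitions; i++) {
            int start = i * partItems;
            int end = (i + 1) * partItems + (i == partitions - 1 ? remainItems : 0);
            props[i] = new Properties();
            props[i].setProperty("start", String.valueOf(start));
            props[i].setProperty("end", String.valueOf(end));
        }
        return props;
    }

    public static void main(String[] args) {
        // Prints:
        // 0 .. 16
        // 16 .. 32
        // 32 .. 50
        for (Properties p : split(50, 3)) {
            System.out.println(p.getProperty("start") + " .. " + p.getProperty("end"));
        }
    }
}
```

So the three partitions read 16, 16, and 18 items respectively, and with item-count="3" each partition checkpoints several times.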

My Reader:

@Override
public void open(Serializable checkpoint) throws Exception {
    // parseInt avoids the deprecated Integer(String) constructor and the extra unboxing
    int start = Integer.parseInt(startProperty);
    int end = Integer.parseInt(endProperty);
    List<Integer> listNumber = new ArrayList<>();
    for (int i = start; i < end; i++) {
        int rand = (int) (Math.random() * 10);
        listNumber.add(rand);
    }
    iterator = listNumber.iterator();
}

@Override
public Integer readItem() throws Exception {
    if (iterator.hasNext()) {
        return iterator.next();
    }
    // end read
    return null;
}

My Processor:

@Override
    public Integer processItem(Object arg0) throws Exception {
        Integer rand = (Integer) arg0;
        return rand;
    }

My Writer:

@Override
    public void writeItems(List<Object> arg0) throws Exception {
        int sum = 0;
        for (Object object : arg0) {
            Integer rand = (Integer) object;
            sum += rand;
        }
        System.out.println(Thread.currentThread().getId() + " | SUM OF CHUNK: " + sum);
    }

When I run this batch, the following error occurred. I'm guessing it has something to do with storing several checkpoints at the same time in the Derby database.

2017-03-02T15:22:45.955+0700|情報: 275 | SUM OF CHUNK: 13
2017-03-02T15:22:45.958+0700|情報: 316 | SUM OF CHUNK: 17
2017-03-02T15:23:05.971+0700|重大: Failure in Read-Process-Write Loop
com.ibm.jbatch.container.exception.BatchContainerServiceException: Cannot persist the checkpoint data for [process]
    at com.ibm.jbatch.container.persistence.CheckpointManager.checkpoint(CheckpointManager.java:133)
    at com.ibm.jbatch.container.impl.ChunkStepControllerImpl.invokeChunk(ChunkStepControllerImpl.java:644)
    at com.ibm.jbatch.container.impl.ChunkStepControllerImpl.invokeCoreStep(ChunkStepControllerImpl.java:764)
    at com.ibm.jbatch.container.impl.BaseStepControllerImpl.execute(BaseStepControllerImpl.java:144)
    at com.ibm.jbatch.container.impl.ExecutionTransitioner.doExecutionLoop(ExecutionTransitioner.java:112)
    at com.ibm.jbatch.container.impl.JobThreadRootControllerImpl.originateExecutionOnThread(JobThreadRootControllerImpl.java:110)
    at com.ibm.jbatch.container.util.BatchWorkUnit.run(BatchWorkUnit.java:80)
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
    at java.util.concurrent.FutureTask.run(FutureTask.java:266)
    at org.glassfish.enterprise.concurrent.internal.ManagedFutureTask.run(ManagedFutureTask.java:141)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
    at java.lang.Thread.run(Thread.java:745)
    at org.glassfish.enterprise.concurrent.ManagedThreadFactoryImpl$ManagedThread.run(ManagedThreadFactoryImpl.java:250)
Caused by: com.ibm.jbatch.container.exception.PersistenceException: java.sql.SQLTransactionRollbackException: ????????????????????????????????????:
Lock : ROW, CHECKPOINTDATA, (110,27)
  Waiting XID : {77885156, S} , APP, select id, obj from CHECKPOINTDATA where id = ?
  Granted XID : {77885155, X}
Lock : ROW, CHECKPOINTDATA, (110,28)
  Waiting XID : {77885155, S} , APP, select id, obj from CHECKPOINTDATA where id = ?
  Granted XID : {77885156, X}
????????XID: 77885156?
    at fish.payara.jbatch.persistence.rdbms.JBatchJDBCPersistenceManager.queryCheckpointData(JBatchJDBCPersistenceManager.java:503)
    at fish.payara.jbatch.persistence.rdbms.JBatchJDBCPersistenceManager.updateCheckpointData(JBatchJDBCPersistenceManager.java:388)
    at fish.payara.jbatch.persistence.rdbms.LazyBootPersistenceManager.updateCheckpointData(LazyBootPersistenceManager.java:230)
    at com.ibm.jbatch.container.persistence.CheckpointManager.checkpoint(CheckpointManager.java:128)
    ... 13 more
Caused by: java.sql.SQLTransactionRollbackException: ????????????????????????????????????:
Lock : ROW, CHECKPOINTDATA, (110,27)
  Waiting XID : {77885156, S} , APP, select id, obj from CHECKPOINTDATA where id = ?
  Granted XID : {77885155, X}
Lock : ROW, CHECKPOINTDATA, (110,28)
  Waiting XID : {77885155, S} , APP, select id, obj from CHECKPOINTDATA where id = ?
  Granted XID : {77885156, X}
????????XID: 77885156?
    at org.apache.derby.impl.jdbc.SQLExceptionFactory40.getSQLException(Unknown Source)
    at org.apache.derby.impl.jdbc.Util.generateCsSQLException(Unknown Source)
    at org.apache.derby.impl.jdbc.TransactionResourceImpl.wrapInSQLException(Unknown Source)
    at org.apache.derby.impl.jdbc.TransactionResourceImpl.handleException(Unknown Source)
    at org.apache.derby.impl.jdbc.EmbedConnection.handleException(Unknown Source)
    at org.apache.derby.impl.jdbc.ConnectionChild.handleException(Unknown Source)
    at org.apache.derby.impl.jdbc.EmbedResultSet.closeOnTransactionError(Unknown Source)
    at org.apache.derby.impl.jdbc.EmbedResultSet.movePosition(Unknown Source)
    at org.apache.derby.impl.jdbc.EmbedResultSet.next(Unknown Source)
    at com.sun.gjc.spi.base.ResultSetWrapper.next(ResultSetWrapper.java:103)
    at fish.payara.jbatch.persistence.rdbms.JBatchJDBCPersistenceManager.queryCheckpointData(JBatchJDBCPersistenceManager.java:498)
    ... 16 more
Caused by: java.sql.SQLException: ????????????????????????????????????:
Lock : ROW, CHECKPOINTDATA, (110,27)
  Waiting XID : {77885156, S} , APP, select id, obj from CHECKPOINTDATA where id = ?
  Granted XID : {77885155, X}
Lock : ROW, CHECKPOINTDATA, (110,28)
  Waiting XID : {77885155, S} , APP, select id, obj from CHECKPOINTDATA where id = ?
  Granted XID : {77885156, X}
????????XID: 77885156?
    at org.apache.derby.impl.jdbc.SQLExceptionFactory.getSQLException(Unknown Source)
    at org.apache.derby.impl.jdbc.SQLExceptionFactory40.wrapArgsForTransportAcrossDRDA(Unknown Source)
    ... 27 more
Caused by: ERROR 40001: ????????????????????????????????????:
Lock : ROW, CHECKPOINTDATA, (110,27)
  Waiting XID : {77885156, S} , APP, select id, obj from CHECKPOINTDATA where id = ?
  Granted XID : {77885155, X}
Lock : ROW, CHECKPOINTDATA, (110,28)
  Waiting XID : {77885155, S} , APP, select id, obj from CHECKPOINTDATA where id = ?
  Granted XID : {77885156, X}
????????XID: 77885156?
    at org.apache.derby.iapi.error.StandardException.newException(Unknown Source)
    at org.apache.derby.impl.services.locks.Deadlock.buildException(Unknown Source)
    at org.apache.derby.impl.services.locks.ConcurrentLockSet.lockObject(Unknown Source)
    at org.apache.derby.impl.services.locks.ConcurrentLockSet.zeroDurationLockObject(Unknown Source)
    at org.apache.derby.impl.services.locks.AbstractPool.zeroDurationlockObject(Unknown Source)
    at org.apache.derby.impl.services.locks.ConcurrentPool.zeroDurationlockObject(Unknown Source)
    at org.apache.derby.impl.store.raw.xact.RowLocking2nohold.lockRecordForRead(Unknown Source)
    at org.apache.derby.impl.store.access.conglomerate.OpenConglomerate.lockPositionForRead(Unknown Source)
    at org.apache.derby.impl.store.access.conglomerate.GenericScanController.fetchRows(Unknown Source)
    at org.apache.derby.impl.store.access.heap.HeapScan.fetchNextGroup(Unknown Source)
    at org.apache.derby.impl.sql.execute.BulkTableScanResultSet.reloadArray(Unknown Source)
    at org.apache.derby.impl.sql.execute.BulkTableScanResultSet.getNextRowCore(Unknown Source)
    at org.apache.derby.impl.sql.execute.BasicNoPutResultSetImpl.getNextRow(Unknown Source)
    ... 20 more

Do you have any ideas on how to fix this?
A sample that runs on more than 2 threads would also be really helpful.
Thanks in advance.

NamNVH

2 Answers

It looks to me as though you might be having concurrency problems, such as deadlocks or lock timeouts. (It's a bit hard to tell because the exception information in your question is somewhat garbled and, I think, because the Derby messages are being printed in a mixture of native-language and English strings.)

You can find some strategies for diagnosing and understanding why your concurrent database access is experiencing these problems here: https://wiki.apache.org/db-derby/LockDebugging
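
As a starting point for that kind of diagnosis, Derby has two documented properties, derby.locks.monitor and derby.locks.deadlockTrace, that make it write deadlock details to derby.log. The sketch below is only an illustration under assumptions: the class name DerbyLockDiagnostics is made up, and setting the values as system properties only helps if this runs in the JVM that hosts the Derby engine, before the engine boots. If your server talks to a separate Derby network server, the same properties would have to go into that process instead (for example in its derby.properties file).

```java
public class DerbyLockDiagnostics {

    // Documented Derby property names; both default to false.
    static final String MONITOR = "derby.locks.monitor";
    static final String DEADLOCK_TRACE = "derby.locks.deadlockTrace";

    // Must be called before the Derby engine boots in this JVM
    // (assumption: the engine runs in the same JVM as this code).
    static void enable() {
        // Log deadlocks (and, with the trace flag, lock timeouts) to derby.log.
        System.setProperty(MONITOR, "true");
        // Add stack traces of the threads involved to the log entry.
        System.setProperty(DEADLOCK_TRACE, "true");
    }

    public static void main(String[] args) {
        enable();
        System.out.println(MONITOR + "=" + System.getProperty(MONITOR));
        System.out.println(DEADLOCK_TRACE + "=" + System.getProperty(DEADLOCK_TRACE));
    }
}
```

With these enabled, the next occurrence of the error should leave a fuller description of which transactions held and waited for the CHECKPOINTDATA row locks.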

Bryan Pendleton

Looks like a Payara issue, judging from this line in the stack trace:

fish.payara.jbatch.persistence.rdbms.JBatchJDBCPersistenceManager.queryCheckpointData(JBatchJDBCPersistenceManager.java:503)

You can try running your app with GlassFish proper and see if you have the same issue.

Or you can deploy the app to WildFly, which contains JBeret as the batch container. If your app is written to the JSR 352 spec, it should deploy and run in any Java EE 7 compliant application server. You can configure WildFly to use a JDBC job repository with Derby or any other supported DBMS, including the bundled H2 database.

If you are still stuck, I suggest following up with the Payara project.

cheng