
Due to limitations in integrating with an existing application, we need to use a separate database connection per chunk and manage the commit boundary ourselves, with a single commit at the end of each chunk. We designed the job to use remote partitioning and to process multiple partitions on worker nodes; each partitioned step is expected to execute its chunks sequentially.

We tried an approach that combines a ChunkListener with the processor: the connection is obtained in the beforeChunk listener method, stored as an instance variable, and used while processing items.

However, this connection is replaced right after the first item is processed, because the beforeChunk method runs again. We used ResourcelessTransactionManager in this case, but that transaction manager is not recommended for production use. It also appears that chunks are being processed in parallel, which we did not expect. Additionally, we observed that when we hold the connection obtained in beforeChunk for use in the writer, it is already closed by the time the write method is called.
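To illustrate the attempted approach, here is a minimal sketch of a processor that is also registered as a ChunkListener and caches a connection per chunk. The class name, item types, and the `legacyProcess` call are hypothetical placeholders, not the actual application code:

```java
import java.sql.Connection;
import java.sql.SQLException;
import javax.sql.DataSource;

import org.springframework.batch.core.ChunkListener;
import org.springframework.batch.core.scope.context.ChunkContext;
import org.springframework.batch.item.ItemProcessor;

// Sketch only: a processor that opens one connection per chunk via
// beforeChunk and reuses it while processing the chunk's items.
public class ConnectionPerChunkProcessor implements ItemProcessor<Object, Object>, ChunkListener {

    private final DataSource dataSource;
    private Connection connection; // replaced on every beforeChunk call

    public ConnectionPerChunkProcessor(DataSource dataSource) {
        this.dataSource = dataSource;
    }

    @Override
    public void beforeChunk(ChunkContext context) {
        try {
            connection = dataSource.getConnection();
            connection.setAutoCommit(false); // legacy code requires autoCommit = false
        } catch (SQLException e) {
            throw new IllegalStateException("Could not open chunk connection", e);
        }
    }

    @Override
    public Object process(Object item) throws Exception {
        // The legacy application is invoked here with the cached connection.
        return legacyProcess(connection, item);
    }

    @Override
    public void afterChunk(ChunkContext context) {
        // commit and close the chunk connection here
    }

    @Override
    public void afterChunkError(ChunkContext context) {
        // rollback and close the chunk connection here
    }

    // Hypothetical stand-in for the legacy processing call.
    private Object legacyProcess(Connection connection, Object item) {
        return item;
    }
}
```

With a step-scoped bean like this, the problem described above shows up because beforeChunk replaces the instance-variable connection on every chunk boundary, and with parallel chunk execution two chunks can overwrite each other's connection.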

Our second approach is to use DataSourceTransactionManager, but we are not sure how to get hold of the connection from the transaction that is active at chunk level.
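For reference, when a chunk transaction is managed by DataSourceTransactionManager, Spring binds the transactional connection to the current thread, and it can be retrieved with `DataSourceUtils.getConnection(dataSource)` as long as the same DataSource instance is used. A minimal sketch (the class name and `legacyProcess` helper are illustrative assumptions):

```java
import java.sql.Connection;
import javax.sql.DataSource;

import org.springframework.batch.item.ItemProcessor;
import org.springframework.jdbc.datasource.DataSourceUtils;

// Sketch only: reuse the connection of the ongoing chunk transaction.
public class TransactionAwareProcessor implements ItemProcessor<Object, Object> {

    // Must be the SAME DataSource the DataSourceTransactionManager was built with.
    private final DataSource dataSource;

    public TransactionAwareProcessor(DataSource dataSource) {
        this.dataSource = dataSource;
    }

    @Override
    public Object process(Object item) throws Exception {
        // Returns the connection bound to the current (chunk-level) transaction.
        // Spring owns its lifecycle: do not commit or close it here; the
        // transaction manager commits at the chunk boundary.
        Connection connection = DataSourceUtils.getConnection(dataSource);
        return legacyProcess(connection, item);
    }

    // Hypothetical stand-in for the legacy processing call.
    private Object legacyProcess(Connection connection, Object item) {
        return item;
    }
}
```

This keeps the commit boundary where Spring Batch already puts it (end of chunk), instead of managing a second connection manually.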

The step configuration is as follows:

<step id="senExtractGeneratePrintRequestWorkerStep" xmlns="http://www.springframework.org/schema/batch">
    <tasklet>
        <chunk reader="senExtractGeneratePrintRequestWorkerItemReader"
               processor="senExtractGeneratePrintRequestWorkerItemProcessor"
               writer="senExtractGeneratePrintRequestWorkerMultiItemWriter"
               commit-interval="${senExtractGeneratePrintRequestWorkerStep.commit-interval}"
               skip-limit="${senExtractGeneratePrintRequestWorkerStep.skip-limit}">
            <skippable-exception-classes>
                <include class="java.lang.Exception" />
            </skippable-exception-classes>
        </chunk>
        <listeners>
            <listener ref="senExtractGeneratePrintRequestWorkerItemProcessor" />
        </listeners>
    </tasklet>
</step>
<bean id="senExtractGeneratePrintRequestWorkerItemProcessor"
      scope="step"
      class="com.abc.batch.senextract.worker.SENExtractGeneratePrintRequestItemProcessor"/>
    

The connection is closed by the data source before write is called. A screenshot of the call hierarchy is below. [screenshot: call hierarchy]

  • I'm trying to help but your question and your requirement are not clear to me. Why do you need a database connection per chunk? This means if you have 10.000 chunks, a single job will open/close 10.000 connections. Is this really what you are looking for? I don't see a valid use case for that, but if you really need to do it, you need a custom step implementation for that because the chunk-oriented Tasklet implementation provided by default in Spring Batch uses a single database connection. I'm curious about what kind of data processing job you are designing to have this kind of requirements. – Mahmoud Ben Hassine May 27 '21 at 05:18
  • Hi @MahmoudBenHassine, thanks for your reply. The use case here is that we are using a legacy application for processing, which needs a database connection with auto-commit FALSE passed as a primary attribute; the application code internally uses its own way of managing business scenarios. Read and write are managed separately outside of this, in Spring Batch, and we call into the legacy code for processing. So we are trying to find a way to use the same DB connection per step and commit at chunk level. We just solved this by creating – Krishna Kishore May 27 '21 at 11:59
  • We are OK with holding the connection at step level. We solved this by creating a custom object with the DB connection in beforeStep and using the same object in the processor via parameter injection, performing commits in the afterChunk and afterStep listener methods so as not to accumulate too many uncommitted changes and to limit rollback on failure to chunk level. Please share your views on this approach. – Krishna Kishore May 27 '21 at 12:11

0 Answers