In the following, are my writes being commited to the database 1000 at a time -- because my commit interval in my Spring Batch job is set at 1000? The MyBatis SqlSessionFactory is defined as a BATCH execution:
<bean id="sqlSession" class="org.mybatis.spring.SqlSessionTemplate">
<constructor-arg index="0" ref="sqlSessionFactory" />
<constructor-arg index="1" value="BATCH" />
</bean>
<!-- define the SqlSessionFactory -->
<bean id="sqlSessionFactory" class="org.mybatis.spring.SqlSessionFactoryBean">
<property name="dataSource" ref="dataSource" />
<property name="typeAliasesPackage" value="org.my.domain" />
</bean>
I notice in the DEBUG logs:
2015-08-21 22:58:54,632 [main] DEBUG org.mybatis.spring.SqlSessionUtils - Fetched SqlSession [org.apache.ibatis.session.defaults.DefaultSqlSession@174b870c] from current transaction
In the above DEBUG statement I think it is opening the connection. And then below the insert statements are inserted in one batch of 1,000 -- or did it open a new connection to the MS SQL server for each insert?
2015-08-21 22:58:54,632 [main] DEBUG org.mybatis.spring.SqlSessionUtils - Fetched SqlSession [org.apache.ibatis.session.defaults.DefaultSqlSession@174b870c] from current transaction
KickoutMapper.insertKickoutTbl - ==> Parameters: 12143(Long), 10039(Long), 0(Integer), SUBSCRIBER4998(String), .....
2015-08-21 22:58:54,632 [main] DEBUG org.mybatis.spring.SqlSessionUtils - Releasing transactional SqlSession [org.apache.ibatis.session.defaults.DefaultSqlSession@174b870c]
and the end it is closing the transaction (and again I am not sure if it is closing the connection to the MS SQL server or if it was doing that after each insert). Again I'd like to insert 1,000 records in batch, with one connection to the database.
2015-08-21 22:58:55,376 [main] DEBUG org.mybatis.spring.SqlSessionUtils - Releasing transactional SqlSession [org.apache.ibatis.session.defaults.DefaultSqlSession@174b870c]
Also, can the AsyncItemWriter delegate to a JdbcBatchItemWriter instead and would it be faster then the MyBatis I have configured? In my code below, I delegate to a CompositeItemWriter that writes to two separate tables using MyBatis mappers.
My use case is this: I need to read and then validate 7 million records from a mec_mdw database table.
The invalid mec_mdw records, that are deemed invalid by the processor, will be inserted into mec_kickout table. Also, the validation failure reason for that particular record will be inserted into mec_kickout_reason table. The mec_mdw table that I originally read from will also have a few columns updated at the end of processing.
Here is what I have so far using asynchronous processing and writing, currently I'm testing with only 2,500 mdw records that are read, and in this test all 2,500 records are purposely invalid and thus are being inserted into the mec_kickout table, and the original mdw tables 2,500 rows are also being updated. All this is done in roughly 50 seconds, on a 8 core CPU laptop, with 16 Gb of RAM, and a MS SQL server database is a network call away. But still I don't fully understand if it can be done faster.
<job id="mecmdwvalidatorJob" xmlns="http://www.springframework.org/schema/batch">
<step id="mdwvalidatorStep1">
<tasklet>
<chunk reader="pageItemReader" processor="asyncItemProcessor"
writer="asynchItemWriter" commit-interval="1000" skip-limit="2147483647">
<skippable-exception-classes> <!-- TODO -->
<include class="java.lang.Exception" />
</skippable-exception-classes>
</chunk>
</tasklet>
</step>
</job>
<bean id="pageItemReader"
class="org.springframework.batch.item.database.JdbcPagingItemReader">
<property name="dataSource" ref="dataSource" />
<property name="queryProvider">
<bean
class="org.springframework.batch.item.database.support.SqlPagingQueryProviderFactoryBean">
<property name="dataSource" ref="dataSource" />
<property name="selectClause"
value="select MDW_ID,FK_LOG_FILE_ID,TAX_YEAR,SUBS_TYPE_CD,SUB_FIRST_NM,SUB_MIDDLE_NM,SUB_LAST_NM,SUB_SUFFIX,SUB_DOB,SUB_ADDR1,SUB_ADDR2,SUB_CITY,SUB_STATE,SUB_PROVINCE,SUB_ZIP,SUB_ZIP4,SUB_COUNTRY_CD,SUB_COUNTRY,SUB_F_POSTAL_CD,LOB,SUB_SSN,GRP_EMP_NAME1,GRP_EMP_NAME2,GRP_EIN,GRP_ADDR1,GRP_ADDR2,GRP_CITY,GRP_STATE,GRP_PROVINCE,GRP_ZIP,GRP_ZIP4,GRP_COUNTRY_CD,GRP_COUNTRY,GRP_F_POSTAL_CD,ISSUER_NAME1,ISSUER_NAME2,ISSUER_PHONE,ISSUER_ADDR1,ISSUER_ADDR2,ISSUER_CITY,ISSUER_PROVINCE,ISSUER_ZIP,ISSUER_ZIP4,ISSUER_COUNTRY_CD,ISSUER_COUNTRY,ISSUER_F_POSTAL_CD,MEM_FIRST_NM,MEM_MIDDLE_NM,MEM_LAST_NM,MEM_SUFFIX,MEM_SSN,MEM_DOB,MEM_START_DATE,MEM_END_DATE,REGION_CD,SUB_MRN,SUB_MRN_PREFIX,MEM_MRN,MRN_PREFIX,PID,SUB_GRP_ID,SUB_GRP_NAME,INVALID_ADDR_FL" />
<property name="fromClause"
value="from MEC_MDW JOIN MEC_FILE_LOG on MEC_FILE_LOG.LOG_FILE_ID=MEC_MDW.FK_LOG_FILE_ID " />
<property name="whereClause" value="where MEC_FILE_LOG.STATUS=:status" />
<property name="sortKey" value="MDW_ID" />
</bean>
</property>
<property name="parameterValues">
<map>
<entry key="status" value="READY TO VALIDATE" />
</map>
</property>
<property name="pageSize" value="1000" />
<property name="rowMapper" ref="mdwRowMapper" />
</bean>
<bean id="mdwRowMapper" class="org.my.rowmapper.MdwRowMapper" />
<bean id="asyncItemProcessor"
class="org.springframework.batch.integration.async.AsyncItemProcessor">
<property name="delegate">
<bean
class="org.my.itemprocessor.MdwValidatingItemProcessor">
<property name="validator">
<bean
class="org.springframework.validation.beanvalidation.LocalValidatorFactoryBean" />
</property>
</bean>
</property>
<property name="taskExecutor" ref="taskExecutor" />
<!-- <property name="taskExecutor"> -->
<!-- <bean class="org.springframework.core.task.SimpleAsyncTaskExecutor"
/> -->
<!-- </property> -->
</bean>
<task:executor id="taskExecutor" pool-size="10" />
<bean id="asynchItemWriter"
class="org.springframework.batch.integration.async.AsyncItemWriter">
<property name="delegate" ref="customerCompositeWriter">
</property>
</bean>
<bean id="customerCompositeWriter"
class="org.springframework.batch.item.support.CompositeItemWriter">
<property name="delegates">
<list>
<ref bean="itemWriter1" />
<ref bean="itemWriter2" />
</list>
</property>
</bean>
<bean id="itemWriter1" class="org.my.writer.MdwWriter" />
<bean id="itemWriter2" class="org.my.writer.KickoutWriter" />
</beans>
The processor will have the business logic, right now it only has the bean validation looking for null properties in the MecMdw domain object. Eventually, I will also need adapter code to look up account ids in other tables (so more database connectivity joy!). I think this adapter DAO logic will go in the processor
public class MdwValidatingItemProcessor implements ItemProcessor<MecMdw, MecMdw> {
private Validator validator;
public void setValidator(Validator validator) {
this.validator = validator;
}
public MecMdw process(MecMdw item) throws Exception {
BindingResult results = BindAndValidate(item);
if (results.hasErrors()) {
item.setKick_out_fl('Y');
buildValidationException(results,item);
return item;
}
return item;
}
private BindingResult BindAndValidate(MecMdw item) {
DataBinder binder = new DataBinder(item);
binder.setValidator(validator);
binder.validate();
return binder.getBindingResult();
}
private void buildValidationException(BindingResult results, MecMdw item) {
List<String> listOfErrors = new ArrayList<String>();
for (ObjectError error : results.getAllErrors()) {
listOfErrors.add(error.toString());
}
item.setValidationErrors(listOfErrors);
}
The MdwWriter and KickoutWriter are using MyBatis DAO to write to the database.
public class MdwWriter<MecMdw> implements ItemWriter<MecMdw> {
@Autowired
MdwMapper mdwMapper;
@Override
public void write(List<? extends MecMdw> items) throws Exception {
for(MecMdw item : items){
mdwMapper.setMecMdwRecordAsKickOut((org.my.domain.MecMdw) item);
}
}
Here is the KickoutWriter.java
public class KickoutWriter<MecMdw> implements ItemWriter<MecMdw> {
@Autowired
KickoutMapper kickoutMapper;
@Override
public void write(List<? extends MecMdw> items) throws Exception {
for(MecMdw item : items){
kickoutMapper.insertKickoutTbl((org.my.domain.MecMdw) item);
}
}