I am currently using Spring Batch to process CSV and Excel files in the following way:
- Reader (parses the CSV/Excel file and builds POJOs)
- Processor (queries the DB to check whether each record already exists)
- Writer (pushes the POJO to a message queue)
In practice I have 50k+ records to process, and my code takes almost 25 minutes. I want to reduce the processing time by processing the records in parallel.
But I have no clue how to achieve parallel processing with Spring Batch. Can anyone guide me on how to do it, or suggest other ways to improve the processing time?
@Bean
public TaskExecutor taskExecutor() {
    return new SimpleAsyncTaskExecutor("CSV-Async-batch");
}

@Bean(name = "csvjob")
public Job job(JobBuilderFactory jobBuilderFactory,
               StepBuilderFactory stepBuilderFactory,
               ItemReader<List<CSVPojo>> itemReader,
               ItemProcessor<List<CSVPojo>, CsvWrapperPojo> itemProcessor,
               AmqpItemWriter<CsvWrapperPojo> itemWriter) {
    // Chunk-oriented step, run on the task executor with up to 40 concurrent threads
    Step step = stepBuilderFactory.get("ETL-CSV")
            .<List<CSVPojo>, CsvWrapperPojo>chunk(100)
            .reader(itemReader)
            .processor(itemProcessor)
            .writer(itemWriter)
            .taskExecutor(taskExecutor())
            .throttleLimit(40)
            .build();
    Job csvJob = jobBuilderFactory.get("ETL")
            .incrementer(new RunIdIncrementer())
            .start(step)
            .build();
    return csvJob;
}
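As I understand it, a step with a task executor is meant to behave roughly like the plain-Java sketch below: several threads pull items one at a time from a shared, synchronized reader, process them, and write them out in chunks. This is only an illustration of the idea, not Spring Batch code. The names `ParallelChunkSketch`, `SharedReader`, `process` and `run` are hypothetical, and the upper-casing stands in for the real DB lookup.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

class ParallelChunkSketch {

    /** Hands out one record per call; synchronized so worker threads can share it. */
    static final class SharedReader {
        private final Iterator<String> source;

        SharedReader(List<String> records) {
            this.source = records.iterator();
        }

        synchronized String read() {
            return source.hasNext() ? source.next() : null; // null means "no more input"
        }
    }

    /** Hypothetical stand-in for the processor's DB check. */
    static String process(String record) {
        return record.toUpperCase();
    }

    /** Runs `threads` workers; each repeatedly reads, processes and "writes" one chunk. */
    static List<String> run(List<String> records, int chunkSize, int threads) {
        SharedReader reader = new SharedReader(records);
        ConcurrentLinkedQueue<String> written = new ConcurrentLinkedQueue<>();
        ExecutorService pool = Executors.newFixedThreadPool(threads);

        for (int t = 0; t < threads; t++) {
            pool.execute(() -> {
                while (true) {
                    List<String> chunk = new ArrayList<>(chunkSize);
                    String item;
                    while (chunk.size() < chunkSize && (item = reader.read()) != null) {
                        chunk.add(process(item));
                    }
                    if (chunk.isEmpty()) {
                        return; // reader exhausted, this worker is done
                    }
                    written.addAll(chunk); // stand-in for pushing the chunk to the queue
                }
            });
        }

        pool.shutdown();
        try {
            if (!pool.awaitTermination(1, TimeUnit.MINUTES)) {
                throw new IllegalStateException("workers did not finish in time");
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for workers", e);
        }
        return new ArrayList<>(written);
    }
}
```

Calling `ParallelChunkSketch.run(records, 100, 4)` processes every record exactly once, but the output order is not guaranteed because chunks finish in whatever order the threads complete.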
====Reader for SynchronizedItemStreamReader=================
@Component
public class Reader extends SynchronizedItemStreamReader<List<CSVPojo>> {

    public static MultipartFile reqFile = null;
    List<CSVPojo> result = new ArrayList<CSVPojo>();

    @Autowired
    private CSVProcessService csvProcessService;

    public static boolean batchJobState;

    /*public Reader(MultipartFile file){
        this.reqFile=file;
    }*/

    public void setDelegate(ItemStreamReader<List<CSVPojo>> delegate) {
        /*try {
            this.read();
        } catch (UnexpectedInputException e) {
            // TODO Auto-generated catch block
            e.printStackTrace();
        } catch (ParseException e) {
            // TODO Auto-generated catch block
            e.printStackTrace();
        } catch (NonTransientResourceException e) {
            // TODO Auto-generated catch block
            e.printStackTrace();
        } catch (Exception e) {
            // TODO Auto-generated catch block
            e.printStackTrace();
        }*/
    }

    @Override
    public List<CSVPojo> read() throws Exception, UnexpectedInputException,
            ParseException, NonTransientResourceException {
        // TODO Auto-generated method stub
        if (!batchJobState) {
            result = csvProcessService.processCSVFile(reqFile);
            System.out.println("in batch job reader");
            batchJobState = true;
            return result;
        }
        return null;
    }
}
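One thing worth noting about the reader above: `read()` returns the entire parsed list on the first call and `null` afterwards, so the step only ever receives a single item, however many records the file contains. The plain-Java sketch below contrasts that with a reader that hands out one record per call. It is an illustration only, assuming simple `String` records; `PerRecordReaderSketch`, `WholeListReader`, `PerRecordReader` and `countItems` are hypothetical names, not Spring Batch classes.

```java
import java.util.Iterator;
import java.util.List;
import java.util.function.Supplier;

class PerRecordReaderSketch {

    /** Mirrors the reader above: the whole list once, then null. */
    static final class WholeListReader {
        private final List<String> rows;
        private boolean done;

        WholeListReader(List<String> rows) {
            this.rows = rows;
        }

        synchronized List<String> read() {
            if (done) {
                return null;
            }
            done = true;
            return rows;
        }
    }

    /** Alternative: one record per call, null when the input is exhausted. */
    static final class PerRecordReader {
        private final Iterator<String> rows;

        PerRecordReader(List<String> rows) {
            this.rows = rows.iterator();
        }

        synchronized String read() {
            return rows.hasNext() ? rows.next() : null;
        }
    }

    /** Counts how many items a consumer sees before the reader returns null. */
    static int countItems(Supplier<Object> read) {
        int n = 0;
        while (read.get() != null) {
            n++;
        }
        return n;
    }
}
```

With three input rows, `countItems` reports one item for `WholeListReader` and three for `PerRecordReader`. Only the second shape gives a chunk size of 100 and multiple threads something to divide up.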
Thanks in advance!!!