I'm new to Spring Batch and somewhat lost with a requirement I have to fulfill with a batch job, so I was hoping someone could point me in the right direction.
My need is basically this: receive a text file containing multiple lines of fixed-length records, compose those lines into a single POJO, send that POJO to another application via REST (where it is persisted with Spring Data JPA), and, once persistence has finished, have the job write a .txt file with its processing status.
I already have this routine working in plain Java using BeanIO, but I'm required to use Spring Batch here, so some things have to change.
I have a .txt file with the following pattern:
00XXXXX...b
01XXXXX...n
02XXXXX...n
01XXXXX...n
02XXXXX...n
03XXXXX...n
99XXXXX...n
My POJO structure looks like this:
public class POJO {
    private String headerId;      // Data from record 00
    private List<Child> children; // One entry per record 01
    private String trailerId;     // Data from record 99
}

public class Child {
    private String headerId;                 // Data from record 00
    private String childId;                  // Data from record 01
    private String name;                     // More data from record 01
    private ChildAttribute attr;             // Entry for record 02 following record 01
    private AnotherChildAttribute otherAttr; // Entry for record 03 following record 01
}

public class ChildAttribute {
    private String childId; // Data from record 01
    private String name;    // Data from record 02
}
So far, the best I could do in Spring Batch is a single-step job with a FlatFileItemReader that uses a different LineTokenizer for each record type (these will later be replaced by BeanIO), passes the data through the appropriate processor, and then writes the result to another file.
@Bean
@StepScope
public ItemStreamReader<Person> reader(@Value("#{jobParameters['filePath']}") String filePath) throws Exception {
    return new FlatFileItemReaderBuilder<Person>()
            .name("reader")
            .resource(resourceLoader.getResource(filePath))
            .lineMapper(personLineMapper())
            .build();
}

@Bean
public LineMapper<Person> personLineMapper() {
    DefaultLineMapper<Person> mapper = new DefaultLineMapper<>();
    mapper.setLineTokenizer(personLineTokenizer());
    mapper.setFieldSetMapper(new PersonFieldSetMapper());
    return mapper;
}

// Sample; I already have more complex tokenizers implemented
@Bean
public LineTokenizer personLineTokenizer() {
    FixedLengthTokenizer tokenizer = new FixedLengthTokenizer();
    tokenizer.setColumns(new Range[] { new Range(1, 7), new Range(8, 14) });
    tokenizer.setNames(new String[] { "firstName", "lastName" });
    return tokenizer;
}
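For context, a way I've seen to combine per-record tokenizers like the one above is Spring Batch's PatternMatchingCompositeLineMapper, keyed on the two-digit record-type prefix. This is only a sketch; the per-record tokenizer beans and FieldSetMapper classes below are hypothetical placeholders for what I already have:

```java
@Bean
public LineMapper<Object> recordTypeLineMapper() {
    // Routes each line to a tokenizer/FieldSetMapper pair based on its
    // two-digit record-type prefix ("00", "01", ..., "99").
    PatternMatchingCompositeLineMapper<Object> mapper = new PatternMatchingCompositeLineMapper<>();

    Map<String, LineTokenizer> tokenizers = new HashMap<>();
    tokenizers.put("00*", headerTokenizer());         // hypothetical beans,
    tokenizers.put("01*", childTokenizer());          // one per record type
    tokenizers.put("02*", childAttributeTokenizer());
    tokenizers.put("99*", trailerTokenizer());
    mapper.setTokenizers(tokenizers);

    Map<String, FieldSetMapper<Object>> fieldSetMappers = new HashMap<>();
    fieldSetMappers.put("00*", new HeaderFieldSetMapper());         // hypothetical
    fieldSetMappers.put("01*", new ChildFieldSetMapper());          // mappers, one
    fieldSetMappers.put("02*", new ChildAttributeFieldSetMapper()); // per record type
    fieldSetMappers.put("99*", new TrailerFieldSetMapper());
    mapper.setFieldSetMappers(fieldSetMappers);

    return mapper;
}
```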
@Bean
public ItemProcessor<Person, Person> processor() {
    return new PersonItemProcessor();
}

@Bean
public ItemWriter<Person> writer() {
    /* Writer */
}

@Bean
public Job ingestJob() throws Exception {
    return jobBuilderFactory.get("ingestJob")
            .incrementer(new RunIdIncrementer())
            .flow(step1())
            .end()
            .build();
}

@Bean
public Step step1() throws Exception {
    return stepBuilderFactory.get("ingest")
            .<Person, Person>chunk(10)
            .reader(reader(null))
            .processor(processor())
            .writer(writer())
            .build();
}
So what I need here is, instead of reading, processing, and writing line by line, to build up my POJO as each line is read, and only call the persistence layer after the last line has been read.
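Conceptually, the grouping I'm after looks like this in plain Java. This is a simplified, self-contained sketch with the record layout reduced to the two-digit type plus an opaque payload, and the 02/03 attribute records left out; all names here are made up:

```java
import java.util.ArrayList;
import java.util.List;

// Simplified sketch: group flat records (header 00, children 01, trailer 99)
// into one aggregate object, emitted only after the trailer line is seen.
public class RecordAggregator {

    public static class Aggregate {
        public String headerId;
        public final List<String> children = new ArrayList<>();
        public String trailerId;
    }

    /** Returns the completed aggregate, or null if no trailer record was found. */
    public static Aggregate aggregate(List<String> lines) {
        Aggregate current = new Aggregate();
        for (String line : lines) {
            String type = line.substring(0, 2);    // two-digit record-type prefix
            String payload = line.substring(2);
            switch (type) {
                case "00": current.headerId = payload; break;    // header opens the group
                case "01": current.children.add(payload); break; // each 01 adds a child
                case "99": current.trailerId = payload;          // trailer closes the group
                           return current;
                default:   break; // 02/03 attribute records omitted in this sketch
            }
        }
        return null; // incomplete file: no trailer record
    }
}
```

In Spring Batch terms, I assume this logic would live in a custom ItemReader that keeps calling the delegate FlatFileItemReader until the trailer record appears, then returns the whole aggregate as one item, but I'm not sure that's the idiomatic way.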
What's the best approach to achieve this? Any sample code you have would be really appreciated!
Thanks for your attention.
Best regards, Enrico