
I'm new to Spring Batch and somewhat lost with a requirement I have to fulfill with a batch job, and I was wondering if any of you could enlighten me on how to do things properly in this case.

My need is basically this: receive a text file containing multiple lines of fixed-length fields, compose those lines into a single POJO, send that POJO to another application via REST to be persisted with Spring Data JPA, and, once persistence has finished, have the job write a .txt file with its processing status.

I already have this routine working as a plain Java implementation using BeanIO, but I am required to use Spring Batch here, so some things have to change.

I have a .txt file with the following pattern:

00XXXXX...b
01XXXXX...n
02XXXXX...n
01XXXXX...n
02XXXXX...n
03XXXXX...n
99XXXXX...n

And think of my POJO structure as:

public class POJO {

    private String headerId; // Data from record 00
    private List<Child> children; // Every entry for record 01
    private String trailerId; // Data from record 99
}

public class Child {
    private String headerId; // Data from record 00
    private String childId; // Data from record 01
    private String name; // Another data from record 01
    private ChildAttribute attr; // Entry for record 02 following record 01
    private AnotherChildAttribute anotherAttr; // Entry for record 03 following record 01
}

public class ChildAttribute {
    private String childId; // Data from record 01
    private String name; // Data from record 02
}

Now, the best I could do so far in Spring Batch is a single-step job with a FlatFileItemReader that uses a different LineTokenizer for each type of record (these will later be replaced by BeanIO), passes the data through the proper processor, and then writes the result out to another file.

@Bean
@StepScope
public ItemStreamReader<Person> reader(@Value("#{jobParameters['filePath']}") String filePath) throws Exception {
    return new FlatFileItemReaderBuilder<Person>()
            .name("reader")
            .resource(resourceLoader.getResource(filePath))
            .lineMapper(personLineMapper())
            .build();
}

@Bean
public LineMapper<Person> personLineMapper() {
    DefaultLineMapper<Person> mapper = new DefaultLineMapper<>();
    mapper.setLineTokenizer(personLineTokenizer());
    mapper.setFieldSetMapper(new PersonFieldSetMapper());
    return mapper;
}

// Sample; I already have more complex tokenizers implemented
@Bean
public LineTokenizer personLineTokenizer() {
    FixedLengthTokenizer tokenizer = new FixedLengthTokenizer();
    tokenizer.setColumns(new Range[] { new Range(1, 7), new Range(8, 14) });
    tokenizer.setNames(new String[] { "firstName", "lastName" });
    return tokenizer;
}


@Bean
public ItemProcessor<Person, Person> processor() {
    return new PersonItemProcessor();
}

@Bean
public ItemWriter<Person> writer() {
    /* Writer */
}
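
// For reference, what I have in mind for the eventual writer is one REST
// call per aggregated POJO, roughly like the sketch below. The endpoint
// URL and the RestTemplate wiring are placeholders of mine, not real code
// from the other application; that application persists the payload with
// Spring Data JPA on its side.
@Bean
public ItemWriter<POJO> restWriter(RestTemplate restTemplate) {
    return items -> {
        for (POJO pojo : items) {
            // one POST per fully-assembled POJO
            restTemplate.postForEntity("http://other-app/api/pojos", pojo, Void.class);
        }
    };
}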

@Bean
public Job ingestJob() throws Exception {
    return jobBuilderFactory.get("ingestJob")
        .incrementer(new RunIdIncrementer())
        .flow(step1())
        .end()
        .build();
}

@Bean
public Step step1() throws Exception {
    return stepBuilderFactory.get("ingest")
        .<Person, Person>chunk(10)
        .reader(reader(null))
        .processor(processor())
        .writer(writer())
        .build();
}
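
For completeness, this is roughly how I'm composing the per-record-type tokenizers into one LineMapper, using Spring Batch's PatternMatchingCompositeLineMapper. The Record marker interface and the individual tokenizer/mapper beans here are placeholder names of mine, not my real implementations:

@Bean
public LineMapper<Record> compositeLineMapper() {
    PatternMatchingCompositeLineMapper<Record> mapper =
            new PatternMatchingCompositeLineMapper<>();

    // route each line to a tokenizer by its two-character record type
    Map<String, LineTokenizer> tokenizers = new HashMap<>();
    tokenizers.put("00*", headerTokenizer());
    tokenizers.put("01*", childTokenizer());
    tokenizers.put("02*", childAttributeTokenizer());
    tokenizers.put("03*", anotherChildAttributeTokenizer());
    tokenizers.put("99*", trailerTokenizer());
    mapper.setTokenizers(tokenizers);

    // each FieldSetMapper builds the record object for its type
    Map<String, FieldSetMapper<Record>> fieldSetMappers = new HashMap<>();
    fieldSetMappers.put("00*", new HeaderFieldSetMapper());
    fieldSetMappers.put("01*", new ChildFieldSetMapper());
    fieldSetMappers.put("02*", new ChildAttributeFieldSetMapper());
    fieldSetMappers.put("03*", new AnotherChildAttributeFieldSetMapper());
    fieldSetMappers.put("99*", new TrailerFieldSetMapper());
    mapper.setFieldSetMappers(fieldSetMappers);

    return mapper;
}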

So what I need here is, instead of reading, processing and writing line by line, to build up my POJO as I read each line, and only call the persistence layer after the last line has been read.
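
To make it concrete, here's a rough, untested sketch of what I picture the aggregating reader doing. The HeaderRecord/ChildRecord/AttributeRecord/TrailerRecord types (implementing the Record interface from the sketch above) and the toChild(...)/toChildAttribute(...) helpers are hypothetical names of mine, and null/ordering checks are left out for brevity:

import java.util.ArrayList;

import org.springframework.batch.item.ExecutionContext;
import org.springframework.batch.item.ItemStreamReader;

public class AggregatingPojoReader implements ItemStreamReader<POJO> {

    private final ItemStreamReader<Record> lineReader; // e.g. the FlatFileItemReader above

    public AggregatingPojoReader(ItemStreamReader<Record> lineReader) {
        this.lineReader = lineReader;
    }

    @Override
    public POJO read() throws Exception {
        Record line;
        POJO pojo = null;
        Child lastChild = null;
        while ((line = lineReader.read()) != null) {
            if (line instanceof HeaderRecord) {            // record 00
                pojo = new POJO();
                pojo.setChildren(new ArrayList<>());
                pojo.setHeaderId(((HeaderRecord) line).getId());
            } else if (line instanceof ChildRecord) {      // record 01
                lastChild = toChild((ChildRecord) line);   // hypothetical mapping helper
                pojo.getChildren().add(lastChild);
            } else if (line instanceof AttributeRecord) {  // records 02/03
                lastChild.setAttr(toChildAttribute((AttributeRecord) line));
            } else if (line instanceof TrailerRecord) {    // record 99
                pojo.setTrailerId(((TrailerRecord) line).getId());
                return pojo; // one complete POJO per file
            }
        }
        return null; // no more input, so the step ends
    }

    // delegate the stream callbacks so restartability still works
    @Override
    public void open(ExecutionContext executionContext) { lineReader.open(executionContext); }

    @Override
    public void update(ExecutionContext executionContext) { lineReader.update(executionContext); }

    @Override
    public void close() { lineReader.close(); }
}

I suppose the step would then be typed <POJO, POJO> with a chunk size of 1, but I'm not sure this is the idiomatic way.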

What's the best approach to achieve this? Also, any piece of code you have will be really appreciated!

Thanks for your attention.

Best regards, Enrico

  • BeanIO supports Spring Batch (http://beanio.org/2.1/docs/reference/index.html#SpringBatch); maybe you can easily reuse your old code – Luca Basso Ricci Mar 19 '18 at 12:58
  • Thanks for the hint, Luca. That will save me some time. But still, having each line read and structured isn't my main concern. I am still clueless about how to compose the ONE parent object from the content of all the upcoming lines. – Enrico Bergamo Mar 19 '18 at 13:07
  • I used BeanIO just a bit, but I'm pretty sure you can configure BeanIO to aggregate flat records into compound objects automatically (BeanIO supports record grouping). Otherwise you have to write a custom item reader and aggregate the data by hand (as described in https://stackoverflow.com/questions/35049613/spring-batch-processing-multiple-record-at-once) – Luca Basso Ricci Mar 19 '18 at 13:23
  • Thanks! I've already seen this answer of yours before and completely forgot it. I'll try aggregating with BeanIO first though. Let me just ask you this to make sure I can test it properly: since I can get different POJOs depending on the record I'm reading, both Reader/Processor/Writer should be typed <Object>, correct? – Enrico Bergamo Mar 19 '18 at 13:31
  • Your intent is to aggregate lines into a single object of type POJO, so reader/processor/writer should be typed <POJO> – Luca Basso Ricci Mar 19 '18 at 13:37
  • Luca, after further research, it seems a BeanIO segment can only aggregate collections within a single row. So I won't be able to read the first line and aggregate the upcoming lines as child objects without writing the assignment code myself. I also couldn't manage to get BeanIO and Batch running together, due to a NoSuchMethodError. – Enrico Bergamo Mar 19 '18 at 18:19
  • Also, I'm trying out the AggregateItemReader and I don't think I understand how to fit it around (or replace) my FlatFileItemReader. Have you ever implemented this feature? – Enrico Bergamo Mar 19 '18 at 19:09
