
I am new to Spring Batch, so I am trying to solve some problems to learn more about it. However, I am stuck on one of them.

Imagine two data sources from different suppliers describing the same thing (for example, their stockpile) in different CSV formats. I therefore use two different readers to unify the data into a common object, Product. Then I have to accumulate all Products by name and merge the available stock for each product. The export is a single CSV report containing all available products with their available stock numbers.

How should I partition my problem for Spring Batch and process all elements?

Thanks in advance.

Fritz Duchardt
calaedo
  • You should add a few more details about the two data sources: is it a DB vs. a flat file, or two DBs? What do you mean by "merge the data"? (Retrieve it from the second/first data source and add it to the other one?) What's in the output? Merged and flagged data? It doesn't sound complicated, but having more details would be nice. – Asoub Jan 25 '16 at 12:42
  • Please be more specific about your problem and post one question at a time; this question is eligible to be closed soon because it doesn't respect basic SO posting principles. – Luca Basso Ricci Jan 25 '16 at 12:45
  • @LucaBassoRicci Sorry, I have made my question more concrete. – calaedo Jan 25 '16 at 12:57
  • @Asoub: Well, it was meant to be more abstract, but I have now made my question more concrete. – calaedo Jan 25 '16 at 13:00

1 Answer


Steps in your process:
1) Create a table in an embedded (in-memory) database.
2) The first step should truncate this table. You can define a tasklet for this in your job:

    <batch:step id="truncateTempTableFrOrder" next="readWriteDataOfSource1">
        <batch:tasklet ref="truncateTempTableTasklet" />
    </batch:step>

3) The next two steps should just fetch the data from the two data sources and write it into the temp table. Each of the two steps can be configured as below (the second step, readWriteDataOfSource2, looks the same with its own reader):

    <batch:step id="readWriteDataOfSource1" next="readWriteDataOfSource2">
        <batch:tasklet>
            <batch:chunk reader="dataReader" writer="dataWriter"
                commit-interval="100" />
        </batch:tasklet>
    </batch:step>

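Use org.springframework.batch.item.database.JdbcCursorItemReader for reading data from database sources (or org.springframework.batch.item.file.FlatFileItemReader if the sources are CSV files, as in the question) and org.springframework.batch.item.database.JdbcBatchItemWriter for writing to the temp table.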

4) Your last step reads the data from the temp table using the JDBC reader mentioned in the previous step and writes it out using org.springframework.batch.item.file.FlatFileItemWriter.

You can perform the data processing in the select query that fetches the data from the temp table. For how to configure readers and writers in Spring Batch, refer to any good tutorial or to the book Spring Batch in Action.
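As a rough illustration of the aggregation such a select query would perform, here is a minimal plain-Java sketch that sums the stock per product name. It does not use Spring Batch; the class StockReport, the Product fields, and the table and column names in the comment are assumptions made up for this example, not part of the original setup.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Plain-Java sketch of the aggregation the final step's SELECT could perform.
// Hypothetical equivalent query (table and column names are assumptions):
//   SELECT name, SUM(stock) AS stock FROM temp_product GROUP BY name ORDER BY name
class StockReport {

    // Common object that rows from both suppliers are mapped to (fields are assumptions).
    static final class Product {
        final String name;
        final int stock;

        Product(String name, int stock) {
            this.name = name;
            this.stock = stock;
        }
    }

    // Sums the stock per product name; the TreeMap keeps the names sorted.
    static Map<String, Integer> mergeByName(List<Product> products) {
        Map<String, Integer> totals = new TreeMap<>();
        for (Product p : products) {
            totals.merge(p.name, p.stock, Integer::sum);
        }
        return totals;
    }

    // Renders one "name,stock" CSV line per product.
    static List<String> toCsvLines(Map<String, Integer> totals) {
        List<String> lines = new ArrayList<>();
        for (Map.Entry<String, Integer> e : totals.entrySet()) {
            lines.add(e.getKey() + "," + e.getValue());
        }
        return lines;
    }

    public static void main(String[] args) {
        List<Product> rows = new ArrayList<>();
        rows.add(new Product("screw", 100)); // from supplier 1
        rows.add(new Product("nail", 40));   // from supplier 1
        rows.add(new Product("screw", 25));  // from supplier 2
        for (String line : toCsvLines(mergeByName(rows))) {
            System.out.println(line);
        }
    }
}
```

With the sample rows in main, this prints nail,40 and screw,125. In the actual job, the database would do this grouping, so the reader of the last step only sees one row per product.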

Amit Bhati
  • You can read http://stackoverflow.com/questions/21304364/spring-batch-job-read-from-multiple-database to join multiple readers in one step – Luca Basso Ricci Jan 25 '16 at 14:59
  • Yeah, using a database to gather all the data in one place sounds good. Then you just have to fetch the data ordered by name and check whether the currently processed item has the same name as the last one (keep a reference to the last item in the processor, or use the step execution context for that if you're also using chunks). You should also check whether that's the last item you're processing. – Asoub Jan 26 '16 at 12:38
  • @Asoub, I have solved the _merging_ step via an SQL command, run as a tasklet. Thanks for your help! – calaedo Jan 26 '16 at 13:26
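
The alternative Asoub describes in the comments, reading rows ordered by name and comparing each item with the previous one, can be sketched in plain Java roughly as follows. This is only an illustration under assumptions: the class SortedStockMerger and the {name, stock} row layout are hypothetical, and in a real job this logic would live in a processor or writer rather than a static method.

```java
import java.util.ArrayList;
import java.util.List;

// Sketch of merging rows that are already sorted by name (e.g. via ORDER BY name).
// Only the previous name and a running total are kept in memory.
class SortedStockMerger {

    // Each row is a {name, stock} pair; the input must be sorted by name.
    static List<String> merge(List<String[]> sortedRows) {
        List<String> out = new ArrayList<>();
        String lastName = null;
        int total = 0;
        for (String[] row : sortedRows) {
            String name = row[0];
            int stock = Integer.parseInt(row[1]);
            // A new name means the previous product is complete: emit it.
            if (lastName != null && !lastName.equals(name)) {
                out.add(lastName + "," + total);
                total = 0;
            }
            lastName = name;
            total += stock;
        }
        // Emit the final product, which has no successor to trigger the flush.
        if (lastName != null) {
            out.add(lastName + "," + total);
        }
        return out;
    }
}
```

The explicit flush after the loop corresponds to the "check whether that's the last item" remark: without it, the last product would never be written.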