We have been using Spring Batch for the use cases below (a minimal sketch of the first style of job follows the list):
- Read data from a file, process it, and write to a target database (the batch job kicks off when the file arrives)
- Read data from a remote database, process it, and write to a target database (runs on a scheduled interval, triggered by Autosys)
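For context, here is a rough sketch of what the file-driven job looks like today, assuming Spring Batch 4 builders and a hypothetical `FeedRecord` POJO with id/name/amount getters and setters; the file path, column names, SQL, and chunk size are illustrative only:

```java
import javax.sql.DataSource;

import org.springframework.batch.core.Job;
import org.springframework.batch.core.Step;
import org.springframework.batch.core.configuration.annotation.EnableBatchProcessing;
import org.springframework.batch.core.configuration.annotation.JobBuilderFactory;
import org.springframework.batch.core.configuration.annotation.StepBuilderFactory;
import org.springframework.batch.item.ItemProcessor;
import org.springframework.batch.item.database.JdbcBatchItemWriter;
import org.springframework.batch.item.database.builder.JdbcBatchItemWriterBuilder;
import org.springframework.batch.item.file.FlatFileItemReader;
import org.springframework.batch.item.file.builder.FlatFileItemReaderBuilder;
import org.springframework.batch.item.file.mapping.BeanWrapperFieldSetMapper;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;
import org.springframework.core.io.FileSystemResource;

@Configuration
@EnableBatchProcessing
public class FileLoadJobConfig {

    // Reads the inbound delimited file; path and column names are placeholders.
    @Bean
    public FlatFileItemReader<FeedRecord> reader() {
        return new FlatFileItemReaderBuilder<FeedRecord>()
                .name("fileReader")
                .resource(new FileSystemResource("/data/inbound/feed.dat"))
                .delimited()
                .names(new String[] {"id", "name", "amount"})
                .fieldSetMapper(new BeanWrapperFieldSetMapper<FeedRecord>() {{
                    setTargetType(FeedRecord.class);
                }})
                .build();
    }

    // Stand-in for the real per-record transformation/validation.
    @Bean
    public ItemProcessor<FeedRecord, FeedRecord> processor() {
        return record -> record;
    }

    // Writes processed records to the target database in batched inserts.
    @Bean
    public JdbcBatchItemWriter<FeedRecord> writer(DataSource dataSource) {
        return new JdbcBatchItemWriterBuilder<FeedRecord>()
                .dataSource(dataSource)
                .sql("INSERT INTO target_table (id, name, amount) VALUES (:id, :name, :amount)")
                .beanMapped()
                .build();
    }

    // Single chunk-oriented step: read, transform, write in chunks.
    @Bean
    public Job fileLoadJob(JobBuilderFactory jobs, StepBuilderFactory steps,
                           FlatFileItemReader<FeedRecord> reader,
                           ItemProcessor<FeedRecord, FeedRecord> processor,
                           JdbcBatchItemWriter<FeedRecord> writer) {
        Step loadStep = steps.get("loadStep")
                .<FeedRecord, FeedRecord>chunk(1000)  // chunk size is a tuning knob for 1MM-20MM rows
                .reader(reader)
                .processor(processor)
                .writer(writer)
                .build();
        return jobs.get("fileLoadJob").start(loadStep).build();
    }
}
```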
With the plan to move all online apps to Spring Boot microservices on PCF, we are looking at doing a similar exercise on the batch side if it adds value.
In the new world, the Spring Cloud Task wrapping the batch job will read the file from S3-compatible object storage (ECSS3).
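A sketch of how the reader above could point at the object store instead of a local file, assuming ECSS3 exposes the standard S3 API and the AWS SDK v1 client is used; the endpoint, credentials, bucket/key job parameters, and field mapping are all placeholders, not the actual setup:

```java
import org.springframework.batch.core.configuration.annotation.StepScope;
import org.springframework.batch.item.file.FlatFileItemReader;
import org.springframework.batch.item.file.builder.FlatFileItemReaderBuilder;
import org.springframework.batch.item.file.mapping.BeanWrapperFieldSetMapper;
import org.springframework.beans.factory.annotation.Value;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;
import org.springframework.core.io.InputStreamResource;

import com.amazonaws.auth.AWSStaticCredentialsProvider;
import com.amazonaws.auth.BasicAWSCredentials;
import com.amazonaws.client.builder.AwsClientBuilder;
import com.amazonaws.services.s3.AmazonS3;
import com.amazonaws.services.s3.AmazonS3ClientBuilder;

@Configuration
public class S3ReaderConfig {

    // S3 client pointed at the ECS endpoint; endpoint, region, and credentials are placeholders.
    @Bean
    public AmazonS3 ecsS3Client() {
        return AmazonS3ClientBuilder.standard()
                .withEndpointConfiguration(
                        new AwsClientBuilder.EndpointConfiguration("https://ecs.example.com", "us-east-1"))
                .withCredentials(new AWSStaticCredentialsProvider(
                        new BasicAWSCredentials("accessKey", "secretKey")))
                .withPathStyleAccessEnabled(true)  // object stores often require path-style URLs
                .build();
    }

    // Step-scoped reader: the bucket/key arrive as job parameters from the launch request,
    // and the object's input stream is wrapped as a Spring Resource.
    @Bean
    @StepScope
    public FlatFileItemReader<FeedRecord> s3Reader(AmazonS3 ecsS3Client,
            @Value("#{jobParameters['bucket']}") String bucket,
            @Value("#{jobParameters['key']}") String key) {
        return new FlatFileItemReaderBuilder<FeedRecord>()
                .name("s3Reader")
                .resource(new InputStreamResource(
                        ecsS3Client.getObject(bucket, key).getObjectContent()))
                .delimited()
                .names(new String[] {"id", "name", "amount"})
                .fieldSetMapper(new BeanWrapperFieldSetMapper<FeedRecord>() {{
                    setTargetType(FeedRecord.class);
                }})
                .build();
    }
}
```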
I am looking for a good design here (staying away from too many pipes/filters and orchestration if possible); the input data ranges from 1MM to 20MM records.
- ECSS3 will notify on file arrival by sending an HTTP request; the workflow would be: Spring Cloud Stream HTTP source -> launch a Spring Cloud Task (batch job) that reads the file from the object store, processes it, and saves the records to the target database
- A Spring Cloud Task (batch job) triggered from the PCF Scheduler to read from the remote database, process, and save to the target database (a sketch of this task wrapper follows the list)
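For reference, the task wrapper itself is thin; a minimal sketch of the Spring Cloud Task boilerplate that would let the PCF Scheduler (or an SCDF-issued launch request) run the job as a short-lived app, with the class name and example job parameters purely illustrative:

```java
import org.springframework.batch.core.configuration.annotation.EnableBatchProcessing;
import org.springframework.boot.SpringApplication;
import org.springframework.boot.autoconfigure.SpringBootApplication;
import org.springframework.cloud.task.configuration.EnableTask;

// Boot app that runs the batch job once as a short-lived task and exits;
// this is what the PCF Scheduler or a task launcher would start.
@SpringBootApplication
@EnableTask
@EnableBatchProcessing
public class BatchTaskApplication {

    public static void main(String[] args) {
        // Job parameters from the trigger (e.g. bucket=inbound key=feed-20190101.dat)
        // arrive as command-line arguments; the app exits with the job's exit code.
        System.exit(SpringApplication.exit(SpringApplication.run(BatchTaskApplication.class, args)));
    }
}
```

As far as I can tell, Spring Cloud Task mainly adds the task-execution tracking repository and exit-code handling around the same batch job.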
With the above design, I don't see the value of wrapping the Spring Batch job in a Spring Cloud Task and running it on PCF with Spring Cloud Data Flow.
Am I missing something here? Are PCF and Spring Cloud Data Flow overkill in this case?