
We have been using Spring Batch for the following use cases:

  1. Read data from a file, process it, and write to the target database (the batch kicks off when the file arrives)
  2. Read data from a remote database, process it, and write to the target database (runs on a scheduled interval, triggered by Autosys)

With the plan to move all online apps to Spring Boot microservices and PCF, we are looking at doing a similar exercise on the batch side if it adds value.

In the new world, the Spring Cloud Task batch job will read the file from S3 storage (ECSS3).
I am looking for a good design here (staying away from too many pipes/filters and orchestration if possible). The input data ranges from 1MM to 20MM records.

  1. ECSS3 will notify on file arrival by sending an HTTP request. The workflow would be: Spring Cloud Stream http source -> launch a Spring Cloud Task batch job that reads from the object store, processes the records, and saves them to the target database
  2. A Spring Cloud Task job, triggered from PCF Scheduler, reads from the remote database, processes the records, and saves them to the target database
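Both workflows above reduce to the same read, process, write loop, and with 1MM to 20MM records the usual approach is to write in fixed-size chunks rather than hold everything in memory. Below is a minimal plain-Java sketch of that chunk-oriented pattern. It is not the Spring Batch API; the class name `ChunkSketch`, the `run` method, and the chunk size are hypothetical and chosen only for illustration.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.function.Consumer;
import java.util.function.Function;

// Hypothetical illustration of chunk-oriented processing; not Spring Batch code.
public class ChunkSketch {

    /**
     * Pulls items from the reader one at a time, transforms each with the
     * processor, and hands them to the writer in chunks of at most chunkSize.
     * A processor result of null is treated as "skip this record".
     * Returns the number of records handed to the writer.
     */
    public static <I, O> long run(Iterator<I> reader,
                                  Function<I, O> processor,
                                  Consumer<List<O>> writer,
                                  int chunkSize) {
        if (chunkSize <= 0) {
            throw new IllegalArgumentException("chunkSize must be positive");
        }
        long written = 0;
        List<O> chunk = new ArrayList<>(chunkSize);
        while (reader.hasNext()) {
            O out = processor.apply(reader.next());
            if (out != null) {
                chunk.add(out);
            }
            if (chunk.size() == chunkSize) {
                // Pass a copy so the writer can keep the list safely.
                writer.accept(new ArrayList<>(chunk));
                written += chunk.size();
                chunk.clear();
            }
        }
        if (!chunk.isEmpty()) {
            // Flush the final, possibly smaller, chunk.
            writer.accept(new ArrayList<>(chunk));
            written += chunk.size();
        }
        return written;
    }

    public static void main(String[] args) {
        // Stand-ins for the file lines and the target database write.
        List<String> lines = List.of("a", "b", "c", "d", "e");
        List<Integer> chunkSizes = new ArrayList<>();
        Function<String, String> processor = String::toUpperCase;
        Consumer<List<String>> writer = c -> chunkSizes.add(c.size());

        long n = run(lines.iterator(), processor, writer, 2);
        System.out.println(n + " records written in chunks of " + chunkSizes);
    }
}
```

The reader is an `Iterator` so the input can be streamed from a file or a database cursor; only one chunk is held in memory at a time, which is the property that matters at the record counts mentioned above.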

With the above design, I don't see the value of wrapping the Spring Batch job in a Spring Cloud Task and running it on PCF with Spring Cloud Data Flow.

Am I missing something here? Is PCF/Spring Cloud Data Flow overkill in this case?

Nitty

1 Answer


Orchestrating batch jobs in a cloud setting could bring new benefits to the solution. For instance, the resiliency model that PCF supports could be useful. Spring Cloud Task (SCT) applications typically run in a short-lived container; if the container goes down, PCF will bring it back up and run the task in it.

Both of the options listed above are feasible; the choice comes down to how frequently you need to process the incoming data. Whether it really has to be real-time or can happily run on a schedule is something you'd have to determine to make the decision.

As for the applicability of Spring Cloud Data Flow (SCDF) + PCF, again, it comes down to your business requirements. You may not be using it now, but Spring Batch Admin is EOL in favor of SCDF's Dashboard. The following questions might help realize the SCDF + SCT value proposition.

Do you have to monitor the overall status, progress, and health of the batch jobs? Maybe you have requirements to assemble multiple batch jobs as a DAG? How about visually composing a series of Tasks and orchestrating them entirely from the Dashboard?

Also, when the batch jobs are used together with SCT, SCDF, and PCF Scheduler, you'd get the benefit of monitoring all of this from the PCF Apps Manager.

Sabby Anandan