
I am working on remote partitioning with Spring Batch. I have multiple instances of a Spring Batch application running. As part of my use case, I need to process all the records present in a database, so I am assigning an application instance to every record. That way, each instance of the batch application processes only the records assigned to it.

e.g.

Records1 -> instance1
Records2 -> instance1
Records3 -> instance2
Records4 -> instance2
... and so on

I know that remote partitioning is normally done over Kafka or JMS, but in my use case I don't want to use any messaging middleware. After assigning an instance to each record, I want to invoke both instances. How can I do that?

Mahmoud Ben Hassine

1 Answer


You still need a messaging middleware to implement remote partitioning. This is required for the manager step to send StepExecutionRequests to worker steps.

If you really want to avoid having a messaging middleware, you can create a job instance per record set. According to your example, this could be designed as follows:

  • Job instance 1 -> Records1 and Records2
  • Job instance 2 -> Records3 and Records4
  • etc

Those job instances can be executed locally in different threads of the same JVM, locally in different JVMs on the same machine, or remotely on different machines in a cluster. The degree of parallelism depends only on the resources available in your cluster. On a single machine, you can have several JVMs, each running several job instances in parallel. You could then duplicate that setup on other machines and achieve a high level of parallelism for your requirement.
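
As a rough illustration of the first option (several record sets processed in different threads of the same JVM), here is a minimal sketch in plain Java. It does not use any Spring Batch API: `ParallelRecordSets`, `processRange` and `runAll` are hypothetical names, and `processRange` is only a stand-in for launching one job instance for a record set. In a real application, each task would presumably launch the job with that record set's identifying job parameters instead.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical sketch: one task per record set, all running in the same JVM.
class ParallelRecordSets {

    // Stand-in for one job instance: "processes" every key in [start, end]
    // and returns the number of records it handled.
    static long processRange(long start, long end) {
        long processed = 0;
        for (long key = start; key <= end; key++) {
            processed++; // real work on the record would go here
        }
        return processed;
    }

    // Submits one task per {start, end} pair to a fixed-size thread pool
    // and returns the total number of records processed by all tasks.
    static long runAll(long[][] ranges, int threads) {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            List<Future<Long>> futures = new ArrayList<>();
            for (long[] range : ranges) {
                final long start = range[0];
                final long end = range[1];
                Callable<Long> task = () -> processRange(start, end);
                futures.add(pool.submit(task));
            }
            long total = 0;
            for (Future<Long> future : futures) {
                total += future.get(); // blocks until that record set is done
            }
            return total;
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException("A record set failed", e);
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        // Two distinct record sets, mirroring the two job instances above.
        long[][] ranges = { {1, 1000}, {1001, 2000} };
        System.out.println("records processed: " + runAll(ranges, 2));
    }
}
```

The same partitioning of work carries over to the other two options: instead of threads in one pool, each record set would be handled by a separate JVM or a separate machine.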

Mahmoud Ben Hassine
  • How can we invoke a specific application's job instance? Is there any reference? We can get all the instances from Eureka. –  Aug 17 '22 at 14:11
  • Do not confuse a job instance with an application instance. What I mean is that you can run different job instances in parallel. A job instance is identified by a set of identifying job parameters, please check the docs here: [Batch Domain Language](https://docs.spring.io/spring-batch/docs/current/reference/html/domain.html#jobinstance). In your case, this could be the way you identify `RecordsX` (interval of keys, table partitions, etc). – Mahmoud Ben Hassine Aug 18 '22 at 08:32
  • @mahamoud Could you please share a reference for creating job instances per record set, and for running those job instances in different JVMs (application cluster)? –  Sep 02 '22 at 05:32
  • I don't have any specific example, but the idea is that you need a way to specify distinct record sets and use that as an identifying job parameter, which will in turn create different job instances. An example would be: `java -jar myjob.jar start=1 end=1000`, `java -jar myjob.jar start=1001 end=2000`, etc. – Mahmoud Ben Hassine Sep 02 '22 at 06:50
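
To make the last comment concrete, the following is a small sketch of how the distinct `start`/`end` parameter sets could be derived from a key interval. It is plain Java and not part of Spring Batch; `RecordRangeSplitter` and `split` are hypothetical names, and `myjob.jar` is simply the jar name used in the comment above. It only prints the command lines; actually starting them on different JVMs or machines is left to whatever launcher you use.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: derives one "start=<a> end=<b>" parameter set per record set.
class RecordRangeSplitter {

    // Splits the inclusive key interval [first, last] into chunks of at most
    // `size` keys. Each chunk is a distinct set of identifying parameters,
    // so each one would correspond to a different job instance.
    static List<String> split(long first, long last, long size) {
        if (size <= 0) {
            throw new IllegalArgumentException("size must be positive");
        }
        List<String> params = new ArrayList<>();
        for (long start = first; start <= last; start += size) {
            long end = Math.min(start + size - 1, last);
            params.add("start=" + start + " end=" + end);
        }
        return params;
    }

    public static void main(String[] args) {
        // Print one launch command per record set.
        for (String p : split(1, 2500, 1000)) {
            System.out.println("java -jar myjob.jar " + p);
        }
    }
}
```

Because every command carries a different `start`/`end` pair, no two launches share the same identifying parameters, which is what keeps the job instances distinct.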