
I am trying to understand the "multi-threaded step", which is one of the ways to implement parallel processing in Spring Batch, alongside parallel steps and partitioning.

  • My question is about the reader. For example, assume a file with 1000 records, a chunk size of 100, and 4 threads. In this case there will be 10 chunks; each thread is given a chunk to start with, and when it finishes, the remaining chunks are assigned to the threads, so at any point in time at most 4 chunks are being processed by the 4 threads. But how do the threads decide which data to read? If the first thread is already working on 100 records, how does the second thread know that it should not pick the same records, and should instead look for records not yet picked up by other threads?

  • In this case, is there a single instance of the reader and writer shared between the threads? If so, does that mean any class-level resources are not thread-safe?

Thanks,

Sanjay

1 Answer


But how do the threads decide which data to read?

This is undefined: items are read in a non-deterministic order. That is why a multi-threaded step is not a good idea when the order in which records are read matters.
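To make this more concrete, here is a minimal plain-Java sketch of the idea, not Spring Batch's actual implementation. It assumes that nothing pre-assigns chunks: each worker thread builds its own chunk by calling `read()` on the one shared reader until it has `chunkSize` items or the input is exhausted. `SharedReader` and `run` are hypothetical names made up for this illustration, and the reader here is deliberately synchronized so that no record is handed out twice.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.stream.Collectors;
import java.util.stream.IntStream;

class SharedReaderSketch {

    /** Hypothetical stand-in for an item reader: next item, or null when exhausted. */
    static final class SharedReader {
        private final Iterator<Integer> source;

        SharedReader(List<Integer> items) {
            this.source = items.iterator();
        }

        // synchronized: two threads can never receive the same record
        synchronized Integer read() {
            return source.hasNext() ? source.next() : null;
        }
    }

    /** Each worker repeatedly fills a chunk from the shared reader, then "writes" it. */
    static List<List<Integer>> run(int records, int chunkSize, int threads)
            throws InterruptedException {
        SharedReader reader = new SharedReader(
                IntStream.rangeClosed(1, records).boxed().collect(Collectors.toList()));
        ConcurrentLinkedQueue<List<Integer>> written = new ConcurrentLinkedQueue<>();
        ExecutorService pool = Executors.newFixedThreadPool(threads);

        for (int t = 0; t < threads; t++) {
            pool.execute(() -> {
                while (true) {
                    List<Integer> chunk = new ArrayList<>(chunkSize);
                    Integer item;
                    // pull items one at a time; other threads interleave here
                    while (chunk.size() < chunkSize && (item = reader.read()) != null) {
                        chunk.add(item);
                    }
                    if (chunk.isEmpty()) {
                        return; // input exhausted, nothing left for this thread
                    }
                    written.add(chunk); // stands in for the writer
                }
            });
        }
        pool.shutdown();
        pool.awaitTermination(30, TimeUnit.SECONDS);
        return new ArrayList<>(written);
    }

    public static void main(String[] args) throws InterruptedException {
        List<List<Integer>> chunks = run(1000, 100, 4);
        int total = chunks.stream().mapToInt(List::size).sum();
        System.out.println(chunks.size() + " chunks, " + total + " items in total");
    }
}
```

In this sketch every record ends up in exactly one chunk, but which records land in which chunk varies from run to run, and the last chunks may be smaller than the chunk size, so the number of chunks is not necessarily 10.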

In this case, is there a single instance of the reader and writer shared between the threads? If so, does that mean any class-level resources are not thread-safe?

Yes, they are shared between threads. The Javadoc of each reader/writer states whether it is thread-safe.
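When a reader is not thread-safe, a common approach is to wrap it in a delegate whose `read()` is synchronized. Spring Batch provides `SynchronizedItemStreamReader` for this purpose; the sketch below only illustrates the pattern in plain Java so that it runs without the framework. `SimpleReader`, `ListReader`, `SynchronizedReader`, and `readAll` are hypothetical names for this illustration, not Spring Batch types.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

class SynchronizedReaderSketch {

    /** Hypothetical minimal reader contract: next item, or null when exhausted. */
    interface SimpleReader<T> {
        T read();
    }

    /** Not thread-safe: the unguarded index is class-level mutable state. */
    static final class ListReader<T> implements SimpleReader<T> {
        private final List<T> items;
        private int index = 0;

        ListReader(List<T> items) {
            this.items = items;
        }

        @Override
        public T read() {
            return index < items.size() ? items.get(index++) : null;
        }
    }

    /** Thread-safe wrapper: only one thread at a time may call the delegate. */
    static final class SynchronizedReader<T> implements SimpleReader<T> {
        private final SimpleReader<T> delegate;

        SynchronizedReader(SimpleReader<T> delegate) {
            this.delegate = delegate;
        }

        @Override
        public synchronized T read() {
            return delegate.read();
        }
    }

    /** Drains the reader from several threads and returns every item that was read. */
    static List<Integer> readAll(SimpleReader<Integer> reader, int threads)
            throws InterruptedException {
        List<Integer> out = Collections.synchronizedList(new ArrayList<>());
        Thread[] workers = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            workers[i] = new Thread(() -> {
                Integer item;
                while ((item = reader.read()) != null) {
                    out.add(item);
                }
            });
            workers[i].start();
        }
        for (Thread worker : workers) {
            worker.join();
        }
        return new ArrayList<>(out);
    }

    public static void main(String[] args) throws InterruptedException {
        List<Integer> source = new ArrayList<>();
        for (int i = 0; i < 1000; i++) {
            source.add(i);
        }
        List<Integer> read = readAll(new SynchronizedReader<>(new ListReader<>(source)), 4);
        System.out.println("items read: " + read.size());
    }
}
```

Passing the bare `ListReader` to `readAll` instead of the wrapped one may return duplicated or skipped records, because several threads can then update `index` at the same time.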

For more details, please refer to the Multi-threaded Step section of the reference documentation.

Mahmoud Ben Hassine
  • I am still not sure who manages the assignment of chunks to threads. Is it correct that the step uses the reader to get all the chunks and then assigns them to the threads individually? In that case the reader would be used before any thread starts, and each thread would only perform the processor and writer functions. – Sanjay Apr 26 '20 at 15:54