
I read in the documentation that a multi-threaded step is not safe to use because many of the ItemReaders and ItemWriters are not thread safe.

I am using FlatFileItemReader to read and process items.

For files with a huge number of items to process, I am using remote partitioning.

But some of the steps have an input file with just 2-3 items (they are just market ids, e.g. eu, gb, etc.). I will be adding a few more markets. I need to run some commands that take these market ids as input, and I want to run the commands in parallel for all of them.

Is a multi-threaded step safe for this use case even though I am using FlatFileItemReader? Or should I go for remote partitioning (there is not much data to be partitioned)?

Also, if I use a multi-threaded step, will it run properly if I launch multiple instances of the same job with different parameters, e.g. different dates?

vishal

1 Answer


The FlatFileItemReader is not thread safe in that its state is based on the number of rows in the file that have been read. When it is used with multiple threads, that number gets overwritten, so there is no way to know what has and has not been read on a restart. If restartability is not an issue (you're OK with starting from the beginning if the job fails), then you can use multiple threads in a step with the FlatFileItemReader.
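
To picture the shared-state point in plain Java, here is a minimal sketch. It is not Spring Batch code: `CountingReader`, `ParallelMarkets`, and the third market id `us` are hypothetical names used only for illustration. Several worker threads pull from one stateful reader; serializing `read()` keeps the hand-out of items consistent, but a single shared counter still cannot tell which items finished processing, which is why a restart cannot rely on it.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical stand-in for a stateful reader (NOT the Spring Batch API).
class CountingReader {
    private final Iterator<String> lines;
    private int linesRead = 0; // the shared state that concurrent callers all update

    CountingReader(List<String> lines) {
        this.lines = lines.iterator();
    }

    // synchronized: only one thread at a time advances the iterator and the counter.
    synchronized String read() {
        if (!lines.hasNext()) {
            return null; // null signals end of input
        }
        linesRead++;
        return lines.next();
    }

    synchronized int getLinesRead() {
        return linesRead;
    }
}

class ParallelMarkets {
    // Starts `threads` workers; each pulls items until the reader is exhausted.
    static Set<String> processAll(List<String> marketIds, int threads) throws Exception {
        CountingReader reader = new CountingReader(marketIds);
        Set<String> processed = ConcurrentHashMap.newKeySet();
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            List<Future<?>> futures = new ArrayList<>();
            for (int i = 0; i < threads; i++) {
                futures.add(pool.submit(() -> {
                    String id;
                    while ((id = reader.read()) != null) {
                        processed.add(id); // placeholder for the real per-market command
                    }
                }));
            }
            for (Future<?> f : futures) {
                f.get(); // propagate any worker failure to the caller
            }
        } finally {
            pool.shutdown();
        }
        return processed;
    }

    public static void main(String[] args) throws Exception {
        // "us" is a made-up third market id for the example.
        System.out.println(processAll(Arrays.asList("eu", "gb", "us"), 3));
    }
}
```

Each market id is handed to exactly one worker, but if the run fails part-way, `linesRead` only says how many items were handed out, not which ones completed.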

Michael Minella
  • Can we use ListItemReader, since we can easily create a list for a limited number of items? Is ListItemReader thread safe, and will it be restartable? – vishal Jun 13 '14 at 05:13
  • One more thing: I am using MultiThreadedFlatFileItemReader with remote partitioning. I have 6 partition steps running on each server. Which would be the better approach: using this reader, or breaking the file into a number of files equal to the grid size? I can't see it as a part of core Spring Batch, but it is available here https://github.com/sshcherbakov/spring-batch-talk/blob/master/src/main/java/org/springframework/batch/item/file/MultiThreadedFlatFileItemReader.java . – vishal Jun 13 '14 at 05:20
  • ListItemReader is not stateful, so it is not restartable. I'm unfamiliar with MultiThreadedFlatFileItemReader (it's not a Spring Batch ItemReader implementation), so I'm not sure what to say about it. I'd recommend just using the regular FlatFileItemReader with remote partitioning. Since each partition is single threaded, you get restartability. – Michael Minella Jun 17 '14 at 16:56
  • I am using FlatFileItemReader as mentioned in the question above. The file has 3 items, and each item takes 15 min to process. If 2 items are processed successfully and the 3rd fails, then on restart the step doesn't process the 3rd item and just marks the step as completed. Ideally it should process the 3rd item. I am fine with it reprocessing all items if any one fails. Is there any way to make sure it processes all of them if any one fails? – vishal Jul 12 '14 at 10:59
  • How is your job repository configured? Have you configured save-state to be false (which would be bad for your use case)? I'll need to see your configuration to be of more help. – Michael Minella Jul 14 '14 at 14:49
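
The single-threaded restart behaviour discussed in these comments can be pictured with a small plain-Java sketch. It assumes the reader's saved state is simply a count of items already read; `RestartableListReader`, its methods, and the market id `us` are hypothetical and are not the Spring Batch implementation.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical single-threaded reader that restarts from a saved item count.
// Plain Java for illustration only.
class RestartableListReader {
    private final List<String> items;
    private int itemsRead = 0;

    RestartableListReader(List<String> items) {
        this.items = items;
    }

    // On (re)start, skip everything a previous run already recorded as read.
    void open(int savedCount) {
        itemsRead = Math.min(Math.max(savedCount, 0), items.size());
    }

    // Returns the next item, or null when the input is exhausted.
    String read() {
        return itemsRead < items.size() ? items.get(itemsRead++) : null;
    }

    // The value a caller would persist after each successfully committed item.
    int savedState() {
        return itemsRead;
    }

    public static void main(String[] args) {
        List<String> markets = Arrays.asList("eu", "gb", "us"); // "us" is a made-up id
        RestartableListReader first = new RestartableListReader(markets);
        first.open(0);
        first.read();
        first.read();
        int checkpoint = first.savedState(); // two items done before the failure

        RestartableListReader second = new RestartableListReader(markets);
        second.open(checkpoint);
        System.out.println(second.read()); // the third item is next
    }
}
```

With one thread per reader the saved count is unambiguous, so a restart can resume at the first unprocessed item; if no count is saved, the sketch starts again from the first item.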