I use Spring Batch to load data from a file into a database. The job contains only one step, and I use a ThreadPoolTaskExecutor to execute the step concurrently. The step looks similar to this:

public Step myStep() {
    return stepBuilderFactory.get("MyStep")
        .chunk(10000)                        // commit interval: 10000 items per chunk
        .reader(flatFileItemReader)
        .writer(jdbcBatchItemWriter)
        .faultTolerant()
        .skip(NumberFormatException.class)
        .skip(FlatFileParseException.class)
        .skipLimit(3)                        // tolerate at most 3 skipped items
        .throttleLimit(10)                   // at most 10 concurrent chunk executions
        .taskExecutor(taskExecutor)
        .build();
}

There are 3 NumberFormatException errors in my file, so I set skipLimit to 3. However, when I execute the job, it starts 10 threads and each thread is allowed 3 skips, so I end up with 3 * 10 = 30 skips in total, while I only need 3.

So my questions are: will this cause any problems? And is there another way to skip exactly 3 times while executing a step concurrently?

Ming
  • Is your question similar to https://stackoverflow.com/questions/51962185/spring-batch-with-throttle-limit-and-skip-limit ? – Mahmoud Ben Hassine Jun 22 '21 at 10:31
  • Is your item reader/writer thread-safe? Can you share their bean definitions? The `FlatFileItemReader` is not thread-safe and is typically wrapped in a `SynchronizedItemStreamItemReader` when used in a multi-threaded step. Have you tried that? – Mahmoud Ben Hassine Jun 24 '21 at 10:18
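
The wrapping suggested in the last comment could look roughly like the sketch below. The class and method names (`ReaderWrapping`, `synchronize`) are hypothetical and only for illustration; the item type is left generic because the question does not show it. It assumes Spring Batch 4.x on the classpath.

```java
import org.springframework.batch.item.file.FlatFileItemReader;
import org.springframework.batch.item.support.SynchronizedItemStreamItemReader;

public class ReaderWrapping {

    // Hypothetical helper: wraps the non-thread-safe FlatFileItemReader so that
    // read() calls coming from concurrent chunk threads are serialized.
    public static <T> SynchronizedItemStreamItemReader<T> synchronize(FlatFileItemReader<T> delegate) {
        SynchronizedItemStreamItemReader<T> reader = new SynchronizedItemStreamItemReader<>();
        reader.setDelegate(delegate);
        return reader;
    }
}
```

The wrapped reader, rather than the raw `FlatFileItemReader`, would then be passed to `.reader(...)` in the step definition.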

1 Answer

From the GitHub issue:

Robert Kasanicky opened BATCH-926 and commented

When chunks execute concurrently, each chunk works with its own local copy of skipCount (from StepContribution). Given skipLimit=1 and 10 chunks executing concurrently, we can end up with a successful job execution and 10 skips. As the number of concurrent chunks increases, the skipLimit becomes more of a limit per chunk rather than per job.

Dave Syer commented

I vote for documenting the current behaviour.

Robert Kasanicky commented

documented current behavior in the factory bean
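
To make the behaviour described in the issue concrete, here is a small plain-Java simulation. It is not Spring Batch code: `SkipLimitDemo` and its methods are made up for illustration, and a "skip" is modelled simply as incrementing a counter while it is below the limit (the real framework would fail the step once the limit is exceeded). It only contrasts a per-chunk counter with a counter shared by all chunks.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.atomic.AtomicInteger;

public class SkipLimitDemo {

    /** Runs the given number of chunks concurrently and waits for all of them. */
    private static void runConcurrently(int chunks, Runnable chunk) {
        ExecutorService pool = Executors.newFixedThreadPool(chunks);
        try {
            List<Future<?>> futures = new ArrayList<>();
            for (int c = 0; c < chunks; c++) {
                futures.add(pool.submit(chunk));
            }
            for (Future<?> future : futures) {
                future.get(); // propagate any failure from a chunk
            }
        } catch (Exception e) {
            throw new IllegalStateException("chunk execution failed", e);
        } finally {
            pool.shutdown();
        }
    }

    /** Every chunk counts its own skips, so the limit is enforced per chunk. */
    public static int skipsWithLocalCounters(int chunks, int badItemsPerChunk, int skipLimit) {
        AtomicInteger totalSkips = new AtomicInteger();
        runConcurrently(chunks, () -> {
            int localSkipCount = 0; // private to this chunk execution
            for (int i = 0; i < badItemsPerChunk; i++) {
                if (localSkipCount < skipLimit) {
                    localSkipCount++;
                    totalSkips.incrementAndGet();
                }
            }
        });
        return totalSkips.get();
    }

    /** All chunks share one counter, so the limit is enforced for the whole step. */
    public static int skipsWithSharedCounter(int chunks, int badItemsPerChunk, int skipLimit) {
        AtomicInteger sharedSkipCount = new AtomicInteger();
        runConcurrently(chunks, () -> {
            for (int i = 0; i < badItemsPerChunk; i++) {
                // Atomically count a skip only while the shared limit is not reached.
                sharedSkipCount.updateAndGet(n -> n < skipLimit ? n + 1 : n);
            }
        });
        return sharedSkipCount.get();
    }

    public static void main(String[] args) {
        // Scenario from the issue: skipLimit=1, 10 concurrent chunks, one bad item each.
        System.out.println("local counters: " + skipsWithLocalCounters(10, 1, 1)); // prints 10
        System.out.println("shared counter: " + skipsWithSharedCounter(10, 1, 1)); // prints 1
    }
}
```

With per-chunk counters the total grows with the number of concurrent chunks, which is the effect the issue describes; with a shared counter the total never exceeds the configured limit.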

However, this seems to apply only to a very old version of Spring Batch. I have a different problem in my code, but in my case the skip count does seem to be aggregated across threads when I use a multi-threaded step. That said, the job sometimes hangs instead of failing properly when a SkipLimitExceededException is thrown.

Vadim Kirilchuk