Resolved by setting the step's throttle limit, as described in "Spring batch multithreading: throttle-limit impact".

I'm using Spring Boot with Spring Batch and have the following TaskExecutor configuration.

@Bean
public ThreadPoolTaskExecutor getJobTaskExecutor() {
    ThreadPoolTaskExecutor taskExecutor = new ThreadPoolTaskExecutor();
    taskExecutor.setCorePoolSize(10);
    taskExecutor.setMaxPoolSize(10);
    taskExecutor.setQueueCapacity(0);
    taskExecutor.setThreadNamePrefix("ProcessJob-");
    taskExecutor.afterPropertiesSet();
    taskExecutor.initialize();
    return taskExecutor;
}

I noticed that when I run the batch with 20K records, some of the threads start processing but stop after 10 requests, while the other threads keep processing properly. Can you please suggest what could be the issue? If I set CorePoolSize=MaxPoolSize=5, the work is distributed evenly across all threads.

CorePoolSize=MaxPoolSize=10 (Threads are not distributed properly)
Thread Name           Count
---------------------------
Thread-ProcessJob-1     10
Thread-ProcessJob-10    4200
Thread-ProcessJob-2     10
Thread-ProcessJob-3     10
Thread-ProcessJob-4     10
Thread-ProcessJob-5     1290
Thread-ProcessJob-6     10
Thread-ProcessJob-7     4980
Thread-ProcessJob-8     4479
Thread-ProcessJob-9     4999

CorePoolSize=MaxPoolSize=5 (Threads are distributed properly)
Thread Name           Count
---------------------------
Thread-ProcessJob-1     1199
Thread-ProcessJob-2     1201
Thread-ProcessJob-3     1214
Thread-ProcessJob-4     1211
Thread-ProcessJob-5     1209
Prashant S
  • I don't think Spring Batch or Spring Boot controls how work is distributed by the JVM across threads from the pool. This should vary between executions. There is a parameter that might play a role here, which is the step's `throttleLimit`, which defaults to 4. Have you tried to increase it when you use CorePoolSize=MaxPoolSize=10? – Mahmoud Ben Hassine Dec 09 '20 at 10:44
  • Thank you Mahmoud. Apparently, when I removed `taskExecutor.setQueueCapacity(0)` so that the TaskExecutor falls back to its default value (an unbounded queue), the issue got resolved, but the performance degraded drastically. Let me try with `throttleLimit` and verify. Many thanks. – Prashant S Dec 10 '20 at 02:22
  • I saw this `taskExecutor.setQueueCapacity(0)` and was wondering why you do that, but I forgot to mention it in my previous comment. There are multiple factors here: the thread pool core/max size, the step's chunkSize and throttleLimit, the taskExecutor queue size, etc. All these can play a role in the observed phenomenon, and there is no recipe to find the best combination: you need to proceed in an empirical way. This is an interesting issue and I would be curious about your findings. – Mahmoud Ben Hassine Dec 10 '20 at 09:11
  • @MahmoudBenHassine After setting the throttle limit to the MaxPoolSize, I am able to use all the threads, and we also see traffic on the underlying WS call. Performance is still not scaling up. After getting the response from the WS, I write it into multiple DBs using an ItemWriter. I'm using the `@Transactional` annotation for this and read somewhere that it should not be used, since Spring Batch should use its own transaction manager, so I am wondering if this is causing any perf issues. – Prashant S Dec 14 '20 at 15:40
  • `@Override @Transactional(value = "transactionManager", propagation = Propagation.REQUIRED) public void write(List<? extends MyBO> myBOList) {` – Prashant S Dec 14 '20 at 15:59
  • The performance and transaction issue is a different problem; you can open a different question for that. `After adding Throttle limit to the MaxPoolsize, able to use all the threads and also underlying WS call we see traffic.`: I assume based on this that you fixed your issue by updating the throttle limit as I suggested. Is that correct? If this is correct, would you accept an answer if I add one with that content? – Mahmoud Ben Hassine Dec 15 '20 at 14:31
  • Yes, two things resolved the problem: #1 removing the explicit queueCapacity setting and #2 adding the throttle limit. – Prashant S Dec 15 '20 at 15:48
  • OK, as I said previously, I forgot to mention the queue capacity in my first comment. I added an answer; if it helped point you in the right direction, please accept it. – Mahmoud Ben Hassine Dec 15 '20 at 18:28

1 Answer

You are setting the task executor's QueueCapacity to 0, which could be the cause of your issue. There is also a parameter that might play a role here which is the step's throttleLimit that defaults to 4. You should try to increase it and find the best value for your use case.

There are multiple factors here: the thread pool core/max size, the step's chunkSize and throttleLimit, the taskExecutor queue size, etc. All these can play a role in the observed phenomenon, and there is no recipe to find the best combination: you need to proceed in an empirical way.
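
To illustrate why the queue capacity matters, here is a minimal sketch using only `java.util.concurrent`, not Spring. `ThreadPoolTaskExecutor` backs a queue capacity of 0 with a `SynchronousQueue`, which holds no pending tasks, so a submission is rejected whenever every pool thread is busy. The class name `QueueCapacityDemo`, the helper `countRejected`, and the pool size of 2 are hypothetical and chosen only for the demonstration. This shows the hand-off behaviour in isolation and does not claim to reproduce the exact thread distribution from the question.

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.RejectedExecutionException;
import java.util.concurrent.SynchronousQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

public class QueueCapacityDemo {

    // Submits taskCount blocking tasks to a fixed-size pool backed by the
    // given queue and returns how many submissions were rejected.
    public static int countRejected(BlockingQueue<Runnable> queue, int poolSize, int taskCount) {
        ThreadPoolExecutor pool =
                new ThreadPoolExecutor(poolSize, poolSize, 0L, TimeUnit.SECONDS, queue);
        // Every task waits on this latch, so pool threads stay busy during submission.
        CountDownLatch release = new CountDownLatch(1);
        int rejected = 0;
        try {
            for (int i = 0; i < taskCount; i++) {
                try {
                    pool.execute(() -> {
                        try {
                            release.await();
                        } catch (InterruptedException e) {
                            Thread.currentThread().interrupt();
                        }
                    });
                } catch (RejectedExecutionException e) {
                    // Default AbortPolicy: no free thread and no room in the queue.
                    rejected++;
                }
            }
        } finally {
            release.countDown(); // let the running tasks finish
            pool.shutdown();
            try {
                pool.awaitTermination(5, TimeUnit.SECONDS);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }
        return rejected;
    }

    public static void main(String[] args) {
        // Zero-capacity hand-off queue: 2 tasks run, the other 3 are rejected.
        System.out.println("SynchronousQueue rejected: "
                + countRejected(new SynchronousQueue<>(), 2, 5));
        // Unbounded queue: 2 tasks run, the other 3 wait in the queue.
        System.out.println("LinkedBlockingQueue rejected: "
                + countRejected(new LinkedBlockingQueue<>(), 2, 5));
    }
}
```

With an unbounded queue nothing is rejected, but extra work simply waits in the queue, which is why the queue size, the pool size, and the throttle limit have to be tuned together.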

Mahmoud Ben Hassine