
Our use case: using remote partitioning, the job is divided into multiple partitions, and workers consume these partitions via ActiveMQ.
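
For context, the manager side is wired roughly like this (a simplified sketch, not our actual code; step and bean names such as `managerStep`, `workerStep` and `memberRangePartitioner` are placeholders):

```java
import org.springframework.batch.core.Step;
import org.springframework.batch.core.configuration.annotation.StepBuilderFactory;
import org.springframework.batch.core.partition.PartitionHandler;
import org.springframework.batch.core.partition.support.Partitioner;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;

@Configuration
public class ManagerStepConfig {

    // Manager step: splits the job via a Partitioner and delegates the
    // resulting partitions to remote workers through a PartitionHandler.
    @Bean
    public Step managerStep(StepBuilderFactory stepBuilderFactory,
                            Partitioner memberRangePartitioner,
                            PartitionHandler partitionHandler) {
        return stepBuilderFactory.get("managerStep")
                .partitioner("workerStep", memberRangePartitioner) // "workerStep" is a placeholder name
                .partitionHandler(partitionHandler)
                .build();
    }
}
```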

The job fails with a memory issue in the MessageChannelPartitionHandler handle method, which holds a large number of StepExecution objects in memory (we have around 20K StepExecutions/partitions in this case).

We overrode MessageChannelPartitionHandler so that a controlled number of messages is submitted to ActiveMQ. Even when we poll for replies from the database, we run into database connection timeouts, and after increasing the idle connection limit this approach also fails because it still has to hold all those StepExecutions in memory.
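
Our handler setup is roughly along these lines (a simplified sketch; channel and bean names are placeholders, and exact property names may vary by Spring Batch version), using the handler's support for polling the job repository instead of aggregating reply messages:

```java
import javax.sql.DataSource;

import org.springframework.batch.integration.partition.MessageChannelPartitionHandler;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;
import org.springframework.integration.core.MessagingTemplate;
import org.springframework.messaging.MessageChannel;

@Configuration
public class PartitionHandlerConfig {

    @Bean
    public MessageChannelPartitionHandler partitionHandler(MessageChannel requests,
                                                           DataSource dataSource) {
        MessageChannelPartitionHandler handler = new MessageChannelPartitionHandler();
        handler.setStepName("workerStep");     // placeholder worker step name
        handler.setGridSize(20_000);           // one partition per 500-member chunk in our case

        MessagingTemplate template = new MessagingTemplate();
        template.setDefaultChannel(requests);  // outbound channel bridged to the ActiveMQ request queue
        template.setReceiveTimeout(100_000L);
        handler.setMessagingOperations(template);

        // Poll the job repository for completed worker step executions instead of
        // aggregating reply messages over JMS; this is the "poll replies from DB" mode.
        handler.setDataSource(dataSource);
        handler.setPollInterval(10_000L);      // check worker status every 10 seconds

        return handler;
    }
}
```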

In either case, our custom handler or the stock MessageChannelPartitionHandler, we face similar issues, and these step executions need to be aggregated at the master. Is there an alternative way of achieving this?

Can someone help us understand a better way of handling these long-running, high-volume data processing scenarios?

  • 20k partitions seems a lot to me, your partitioning criteria is probably not the best one. Can you describe your requirement in terms of input/output without referring to Spring Batch? – Mahmoud Ben Hassine Jun 24 '21 at 09:59
  • Hi @MahmoudBenHassine, thanks for the reply. Our requirement is to generate a letter for 10 million members. The 10M members are divided into 20K partitions of 500 members each, with 4 workers processing 50 partitions at a time. As part of processing, we need to supply these 500 member details to a legacy application, which populates some staging tables based on business logic; the writer then reads the information from the staging tables and prepares a CSV with all member details. The legacy code needs a DB connection object to be passed in, hence we designed it this way with remote partitioning (see the partitioner sketch after these comments). – Krishna Kishore Jun 27 '21 at 05:25
  • You are still referring to Spring Batch and to your attempted solution, see https://xyproblem.info. `then writer will read information`: this does not make sense to me. Apart from your goal of generating a letter (from a template?) to 10M members, I do not understand the flow of data. What is an item in your case? If an item is a member and you need to create a letter for each member, then probably the chunk processing model is not suited to your case, as it will write items in chunks (unless you want to set the chunkSize to 1 which will hurt the performance of your job). – Mahmoud Ben Hassine Jun 28 '21 at 10:28
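
For reference, the partitioning criteria described in the comments above boils down to something like the following sketch (the ID-range logic and execution-context key names are illustrative only). Every entry in the returned map becomes one StepExecution that the manager has to track, which is where the ~20K executions come from:

```java
import java.util.HashMap;
import java.util.Map;

import org.springframework.batch.core.partition.support.Partitioner;
import org.springframework.batch.item.ExecutionContext;

// Splits the 10M member IDs into fixed-size ranges of 500 (~20K partitions).
public class MemberRangePartitioner implements Partitioner {

    private final long minMemberId;
    private final long maxMemberId;
    private final int partitionSize;  // 500 in our case

    public MemberRangePartitioner(long minMemberId, long maxMemberId, int partitionSize) {
        this.minMemberId = minMemberId;
        this.maxMemberId = maxMemberId;
        this.partitionSize = partitionSize;
    }

    @Override
    public Map<String, ExecutionContext> partition(int gridSize) {
        Map<String, ExecutionContext> partitions = new HashMap<>();
        int index = 0;
        for (long start = minMemberId; start <= maxMemberId; start += partitionSize) {
            long end = Math.min(start + partitionSize - 1, maxMemberId);
            ExecutionContext context = new ExecutionContext();
            context.putLong("minId", start);  // worker reads its range from the step execution context
            context.putLong("maxId", end);
            partitions.put("partition" + index++, context);
        }
        return partitions;  // each entry becomes one worker StepExecution
    }
}
```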

0 Answers