
We are trying to develop a production-ready application using Spring Batch and Spring Cloud Data Flow (SCDF) server. We ran into an issue when users submit multiple files to the SCDF server at the same time; take, for example, 50 files, each with 100 records to process. As SCDF is deployed on Kubernetes, it launches one pod per file. The Kubernetes namespace has cpu.limit=10 and memory.limit=20GB, and each task pod is configured with cpu.limit=500m and memory.limit=1GB.
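For reference, the per-pod resources for launched task pods can be set as defaults on the SCDF server's Kubernetes task platform account, or per launch as deployment properties. A minimal sketch, assuming the "default" platform account and the standard Kubernetes deployer limit properties (property names should be verified against the SCDF / Kubernetes deployer docs for your version):

    # SCDF server config: defaults for task pods launched via the "default" Kubernetes account
    spring.cloud.dataflow.task.platform.kubernetes.accounts.default.limits.cpu=500m
    spring.cloud.dataflow.task.platform.kubernetes.accounts.default.limits.memory=1024Mi

    # Alternatively, per task launch, as deployment properties
    # ("file-ingest-task" is a hypothetical task application name)
    deployer.file-ingest-task.kubernetes.limits.cpu=500m
    deployer.file-ingest-task.kubernetes.limits.memory=1024Mi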

So at most only 10 pods can run at one time, and the SCDF server is not able to process the remaining requests.

What are the ways to avoid this failure? Is there any queuing mechanism in SCDF to queue the requests and process them later, or must we build some frontend component that handles the multiple requests and sends a file request to the SCDF server only when it is able to spawn another pod?
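One direction we are evaluating for such a frontend component is to queue incoming file keys and forward a launch to SCDF only when a pod slot is free. Below is a minimal sketch; FileLaunchGate, the slot count, and launchTaskAndWait are illustrative placeholders (the actual launch would go through the SCDF REST API or the dataflow-rest-client), not an SCDF feature:

    import java.util.concurrent.BlockingQueue;
    import java.util.concurrent.ExecutorService;
    import java.util.concurrent.Executors;
    import java.util.concurrent.LinkedBlockingQueue;
    import java.util.concurrent.Semaphore;

    // Hypothetical frontend gate: the API layer enqueues file keys, and a single
    // dispatcher thread forwards them to SCDF only when a "pod slot" is free.
    public class FileLaunchGate {

        private final BlockingQueue<String> pendingFileKeys = new LinkedBlockingQueue<>();
        // Keep this below what the namespace quota allows (10 task pods in the question).
        private final Semaphore podSlots = new Semaphore(10);
        private final ExecutorService launcher = Executors.newCachedThreadPool();

        // Called by the API endpoint that receives the file's unique key.
        public void submit(String fileKey) {
            pendingFileKeys.add(fileKey);
        }

        // Run in a background thread; blocks instead of failing when the cluster is full.
        public void dispatchLoop() throws InterruptedException {
            while (!Thread.currentThread().isInterrupted()) {
                String fileKey = pendingFileKeys.take();  // wait for a queued file
                podSlots.acquire();                       // wait for a free pod slot
                launcher.submit(() -> {
                    try {
                        launchTaskAndWait(fileKey);       // launch via SCDF and wait for completion
                    } finally {
                        podSlots.release();               // free the slot for the next file
                    }
                });
            }
        }

        // Placeholder: launch the task through SCDF (e.g. POST /tasks/executions with the
        // file key as an argument) and poll the execution status until it finishes.
        protected void launchTaskAndWait(String fileKey) {
            // left as an assumption; wire in DataFlowTemplate or a RestTemplate call here
        }
    }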

I have set the maximum-concurrent-tasks property to limit the number of pods launched by the SCDF server:

    spring.cloud.dataflow.task.platform.kubernetes.accounts.default.maximum-concurrent-tasks=100

But this does not help to process the other file requests.
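Launch requests beyond that limit are not queued by SCDF, so a frontend gate still has to decide when to launch. One way to feed such a gate is to ask the server how many task executions are currently running; a minimal sketch, assuming the GET /tasks/executions/current endpoint from the SCDF REST API (the exact endpoint and response fields should be verified for your SCDF version):

    import org.springframework.web.client.RestTemplate;

    // Hypothetical capacity probe: reports what the SCDF server says about currently
    // running task executions, which a frontend gate can compare against its own cap.
    public class ScdfCapacityProbe {

        private final RestTemplate rest = new RestTemplate();
        private final String dataflowUrl = "http://scdf-server:9393"; // assumed server address

        // Returns the raw JSON from GET /tasks/executions/current; the payload lists the
        // running execution count (and configured maximum) per task platform.
        public String currentExecutions() {
            return rest.getForObject(dataflowUrl + "/tasks/executions/current", String.class);
        }
    }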

  • Hi @mukul-verma, "users submit multiple files to the SCDF server at the same time" - can you tell me more about how they are submitting the files, and through what mechanism? – onobc Mar 24 '23 at 12:59
  • That does not seem to be related to SCDF per se. Even without SCDF, if you submit more jobs than your k8s cluster can handle, some jobs will fail to be submitted. You need to use another pattern, like a work queue: https://kubernetes.io/docs/tasks/job/fine-parallel-processing-work-queue/ – Mahmoud Ben Hassine Mar 24 '23 at 14:19 (a sketch of this pattern appears after these comments)
  • @onobc The file is submitted by the user through an API call; we pass a unique key identifying the file in the task arguments and delegate the task launch activity to the SCDF server. – mukul verma Mar 28 '23 at 05:39
  • @MahmoudBenHassine Thanks, we will look into this pattern. The only question: right now SCDF has the responsibility of delegating one file to one pod through its internal mechanism; would the pattern you suggested only work without SCDF? – mukul verma Mar 28 '23 at 06:26
  • No, SCDF does not have any internal mechanism to distribute work as you described. It is up to you to design your solution according to the deployment pattern that fits your need. – Mahmoud Ben Hassine Mar 28 '23 at 09:18
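For completeness, the work-queue pattern from the Kubernetes docs linked above would look roughly like this without SCDF: a fixed number of worker pods pull file keys from a shared queue, so the namespace quota only has to cover those workers. A minimal sketch, assuming a Redis list as the queue and the Jedis client (queue name, Redis host, and processFile are illustrative choices):

    import redis.clients.jedis.Jedis;
    import java.util.List;

    // Worker run as a plain Kubernetes Deployment/Job (no SCDF): each replica pulls
    // file keys from a shared Redis list and processes them, so concurrency is simply
    // the number of worker pods the namespace quota allows.
    public class FileQueueWorker {

        public static void main(String[] args) {
            try (Jedis redis = new Jedis("redis", 6379)) {   // assumed Redis service name
                while (true) {
                    // BRPOP blocks until an item is available; returns [queueName, value]
                    List<String> item = redis.brpop(0, "file-keys");
                    String fileKey = item.get(1);
                    processFile(fileKey);
                }
            }
        }

        // Placeholder for the actual Spring Batch job / processing logic for one file.
        private static void processFile(String fileKey) {
            System.out.println("Processing file " + fileKey);
        }
    }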

0 Answers