I'm trying to have SGE run job array tasks concurrently based on the job share (-js) parameter of qsub, but it doesn't seem to work as expected. Is there a way to enable concurrent task execution based on shares?

I have a script that sleeps to simulate long-running tasks, and I submit it to a small SGE cluster (26 slots) as several job arrays as follows:

qsub -t 1-201 -js 100 sge_longRunning.sh
qsub -t 1-202 -js 101 sge_longRunning.sh
qsub -t 1-203 -js 102 sge_longRunning.sh
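
The script itself is just a placeholder that sleeps for a while; a minimal sketch (the exact sleep duration is not important):

#!/bin/bash
#$ -S /bin/bash
#$ -cwd
# simulate a long-running task
sleep 300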

I would expect the tasks to be distributed almost equally across the cluster over time, but what I see is that the last submitted array runs to completion first (all 203 tasks), then the second one runs to completion, and finally the first one.

The cluster operates under a functional policy with 1M tickets and a 0.9 weight for functional policy tickets.

Any hints on how to get the tasks of the different job arrays to run concurrently, sharing the available resources almost equally? Any idea what might be wrong with the above configuration/test setup?

a1an
1 Answer

About the only practical way would be to submit the jobs as different users, or under different projects, each with its own share.
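
For example, a sketch of the project route (the project names are made up, and you may also need a nonzero weight_project in the scheduler configuration, qconf -msconf, for project functional tickets to count):

# create two projects and give each a functional share (set fshare, e.g. 100, in the editor that opens)
qconf -aprj projA
qconf -aprj projB

# submit each array under its own project
qsub -t 1-201 -P projA sge_longRunning.sh
qsub -t 1-202 -P projB sge_longRunning.sh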

If that isn't practical, then try submitting everything as one big array job whose tasks choose which work item to do from a queue maintained by your script, in whatever order you like.
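
Something along these lines, as a sketch: workitems.txt is a hypothetical file listing one command per line, pre-ordered so the three workloads are interleaved, and the whole thing is submitted as a single array covering all 606 tasks in your example (qsub -t 1-606 sge_runFromQueue.sh):

#!/bin/bash
#$ -S /bin/bash
#$ -cwd
# each array task looks up the line of workitems.txt matching its task id and runs it
WORK=$(sed -n "${SGE_TASK_ID}p" workitems.txt)
eval "$WORK"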

William Hay
  • So do you confirm that the -js parameter of qsub is not working as expected? I thought it could also be a matter of the tasks being too short / just sleeping, but I could not put together a real workload test for it... – a1an Jul 05 '19 at 12:39
  • It tweaks the share of your tickets assigned to each job, not the share of the cluster assigned to each job. – William Hay Jul 07 '19 at 13:14
  • Since my user is the only one using the cluster and there are no projects, I would expect tweaking the share of my tickets and the share of the cluster to produce the same result, namely parallel rather than sequential job/task processing. – a1an Jul 08 '19 at 15:04