
I am using Slurm scripts to run MATLAB job arrays on a cluster. Each script uses the job array to loop over one MATLAB parameter.

1) Is it possible to create a shell script to loop over another variable?
2) Can I pass variables to a slurm script?

For example, my slurm files currently look like

#!/bin/bash
#SBATCH --array=1-128
...
matlab -nodesktop -r "frame=[${SLURM_ARRAY_TASK_ID}]; filename=['Person24']; myfunction(frame, filename);";

I frequently need to run this array to process a number of different files. This means I will submit the job (sbatch exampleScript.slurm), edit the file, update 'Person24' to 'Person25', and then resubmit the job. This is pretty inefficient when I have a large number of files to process.

Could I make a shell script that would pass a variable to the slurm script? For example, something like this:

Shell Script (myshell.sh)

#!/bin/bash
for ((FNUM=24; FNUM<=30; FNUM+=1));
do
     sbatch myscript.slurm  >> SOMEHOW PASS ${FNUM} HERE (?)
done 

Slurm script (myscript.slurm)

#!/bin/bash
#SBATCH --array=1-128
...
matlab -nodesktop -nodisplay -r "frame=[${SLURM_ARRAY_TASK_ID}]; filename=[${FNUM}]; myfunction(frame, filename);";

where I could efficiently submit all of the jobs using something like sbatch myshell.sh

Thank you!

Katie Ozgun

1 Answer


In order to avoid possible name collisions with shell and environment variables, it is a good habit to always use lowercase or mixed-case names for the variables you define in your Bash scripts (variables set for you, such as SLURM_ARRAY_TASK_ID, keep their uppercase names).

You were almost there. You just need to pass the variable as an argument to the second script and then pick it up there via the positional parameters. In this case you're only passing one argument, so $1 is fine. With a fixed number of additional parameters you could also use $2, $3, and so on; with a variable number of arguments "$@" would be more appropriate (see the sketch after the two scripts below).

Shell Script (myshell.sh)

#!/bin/bash
for ((fnum=24; fnum<=30; fnum+=1))
do
     sbatch myscript.slurm "$fnum"
done 
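
Note that myshell.sh is just a driver that calls sbatch in a loop, so you run it directly on the login node rather than submitting it with sbatch:

bash myshell.sh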

Slurm script (myscript.slurm)

#!/bin/bash
#SBATCH --array=1-128

fnum=$1

...
matlab -nodesktop -nodisplay -r "frame=[${SLURM_ARRAY_TASK_ID}]; filename=[${fnum}]; myfunction(frame, filename);";
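
If you later need to pass more than one value, the same pattern extends to further positional parameters; the prefix argument here is just a made-up illustration:

#!/bin/bash
# driver: submit one array job per person, passing two arguments
for ((fnum=24; fnum<=30; fnum+=1))
do
     sbatch myscript.slurm "$fnum" "Person"
done

and in myscript.slurm:

#!/bin/bash
#SBATCH --array=1-128

fnum=$1      # first positional argument: the file number
prefix=$2    # second positional argument: hypothetical filename prefix

matlab -nodesktop -nodisplay -r "frame=[${SLURM_ARRAY_TASK_ID}]; filename=['${prefix}${fnum}']; myfunction(frame, filename);"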

For handling various timeout conditions this might work:

A=$(sbatch --parsable a.slurm)

case $? in
    9|64|130|131|137|140)
        echo "some sort of timeout occurred"
        B=$(sbatch --parsable --dependency=afternotok:$A a.slurm)
        ;;
    *)
        echo "some other exit condition occurred"
        ;;
esac

You will just need to decide what conditions you want to handle and how you want to handle them. I have listed all the ones that seem to involve timeouts.
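
If what you actually need is the final state of job A itself (rather than the exit status of the sbatch command that submitted it), one possibility, assuming the sacct accounting command is enabled on your cluster, is to query the job's state after it has finished, roughly like this:

A=$(sbatch --parsable a.slurm)

# ... later, once job A has completed ...
state=$(sacct -j "$A" -X -n -o State | tr -d ' ')

if [[ $state == TIMEOUT* ]]; then
    echo "job $A timed out"
    B=$(sbatch --parsable a.slurm)
fi

With --dependency=afternotok, job B is scheduled up front; an sacct check runs after the fact, so pick whichever fits your workflow.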

Dennis Williamson
  • Thank you! Another related question -- I know I can add a dependency to the pipeline as well using the coding format 'A=$(sbatch --parsable a.slurm) B=$(sbatch --parsable --dependency=afternotok:$A a.slurm) '. This will run B if A fails. I occasionally have a time-out problem when I run these scripts. Do you know if there is a way to check if the exit code of A is specifically failure due to timeout (not other failure), in which case then call B? – Katie Ozgun Jun 19 '19 at 16:16
  • @KatieOzgun: I don't really know anything about slurm, but I added something to my answer which might be helpful. You should probably consider asking a new separate question. – Dennis Williamson Jun 19 '19 at 17:19