I have an sbatch script that submits a job array to Slurm and runs several steps:
#!/bin/bash
#SBATCH --ntasks 1
#SBATCH --nodes 1
#SBATCH --time 00-01:00:00
#SBATCH --array=0-15
dir="TEST_$SLURM_ARRAY_JOB_ID"
org=base-case
dst=$dir/case-$SLURM_ARRAY_TASK_ID
#step 0 -> I'd like this step to be executed by only one task!
srun mkdir $dir
#step 1
srun cp -r $org $dst
#step 2
srun python createParamsFile.py $dst $SLURM_ARRAY_TASK_ID
#step 3
srun python simulation.py $dst
I'd like to run step 0 only once, since all the tasks in the array share the directory it creates. It is not a big deal, because once the directory exists the remaining attempts simply fail with an error. Still, I would rather avoid error messages in the logs and aborted Slurm job steps. For example, in this case:
/usr/bin/mkdir: cannot create directory 'TEST_111224': File exists
srun: error: s02r3b83: task 0: Exited with exit code 1
srun: Terminating job step 111226.0
It is true that if I run the mkdir command without srun, no job step is created for step 0, so nothing is terminated abruptly. But I still get the "File exists" error.
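For reference, here is a minimal sketch of one way step 0 could be made safe for every task to run. It relies on the standard `-p` option of mkdir, which returns success and prints nothing when the directory already exists. The `:-0` fallback values and the `job_id` variable are assumptions added only so that the snippet can be tried outside Slurm; they are not part of the original script.

```shell
#!/bin/bash
# Sketch only: fall back to 0 when not running under Slurm (assumption for testing).
job_id="${SLURM_ARRAY_JOB_ID:-0}"
dir="TEST_${job_id}"

# step 0: -p makes mkdir succeed silently if the directory already exists,
# so each array task can run this line without a "File exists" error.
mkdir -p "$dir"

# Running it again (as another array task would) also exits with status 0.
mkdir -p "$dir"
echo "mkdir exit status: $?"
```

With this approach no single task has to be singled out, so there is no need to make the other tasks wait for the directory to appear before step 1 copies into it.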