I want to execute a SLURM script from within multiple directories simultaneously. More specifically, I have ten array folders numbered array_1 through array_10 from which I want to execute the script. Within each of these directories, the script creates 10 subdirectories, labelled ${SLURM_ARRAY_TASK_ID}_ztagA. However, I have to manually execute the SLURM script from within each of the ten array_ directories individually. This becomes cumbersome when I have to repeat it over and over.

Normally, with a shell script, this would be a simple for loop, but because #SBATCH isn't interpreted by bash, I haven't had any success. The current script (which is run within each array folder individually) is:

#!/bin/bash
#SBATCH -o <some_thing>.o%j 
#SBATCH --time=<time> #specify the time
#SBATCH --array=1-10 #ten arrays

#SBATCH -c 1
#SBATCH -C dodeca96gb
#SBATCH --mem=<memory>

echo "SLURM_JOBID: " $SLURM_JOBID
echo "SLURM_ARRAY_TASK_ID: " $SLURM_ARRAY_TASK_ID
echo "SLURM_ARRAY_JOB_ID: " $SLURM_ARRAY_JOB_ID

mkdir ${SLURM_ARRAY_TASK_ID}_ztagA #creates 10 subdirs w/i ea. array
cd ${SLURM_ARRAY_TASK_ID}_ztagA

$ROSETTA3BIN/bin/rna_denovo.default.linuxgccrelease -s ./<dir>/*pdb -nstruct 100 -fasta ./<fastafile>.fasta -secstruct_file ./<dot-brackets>.secstruct

I then type sbatch <filename>.slurm and the script creates the subdirectories from within whatever directory the script is executed from, hence the need for the cd line, so getting this to execute from within all ten arrays simultaneously has been tricky. I have tried the following in various combinations:

#!/bin/bash
#SBATCH --array=1-10
#SBATCH --chdir=./array_%a
#SBATCH -o ./array_%a/<some_thing>.o%j #STDOUT
#SBATCH --time=<time>

#SBATCH -c 1
#SBATCH -C dodeca96gb
#SBATCH --mem=<memory>

echo "SLURM_JOBID: " $SLURM_JOBID
echo "SLURM_ARRAY_TASK_ID: " $SLURM_ARRAY_TASK_ID
echo "SLURM_ARRAY_JOB_ID: " $SLURM_ARRAY_JOB_ID

for i in {1..10}
do
    mkdir -p ./array_${i}/${SLURM_ARRAY_TASK_ID}_ztagA
    cd ./array_${i}/${SLURM_ARRAY_TASK_ID}_ztagA

$ROSETTA3BIN/bin/rna_denovo.default.linuxgccrelease -s ./<dir>/*pdb -nstruct 100 -fasta ./<fastafile>.fasta -secstruct_file ./<dot-brackets>.secstruct

wait
done

I've tried putting the for loop arguments before/after various lines, including the wait and done, but I get an error saying it can't open the fasta, secstruct, and/or ./dir. I've also tried creating the 10 array folders first (which is easy) and then doing:

#!/bin/bash
for i in {1..10}
do
    sbatch ./array_${i}/<filename>.slurm
wait
done

But this doesn't put the output files or the subdirectories into the array folders; it leaves them all in the parent folder.

Any suggestions?

  • Your first script should be the correct way to do it. What exact problem do you face when submitting it? If you use the array functionality, you do not need any `for` loop anywhere. Your submission script will be executed 10 times independently – damienfrancois Jun 03 '20 at 11:19
  • The problem I face is that, if I try to submit the SLURM script from the parent folder containing the array_1 - array_10 folders, the .out files don't end up sorted in the array_# folders, but rather all stuffed in the parent folder from which I execute it. – willford97 Jun 04 '20 at 14:09
  • Is the `.out` file written by `rna_denovo` or do you mean the output file of Slurm? – damienfrancois Jun 04 '20 at 14:57
  • I'm actually referring to both; neither output is sorted correctly. However, I think I actually figured it out. I'll post the solution. – willford97 Jun 05 '20 at 19:36

1 Answer

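After several days of on-and-off trying, I figured it out. I wrote a shell script that submits the SLURM script from within each directory, rather than trying to edit the SLURM script itself. Because sbatch is run from inside array_N, both the Slurm output files and the subdirectories created by the job end up in that folder.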

#!/bin/bash

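# Submit the same SLURM array script once from inside each array_N directory.
for i in {1..10}
do
    # Stop if a directory is missing rather than submitting from the wrong place.
    cd "array_${i}" || exit 1
    sbatch ./<name_of_slurm_script>.slurm
    cd ../
done
# No wait is needed: sbatch returns as soon as the job is queued.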
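
The same loop can also be written with a subshell, so there is no cd ../ to forget. This is only a sketch under a few assumptions: the function name submit_all, the SUBMIT_CMD override (there purely so the loop can be dry-run on a machine without Slurm), and the file name job.slurm are all hypothetical and not part of my actual setup.

```shell
#!/bin/bash
# Hypothetical helper: submit one SLURM script from inside each array_N directory.
# SUBMIT_CMD is an assumed override for dry runs; it defaults to the real sbatch.
submit_all() {
    local script="$1"                      # script name, expected inside each array_N
    local first="${2:-1}" last="${3:-10}"  # range of array_N directories to visit
    local i
    for ((i = first; i <= last; i++)); do
        # The subshell confines the cd, so the loop always continues from the
        # parent directory; a failed cd skips that directory instead of
        # submitting from the wrong place.
        ( cd "array_${i}" && ${SUBMIT_CMD:-sbatch} "./${script}" ) \
            || echo "skipped array_${i}" >&2
    done
}
```

After sourcing this from the parent folder, submit_all job.slurm would submit from array_1 through array_10, while SUBMIT_CMD=echo submit_all job.slurm 1 3 only prints the script path for each directory it can enter, without submitting anything.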