
I have 12 folders that each contain one R file with the same name (e.g., file.R). The folders are named m1, m2, ..., m12. To run each file, I run bsub -n 2 -q long -W 12:00 -R "rusage[mem=25000]" -M 25000 -hl R CMD BATCH file.R in each folder. Is there a way to run these as a job array in the LSF submission system using bsub? Thank you.

Andrew

1 Answer


Is there a way to run it as a job array in the LSF submission system using bsub?

Yes, there is. Use -J "[1-12]" to submit an array job, which runs 12 instances. Each instance can read its index from the environment variable $LSB_JOBINDEX, so you'll need a simple wrapper script that sets the CWD from that index and then starts R. Something like this should work.

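$ cat runjob.sh
#!/bin/sh

# LSF sets LSB_JOBINDEX to this instance's array index (1..12).
cd "m${LSB_JOBINDEX}" || exit 1
# Replace the shell with R so no extra process lingers.
exec R CMD BATCH file.R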

Then submit your job like this:

$ bsub -n 2 -q long -W 12:00 -R "rusage[mem=25000]" -M 25000 -hl -J "[1-12]" sh runjob.sh 
Job <1164> is submitted to queue <long>.
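
Before submitting, it can help to confirm that every array index maps to a folder that actually contains file.R. The following is a minimal sketch, assuming the m1, ..., m12 folders sit in the directory you submit from; the function name check_folders is hypothetical and not part of LSF.

```shell
#!/bin/sh
# Hypothetical pre-flight check: confirm each array index has a folder
# containing file.R before submitting with -J "[1-12]".
check_folders() {
    first=$1
    last=$2
    missing=0
    i=$first
    while [ "$i" -le "$last" ]; do
        # Same relative path runjob.sh would use for LSB_JOBINDEX=$i.
        if [ ! -f "m${i}/file.R" ]; then
            echo "missing: m${i}/file.R"
            missing=$((missing + 1))
        fi
        i=$((i + 1))
    done
    # Succeed only if nothing was missing.
    [ "$missing" -eq 0 ]
}
```

Calling check_folders 1 12 from the submission directory prints one line per missing file and returns a non-zero status if any are missing.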
Michael Closson
  • If I had two files in each folder, say `file.R` and `poo.R`, can I still just run them from the same `runjob.sh` script by simply adding another line with `exec R CMD BATCH poo.R`? Thank you! – Andrew Jul 27 '19 at 11:45
  • Yes, but remove the `exec`. `exec` replaces the current process image with `R CMD BATCH file.R`, so none of the commands after the first `exec` will be executed. Without `exec`, a new process is created to run `R CMD BATCH file.R`, and the script continues once it finishes. In your case, file.R and poo.R will run sequentially. – Michael Closson Jul 27 '19 at 17:50
  • How can I run them in parallel instead of sequentially? – Andrew Jul 28 '19 at 13:48
  • You could either run poo.R in a separate job array or run it in the same job array. If you use the latter, you'll need some way to map the array index to poo.R or file.R. – Michael Closson Jul 29 '19 at 14:42
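
One way to do the index mapping mentioned in the last comment is to double the array size and split the range in two. This is only a sketch under assumptions not stated in the thread: the job is submitted with -J "[1-24]", indices 1-12 run file.R in m1, ..., m12, and indices 13-24 run poo.R in the same folders. The script name runjob2.sh and the function map_index are hypothetical.

```shell
#!/bin/sh
# Hypothetical runjob2.sh: maps a 24-element array index to (folder, script).
# Assumes submission with -J "[1-24]".

# Print "folder script" for a given array index.
map_index() {
    idx=$1
    if [ "$idx" -le 12 ]; then
        echo "m${idx} file.R"
    else
        echo "m$((idx - 12)) poo.R"
    fi
}

# Only start R when LSF has set a real array index.
if [ -n "${LSB_JOBINDEX:-}" ] && [ "${LSB_JOBINDEX}" -gt 0 ]; then
    # Intentionally unquoted: split the output into $1 (folder) and $2 (script).
    set -- $(map_index "$LSB_JOBINDEX")
    cd "$1" || exit 1
    exec R CMD BATCH "$2"
fi
```

With this layout, index 1 runs file.R in m1 and index 13 runs poo.R in m1, so the two scripts for a folder can be scheduled at the same time. Note that `R CMD BATCH` saves the workspace to `.RData` in the working directory by default, so two scripts running in the same folder may overwrite each other's workspace; passing `--no-save` is one option if the saved workspace isn't needed.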