I have thousands of files and folders with the same structure under a main directory. I am trying to reduce the processing time of an R script by automatically submitting several LSF batch jobs, passing arguments to the script. I have some experience with R but none with batch scripts or the languages associated with them.
Description of the structure of my folders:
In the main directory, I have several folders named by year, and each of these contains folders named after a model. The model folders contain the files I process.
main directory path
|---- 1980          (year)
|      |---- A      (model)
|      |---- B
|---- 1981
       |---- A
       |---- B
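For anyone who wants to reproduce this layout, the following is a minimal bash sketch. The variable `main_dir` is a placeholder introduced here (it defaults to a temporary directory), and only the two years and two models from the diagram are created.

```shell
#!/bin/bash
# Assumption: main_dir stands in for the real main directory path;
# when it is not set, a throwaway temporary directory is used instead.
main_dir="${main_dir:-$(mktemp -d)}"

# One folder per year, and inside each year one folder per model.
for year in 1980 1981; do
    for model in A B; do
        mkdir -p "$main_dir/$year/$model"
    done
done

# List the model folders that were created, relative to main_dir.
(cd "$main_dir" && find . -mindepth 2 -maxdepth 2 -type d | sort)
```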
R script
My R script has two parameters, "year" and "model", which will be passed as arguments. Here is a reproducible example:
library(raster)
# Create the raster files used as example input. Here I only create two rasters, which would sit in one model folder.
r1= raster(nrows = 1, ncols = 1, res = 0.5,
xmn = -1.5, xmx = 1.5, ymn = -1.5, ymx = 1.5,
vals = 0.3)
r2= raster(nrows = 1, ncols = 1, res = 0.5,
xmn = -1.5, xmx = 1.5, ymn = -1.5, ymx = 1.5,
vals = 0.1)
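# Read the command-line arguments: an index into years, then an index into models.
args <- commandArgs(trailingOnly = TRUE)
which_year <- as.integer(args[1])
years <- c("1980", "1981")
year <- years[which_year]
which_model <- as.integer(args[2])
models <- c("A", "B")
model <- models[which_model]
setwd(file.path("directory_path", year, model))
pr <- stack(r1, r2)
# Cell-wise mean of the two layers (named pr_mean to avoid masking base::mean).
pr_mean <- mean(pr)
writeRaster(pr_mean, file.path("directory_path", year, model, "stackr1r2.tif"))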
Batch code
My problem is implementing a loop in the batch script. I would like to automate job submission with a loop over years and models, so that each job works in the right directory. However, I have no idea how to do that.
I know that the structure should be something like this:
do from model = 1, nmodel
do from year =1, nyear
bsub -n 1 -W 3600 "Rscript mycode.R year model"
enddo
enddo
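The pseudocode above could be sketched in bash roughly as follows. This is only a sketch under some assumptions: `mycode.R` takes two 1-based indices (year first, then model), `nyear` and `nmodel` match the lengths of the vectors in the R script, and `SUBMIT` is a hypothetical switch added here so the commands can be printed before anything is really submitted.

```shell
#!/bin/bash
# Sketch: one LSF job per (model, year) pair.
# Assumption: mycode.R expects two 1-based indices, year then model.
nyear=2    # number of entries in the R vector "years"
nmodel=2   # number of entries in the R vector "models"

# SUBMIT is a hypothetical dry-run switch: by default each command is only
# printed with echo; run with SUBMIT="" to really call bsub.
SUBMIT="${SUBMIT-echo}"

submit_all() {
    local model year
    for model in $(seq 1 "$nmodel"); do
        for year in $(seq 1 "$nyear"); do
            # Same bsub options as in the pseudocode above.
            $SUBMIT bsub -n 1 -W 3600 "Rscript mycode.R $year $model"
        done
    done
}

submit_all
```

Run as is, the script only prints the four `bsub` commands (echo drops the quotes around the Rscript command). Running it as `SUBMIT="" bash submit.sh`, where `submit.sh` is whatever name the file is saved under, would submit them for real.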
Thanks for any help and I hope I was clear in my explanations as I do not have the technical vocabulary!