I've got access to a big, powerful cluster. I'm a halfway decent R programmer, but totally new to shell commands (and terminal commands in general besides basic things that one needs to do to use ubuntu).
I want to use this cluster to run a bunch of parallel processes in R, and then I want to combine them. Specifically, I have a problem analogous to:
my.function <-function(data,otherdata,N){
mod = lm(y~x, data=data)
a = predict(mod,newdata = otherdata,se.fit=TRUE)
b = rnorm(N,a$fit,a$se.fit)
b
}
r1 = my.function
r2 = my.function
r3 = my.function
r4 = my.function
...
r1000 = my.function
results = list(r1,r2,r3,r4, ... r1000)
The above is just a dumb example, but basically I want to do something 1000 times in parallel, and then do something with all of the results from the 1000 processes.
How do I submit 1000 jobs simultaneously to the cluster, and then combine all the results, like in the last line of the code?
Any recommendations for well-written manuals/references for me to go RTFM with would be welcome as well. Unfortunately, the documents that I've found aren't particularly intelligible.
Thanks in advance!