I have four 32-core Linux servers (CentOS 7) that I would like to use for a parallelized computation in R.
So far I have only been using the doMC package with registerDoMC(cores=32) to use the multicore capabilities of a single server. I would like to expand this to all four servers (i.e. 4 x 32 = 128 cores, if possible).
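For reference, a minimal sketch of the single-server setup described above. The loop range and the loop body are placeholders for illustration only, not the real computation:

```r
library(foreach)
library(doMC)

# Register a fork-based backend using the 32 cores of one server
registerDoMC(cores = 32)

# Placeholder workload: the real per-iteration computation goes here
res <- foreach(i = 1:128, .combine = c) %dopar% {
  sqrt(i)
}
```

Because doMC forks worker processes on the local machine, this backend cannot reach the other three servers by itself.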
I have done some searching online, and there seem to be a number of choices: PSOCK, MPI, SNOW, SparkR, etc. Nonetheless, I could not get it to work with any of the suggestions I found.
I am aware there are some prerequisites. Here is what I have done so far:
1) All servers are "connected", i.e. each can SSH to the others with passwordless login.
2) An NFS share is mounted so that all servers can access it (read, write and execute access).
3) All servers run the same R binaries (an Anaconda build in a shared location that all servers can execute).
4) Installed openmpi, Rmpi, snow, doSNOW, Spark and SparkR (although I don't know how to use it).
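To make the goal concrete, here is a rough sketch of the kind of multi-host setup these prerequisites are meant to enable, using snow and doSNOW from the list above. It is an assumption about how the pieces would fit together, not something confirmed to work on these servers. The host list is a placeholder: "localhost" stands in for the real server names, and the loop body is illustrative only.

```r
library(foreach)
library(doSNOW)   # also loads snow

# Placeholder host list: replace "localhost" with the four server names.
# For one worker per core, each name could be repeated 32 times,
# e.g. rep(c(<four server names>), each = 32).
hosts <- c("localhost", "localhost")

# Start one socket worker per entry in 'hosts'
# (snow launches workers on remote hosts over SSH by default)
cl <- makeCluster(hosts, type = "SOCK")
registerDoSNOW(cl)

# Placeholder workload distributed across the workers
res <- foreach(i = 1:8, .combine = c) %dopar% {
  i^2
}

stopCluster(cl)
```

With real hostnames, this relies on the passwordless SSH and the shared R installation described in points 1) and 3).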
Can anyone give some advice on what I should do next?
Thanks a lot