
I followed these instructions, which use the makeCluster function, but it does not seem to work on Windows.

primary <- '192.168.1.235'
machineAddresses <- list(
  list(host=primary,user='johnmount',
       ncore=4),
  list(host='192.168.1.70',user='johnmount',
       ncore=4)
)

spec <- lapply(machineAddresses,
               function(machine) {
                 rep(list(list(host=machine$host,
                               user=machine$user)),
                     machine$ncore)
               })
spec <- unlist(spec,recursive=FALSE)

parallelCluster <- parallel::makeCluster(type='PSOCK',
                                         master=primary,
                                         spec=spec)
print(parallelCluster)
MFR
For detailed commands, see https://stackoverflow.com/questions/44912893/running-parallel-r-on-multiple-hosts/44912894#44912894 – niths4u Jul 27 '17 at 09:57

1 Answer


Do the following:

Collect a list of addresses of machines you can ssh into. This is the hard part: it depends on your operating system, and it is something you should get help with if you have not tried it before. In this case I am using IPv4 addresses, but when using Amazon EC2 I use hostnames.

In my case my list is:

My machine (primary): “192.168.1.235”, user “rajeevkumar”
Another Win-Vector LLC machine: “192.168.1.70”, user “rajeevkumar”

Notice we are not collecting passwords, as we are assuming we have set up proper “authorized_keys” and keypairs in the “.ssh” configurations of all of these machines. We are calling the machine we are using to issue the overall computation “primary.”

It is vital you try all of these addresses with “ssh” in a terminal shell before trying them with R.
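
That check can also be scripted from R. The following is only a sketch: it assumes an OpenSSH-style ssh client is on the PATH, and the helper names sshCheckArgs and canSsh are hypothetical. BatchMode=yes makes ssh fail instead of prompting for a password, so a missing key shows up as FALSE rather than a hung prompt.

```r
# Hypothetical helper: build arguments for a non-interactive ssh test.
sshCheckArgs <- function(host, user) {
  c('-o', 'BatchMode=yes',      # fail rather than prompt for a password
    '-o', 'ConnectTimeout=5',   # give up after 5 seconds
    paste0(user, '@', host),
    'exit')                     # trivial remote command
}

# Hypothetical helper: TRUE if the ssh command exits with status 0.
canSsh <- function(host, user) {
  status <- suppressWarnings(
    system2('ssh', sshCheckArgs(host, user), stdout = FALSE, stderr = FALSE))
  identical(as.integer(status), 0L)
}

# Example call (needs a reachable host, so it is left commented out):
# canSsh('192.168.1.70', 'rajeevkumar')
```

If canSsh returns FALSE for any machine, fix the ssh setup first; makeCluster is unlikely to succeed until it returns TRUE.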

Now, with the system stuff behind us, the R part is as follows. Start your cluster with:

primary <- '192.168.1.235'
# user must be the ssh login name on each machine (here: rajeevkumar)
machineAddresses <- list(
  list(host=primary,user='rajeevkumar',
       ncore=4),
  list(host='192.168.1.70',user='rajeevkumar',
       ncore=4)
)

spec <- lapply(machineAddresses,
               function(machine) {
                 rep(list(list(host=machine$host,
                               user=machine$user)),
                     machine$ncore)
               })
spec <- unlist(spec,recursive=FALSE)

parallelCluster <- parallel::makeCluster(type='PSOCK',
                                         master=primary,
                                         spec=spec)
print(parallelCluster)
## socket cluster with 8 nodes on hosts
##                   ‘192.168.1.235’, ‘192.168.1.70’
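
The spec passed to makeCluster is a flat list with one (host, user) entry per worker process, which is why each machine is repeated ncore times. A small sketch for inspecting it before any connection is attempted (the variable hostCounts is only illustrative):

```r
machineAddresses <- list(
  list(host = '192.168.1.235', user = 'rajeevkumar', ncore = 4),
  list(host = '192.168.1.70',  user = 'rajeevkumar', ncore = 4)
)

# Repeat each machine's (host, user) entry once per core, then flatten
# the list of lists into a single list of worker entries.
spec <- unlist(
  lapply(machineAddresses, function(machine) {
    rep(list(list(host = machine$host, user = machine$user)), machine$ncore)
  }),
  recursive = FALSE)

# One entry per worker; count how many workers land on each host.
print(length(spec))
hostCounts <- table(vapply(spec, function(s) s$host, character(1)))
print(hostCounts)
```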

And that is it. You can now run your job on many cores on many machines.
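
As a minimal sketch of what running a job looks like, the example below uses parLapply and stopCluster from the parallel package. So that it runs without any remote hosts, it starts two local workers as a stand-in; with the multi-machine setup you would pass parallelCluster in place of cl. The squaring job is just a placeholder.

```r
library(parallel)

# Stand-in for parallelCluster: two local workers instead of remote hosts.
cl <- makeCluster(2, type = 'PSOCK')

# Placeholder job: square each input on whichever worker picks it up.
squares <- parLapply(cl, 1:8, function(x) x^2)
print(unlist(squares))

# Release the worker processes when the job is done.
stopCluster(cl)
```

Calling stopCluster matters more with remote workers, since otherwise R processes can be left running on the other machines.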

You can read more at:

http://www.r-bloggers.com/running-r-jobs-quickly-on-many-machines/

Rajeev Barnwal