I want to run a loop in parallel, started from my local machine but executed on the remote machines of an HPC system. Here is my approach:

    library("future.batchtools")
    library("future.apply")
    library("future")

    f <- function(x) {
      paste0("x = ", x, ": PID ", Sys.getpid(), " @ ", Sys.info()[["nodename"]])
    }

    options(parallelly.debug = TRUE)
    plan(batchtools_slurm, workers = 2, nodename = "login.example.com")
    future_lapply(1:4, f)

where login.example.com is the address of the HPC login node that I can SSH into to submit Slurm jobs. The above code fails with the following error:

    Error: Failed to submit BatchtoolsSlurmFuture (future_lapply-1). The reason was: error in running command
    TROUBLESHOOTING INFORMATION:
    batchtools::submitJobs() was called with the following 'resources' argument:
    list()

What does work nicely is a PSOCK cluster on the login node:

    library("future.apply")
    library("future")

    f <- function(x) {
      paste0("x = ", x, ": PID ", Sys.getpid(), " @ ", Sys.info()[["nodename"]])
    }

    cl <- makeClusterPSOCK(workers = rep("login.example.com", 2))
    plan(cluster, workers = cl)
    future_lapply(1:4, f)

which gives

    [[1]]
    [1] "x = 1: PID 68779 @ login.example.com"

    [[2]]
    [1] "x = 2: PID 68779 @ login.example.com"

    [[3]]
    [1] "x = 3: PID 15327 @ login.example.com"

    [[4]]
    [1] "x = 4: PID 15327 @ login.example.com"

How can I make the first approach work, i.e. have the futures submitted through the Slurm scheduler that is reachable on 'login.example.com'?
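
One direction that might be worth trying, offered as an untested sketch rather than a confirmed solution, is a nested plan: the outer layer connects to the login node over SSH, and the inner layer submits to Slurm from there. This assumes that future, future.apply and future.batchtools are installed on the login node, that SSH access works without a password prompt, and that a batchtools Slurm template file is available on the login node (none of which is shown above).

```r
library("future")
library("future.apply")
library("future.batchtools")

f <- function(x) {
  paste0("x = ", x, ": PID ", Sys.getpid(), " @ ", Sys.info()[["nodename"]])
}

# Outer layer: a single PSOCK worker on the login node, reached over SSH.
# Inner layer: futures created on the login node are submitted as Slurm jobs.
# Assumption: a Slurm template for batchtools can be found on the login node.
plan(list(
  tweak(cluster, workers = "login.example.com"),
  batchtools_slurm
))

# The outer future is evaluated on the login node; the future_lapply() call
# inside it then uses the second layer of the plan (batchtools_slurm).
res %<-% {
  future.apply::future_lapply(1:4, f)
}

print(res)
```

Calling future_lapply() directly at the top level would only use the outer layer, i.e. the single worker on the login node, so the call is wrapped in an outer future to reach the Slurm layer.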