Trying this code
library(future)
library(foreach)
ncores <- 3
cl <- parallel::makeCluster(3)
avail <- bigstatsr::FBM(ncores, 1, type = "integer", init = 1)
doFuture::registerDoFuture()
res <- vector("list", 5)
for (i in seq_along(res)) {
while (sum(avail[]) == 0) {
cat("Waiting..\n")
Sys.sleep(0.5)
}
ind.avail <- which(avail[] == 1)
cat("Available:", length(ind.avail), "\n")
plan(cluster, workers = cl[ind.avail])
foo <- foreach(i = 3:1) %dopar% {
Sys.sleep(i)
}
print(one <- ind.avail[1])
avail[one] <- 0; print(avail[])
res[[i]] <- cluster(workers = cl[one], {
Sys.sleep(5)
avail[one] <- 1
i
})
}
sapply(res, resolved)
parallel::stopCluster(cl)
Error I get: Initialization of plan() failed, because the test future used for validation failed. The reason was: Unexpected result (of class ‘NULL’ != ‘FutureResult’) retrieved for ClusterFuture future (label = ‘<none>’, expression = ‘NA’)
.
Explanation of my example trying to reproduce my real problem:
- I loop many times (here 5) over two steps
- the first step is easily parallelized with foreach
- the second step is not easily parallelized and depends on the first step
So my idea was to parallelize the first step over all available clusters and to run the second step asynchronously using one cluster only. This cluster would not be available anymore until this asynchronous job is finished. Then the next first step would have one less cluster available and so on. When there is no available cluster anymore for the first step, it would wait for some asynchronous job to finish and to release some cluster.