
I do the following to send a bunch of models to a compute server.

future waits for the first call to finish before the next one is sent. How do I tell future that it can send multiple jobs to the remote at the same time?

This is clearly possible, since I can send multiple jobs to the same remote from different local R sessions, or by calling plan(login) again between calls. But how do I specify the topology so that future doesn't wait and I don't have to call plan repeatedly?

library(future) 
login <- tweak(remote, workers = "me@localcomputeserver.de")
plan(list(login))
bla %<-% { bla <- rnorm(1000); Sys.sleep(100); saveRDS(bla, file="bla.rds"); bla}
bla2 %<-% { bla2 <- rnorm(1000); Sys.sleep(100); saveRDS(bla2, file="bla2.rds"); bla2 }

1 Answer


Author of future here: If you're happy with separate R processes on your remote machine, you can use:

library("future")
remote_machine <- "me@localcomputeserver.de"
plan(cluster, workers = rep(remote_machine, times = 2L))

to get two remote workers on the same machine. That way you can have two active futures at the same time without blocking.
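
For instance, with that plan active, both futures from the question are dispatched right away to separate workers, and you can check their status without blocking via resolved(). This is a sketch reusing the question's expressions; the .rds files are written on the remote machine:

## Sketch: both assignments return immediately because two workers are available
bla %<-% { x <- rnorm(1000); Sys.sleep(100); saveRDS(x, file = "bla.rds"); x }
bla2 %<-% { x <- rnorm(1000); Sys.sleep(100); saveRDS(x, file = "bla2.rds"); x }

## Non-blocking check whether the remote jobs have finished
resolved(futureOf(bla))
resolved(futureOf(bla2))

## Accessing the values is what blocks, and only until each job completes
str(bla)
str(bla2)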

FYI, plan(remote, ...) is basically just plan(cluster, persistent = TRUE, ...), where persistent = TRUE means that R variables survive on the worker across multiple future calls; you rarely want that, so use cluster instead.

  • Sorry, I accepted before checking. `remote` and `cluster` differ by more than one argument. If I don't tweak `cluster` to have `homogeneous = FALSE`, I get an error: "bash: /Library/Frameworks/R.framework/Resources/bin/Rscript: No such file or directory". So, using remote is a bit easier to work with by default. Thanks for the hint about replicating the workers argument. Would be nice to add this to the docs. – Ruben Apr 16 '18 at 09:59
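
Following up on the comment above, a minimal sketch of that workaround: passing homogeneous = FALSE to the cluster plan, so the remote machine locates Rscript via its own PATH instead of reusing the local Rscript location (the host name is the placeholder from the question):

## Sketch: homogeneous = FALSE lets the remote worker find its own Rscript
library("future")
remote_machine <- "me@localcomputeserver.de"
plan(cluster, workers = rep(remote_machine, times = 2L), homogeneous = FALSE)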