
I'm using reticulate in RMarkdown and am trying to run a locally defined Python function with parallel processing. I've looked around and this answer is the closest I've found to solving my issue, except that the function I'm using is defined within the RMarkdown document itself rather than in a separate Python script. Below is a simplified example using llply, which gives me the error `Error in unserialize(socklist[[n]]) : error reading from connection`.

I have also tried foreach(), which doesn't recognize the py$ object even with reticulate::py$function.

I have also tried mclapply and pbmcapply, which appear to run and engage all cores, but they keep hanging and won't finish.

```{r}
library(reticulate)
library(doParallel)
library(foreach)
library(plyr)
```

```{python}
import math

def myFn1(x):
    # sqrt lives in the math module; a bare sqrt() would raise NameError
    return math.sqrt(x)
```

```{r}
cl <- makeCluster(detectCores())
registerDoParallel(cl)
llply(list(2, 3, 4), .fun=reticulate::py$myFn1, .parallel=TRUE)
stopCluster(cl)
```

I am not very knowledgeable about reticulate or parallel processing and I would be really grateful for any help.

vzste

3 Answers


I am quite sure that reticulate cannot be run in parallel, at least not when it is set up in one R process and then reused in another (as you do here). The reason is that reticulate creates objects that cannot be exported to other processes. I have an example of this in the section 'Package: reticulate' of https://cran.r-project.org/web/packages/future/vignettes/future-4-non-exportable-objects.html.

A possible workaround is to set up a separate reticulate instance for each parallel worker.
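A minimal sketch of that workaround, assuming a PSOCK cluster from the parallel package and assuming the Python function can be kept as a string (the names `py_src` and `res` and the worker count are illustrative only). Each worker starts its own Python session and defines the function there, so no Python object ever has to cross a process boundary:

```{r}
library(parallel)

# Python source held as a plain R string so it can be shipped to the workers.
py_src <- "
import math

def myFn1(x):
    return math.sqrt(x)
"

cl <- makeCluster(2)  # worker count chosen arbitrarily for illustration

# Send the source string to every worker, then initialise Python there.
clusterExport(cl, "py_src", envir = environment())
invisible(clusterEvalQ(cl, {
  library(reticulate)
  py_run_string(py_src)
  NULL  # return nothing, so no Python object is serialised back
}))

# Each worker resolves py$myFn1 in its own Python session.
res <- parLapply(cl, list(2, 3, 4), function(x) reticulate::py$myFn1(x))
stopCluster(cl)
```

Only plain R values (the inputs and the numeric results) travel between processes here, which is what avoids the non-exportable-object problem described above.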

HenrikB
    Interesting, that makes a lot of sense. Could you offer any guidance on setting up the possible workaround you suggest? Know of any examples I could follow? Thanks for your help. – vzste Oct 22 '19 at 20:22

reticulate imports seem to work with the future package and plan(multicore). I tried plan(multisession), which failed.
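A rough sketch of what this could look like, assuming a Unix-alike system where forking is available and assuming Python has not already been initialised in the parent session. The variable names are illustrative, and the import is done inside each future so that it happens in the forked process:

```{r}
library(future)

# Forked workers; not available on Windows, and forking may be
# disabled by default when running inside RStudio.
plan(multicore, workers = 2)

# Launch one future per input; the Python module is imported in the worker.
fs <- lapply(list(2, 3, 4), function(x) {
  future({
    math <- reticulate::import("math")
    math$sqrt(x)
  })
})

# Collect the results (plain R numerics) in the parent process.
res <- lapply(fs, value)

plan(sequential)  # restore the default plan
```

Where multicore is not supported, future falls back to sequential evaluation, so the code still runs but without parallelism.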

Till

I was able to get this to work with the doRNG package running in a multicore server environment. The key seems to be not to load reticulate until after the process has been forked. I put together some hypothetical code below:

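```{r}
library(doRNG)
library(doParallel)

# Assumption: a fork-based backend on a Unix-alike system.
# The core count and nsims are placeholder values.
registerDoParallel(cores = 2)
nsims <- 4

result <- foreach(i = 1:nsims) %dorng% {
  # Touch reticulate only inside the forked worker.
  reticulate::virtualenv_create("test")
  reticulate::use_virtualenv("test", require = TRUE)
  reticulate::py_config()
  output <- NULL

  # rest of code ...

  return(output)
}
```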

Eifer