I am trying to combine the parallelization feature of plyr
with a Python function called via reticulate,
but the same seed seems to be used in every parallel instance.
In Python:
# This is called python_script.py
import random
def give_a_rand():
    return random.random()
In R:
library(reticulate)
library(plyr)
library(doMC)
doMC::registerDoMC(cores=10)
reticulate::source_python('/path/to/python_script.py')
After loading the libraries, registering the cores for plyr,
and sourcing the Python script into my R session via reticulate,
we can call the Python function give_a_rand()
directly from R:
> give_a_rand()
[1] 0.896585
We can use plyr to run it many times without parallelizing it:
> aaply(.data=1:10, .margins=1, .fun=function(x){give_a_rand()}, .parallel=F)
1 2 3 4 5 6
0.183420430 0.539790166 0.817348174 0.130959177 0.143210990 0.794048321
7 8 9 10
0.276724929 0.820918953 0.003462523 0.903942433
All is great so far, but when I parallelize the same call, every instance returns the same value (output below). I guess that at some point I need to force the seed of the random number generator so that every instance gets a different one, but how?
> aaply(.data=1:10, .margins=1, .fun=function(x){give_a_rand()}, .parallel=T)
1 2 3 4 5 6 7 8
0.896585 0.896585 0.896585 0.896585 0.896585 0.896585 0.896585 0.896585
9 10
0.896585 0.896585