25

I need to do some simulations and for debugging purposes I want to use set.seed to get the same result. Here is the example of what I am trying to do:

library(foreach)
library(doMC)
registerDoMC(2)

set.seed(123)
a <- foreach(i=1:2,.combine=cbind) %dopar% {rnorm(5)}
set.seed(123)
b <- foreach(i=1:2,.combine=cbind) %dopar% {rnorm(5)}

Objects a and b should be identical, i.e. sum(abs(a-b)) should be zero, but this is not the case. Am I doing something wrong, or have I stumbled onto some feature?

I am able to reproduce this on two different systems, with R 2.13 and R 2.14.

Ferdi
mpiktas

4 Answers

20

My default answer used to be "well then don't do that" (using foreach) as the snow package does this (reliably!) for you.

But as @Spacedman points out, Renaud's new doRNG is what you are looking for if you want to remain with the doFoo / foreach family.

The real key, though, is a clusterApply-style call that sets the seeds on all nodes, and does so in a fashion that is coordinated across streams. Oh, and did I mention that snow by Tierney, Rossini, Li and Sevcikova has been doing this for you for almost a decade?

Edit: And while you didn't ask about snow, for completeness here is an example from the command-line:

edd@max:~$ r -lsnow -e'cl <- makeSOCKcluster(c("localhost","localhost"));\
         clusterSetupRNG(cl);\
         print(do.call("rbind", clusterApply(cl, 1:4, \
                                             function(x) { stats::rnorm(1) } )))'
Loading required package: utils
Loading required package: utils
Loading required package: rlecuyer
           [,1]
[1,] -1.1406340
[2,]  0.7049582
[3,] -0.4981589
[4,]  0.4821092
edd@max:~$ r -lsnow -e'cl <- makeSOCKcluster(c("localhost","localhost"));\
         clusterSetupRNG(cl);\
         print(do.call("rbind", clusterApply(cl, 1:4, \
                                             function(x) { stats::rnorm(1) } )))'
Loading required package: utils
Loading required package: utils
Loading required package: rlecuyer
           [,1]
[1,] -1.1406340
[2,]  0.7049582
[3,] -0.4981589
[4,]  0.4821092
edd@max:~$ 
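
The same idea can be sketched with the parallel package that ships with R (2.14 and later), which is an assumption here since the transcript above uses snow and rlecuyer. clusterSetRNGStream() gives each worker its own L'Ecuyer-CMRG stream derived from one seed; draw_on_cluster is a hypothetical helper name used only for illustration.

```r
library(parallel)

# Hypothetical helper: start a two-worker socket cluster, seed one
# L'Ecuyer-CMRG stream per worker from `seed`, and draw one normal
# variate per task.
draw_on_cluster <- function(seed) {
  cl <- makeCluster(2)
  on.exit(stopCluster(cl))               # always shut the workers down
  clusterSetRNGStream(cl, iseed = seed)  # coordinated seeds on all nodes
  do.call(rbind, clusterApply(cl, 1:4, function(x) stats::rnorm(1)))
}

first  <- draw_on_cluster(123)
second <- draw_on_cluster(123)
identical(first, second)
```

Because clusterApply() hands tasks to workers in a fixed order, two runs with the same seed should see the same streams on the same tasks.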

Edit: And for completeness, here is your example combined with what is in the docs for doRNG

R> library(foreach)
R> library(doMC)
Loading required package: multicore

Attaching package: ‘multicore’

The following object(s) are masked from ‘package:parallel’:

    mclapply, mcparallel, pvec

R> registerDoMC(2)
R> library(doRNG)
R> set.seed(123)
R> a <- foreach(i=1:2,.combine=cbind) %dopar% {rnorm(5)}
R> set.seed(123)
R> b <- foreach(i=1:2,.combine=cbind) %dopar% {rnorm(5)}
R> identical(a,b)
[1] FALSE                     ## ie standard approach not reproducible
R>
R> seed <- doRNGseed()
R> a <- foreach(i=1:2,.combine=cbind) %dorng% { rnorm(5) }
R> b <- foreach(i=1:2,.combine=cbind) %dorng% { rnorm(5) }
R> doRNGseed(seed)
R> a1 <- foreach(i=1:2,.combine=cbind) %dorng% { rnorm(5) }
R> b1 <- foreach(i=1:2,.combine=cbind) %dorng% { rnorm(5) }
R> identical(a,a1) && identical(b,b1)
[1] TRUE                      ## all is well now with doRNGseed()
R> 
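
As a minimal sketch of the question's own pattern, the following assumes a doRNG release in which a plain set.seed() call before a %dorng% loop seeds that loop, as the package documentation describes. It is not a transcript of a real session, and it assumes doMC and doRNG are installed on a platform that supports forking.

```r
library(foreach)
library(doMC)
library(doRNG)   # provides %dorng%

registerDoMC(2)

# Seed, run, re-seed, run again: with %dorng% each iteration gets
# its own RNG stream derived from the seed set just before the loop.
set.seed(123)
a <- foreach(i = 1:2, .combine = cbind) %dorng% { rnorm(5) }
set.seed(123)
b <- foreach(i = 1:2, .combine = cbind) %dorng% { rnorm(5) }

identical(a, b)
```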
Dirk Eddelbuettel
  • Thanks for example with snow. I am not well versed in intricacies of parallel programming in R, so I started using `foreach` for its painless transition from non-parallel code to parallel. I knew that I was missing something. – mpiktas Dec 02 '11 at 18:14
  • 2
    Well, that's why we all started years ago with snow as the transition from the standard *apply() functions to parallel ones was easy :) – Dirk Eddelbuettel Dec 02 '11 at 18:41
8

Using set.seed(123, kind = "L'Ecuyer-CMRG") also does the trick and does not require an extra package:

set.seed(123, kind = "L'Ecuyer-CMRG")
a <- foreach(i=1:2,.combine=cbind) %dopar% {rnorm(5)}

set.seed(123, kind = "L'Ecuyer-CMRG")
b <- foreach(i=1:2,.combine=cbind) %dopar% {rnorm(5)}

identical(a,b)
# TRUE
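
One likely reason this works with doMC is that doMC runs iterations in forked processes via mclapply(), and mclapply() hands each child a L'Ecuyer-CMRG stream derived from the parent's seed when that generator is selected. The sketch below shows this with the parallel package directly, on the assumption that forking is available (so not on Windows). Backends that use socket workers do not inherit the parent's seed this way, which may explain the doParallel report in the comments below.

```r
library(parallel)

# Select the generator that supports multiple streams.
RNGkind("L'Ecuyer-CMRG")

# With mc.set.seed = TRUE (the default), each forked child receives
# its own stream derived from the seed set in the parent process.
set.seed(123)
a <- mclapply(1:2, function(i) rnorm(5), mc.cores = 2)
set.seed(123)
b <- mclapply(1:2, function(i) rnorm(5), mc.cores = 2)

identical(a, b)
```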
Konrad Rudolph
enricoferrero
  • This answer is far simpler than Dirk Eddelbuettel's answer. Does it have any drawbacks? – generic_user May 08 '17 at 17:25
  • 2
    I too was drawn to the simplicity of this solution, but I can't reproduce the result (using R 3.4.0 on Windows 7, `doParallel` 1.0.11, and `foreach` 1.4.3) – whopper510 May 16 '18 at 15:55
  • This obviously *doesn’t* work. The RNG needs to be re-seeded between calls. I’ve taken the liberty of fixing this answer (even though I changed its meaning) since [it’s been misleading people](https://stackoverflow.com/q/64242123/1968). – Konrad Rudolph Oct 07 '20 at 10:32
  • Well, it obviously *used to work* in 2017! Thanks for updating the code. – enricoferrero Oct 08 '20 at 13:37
5

Is the doRNG package any use to you? I suspect your problem is due to two threads both splatting the random seed vector:

http://ftp.heanet.ie/mirrors/cran.r-project.org/web/packages/doRNG/index.html

Spacedman
  • Thanks for your answer. I really would have liked to mark both as accepted, but Dirk's answer was more extensive. I've upvoted your answer nevertheless, since it contains enough information to solve my problem. – mpiktas Dec 02 '11 at 18:16
4

For more complicated loops, you might have to call set.seed() inside the body of the foreach loop:

library(foreach)
library(doMC)
registerDoMC(2)
library(doRNG)

set.seed(123)
a <- foreach(i=1:2,.combine=cbind) %dopar% {
  create_something <- c(1, 2, 3)
  rnorm(5)
}
set.seed(123)
b <- foreach(i=1:2,.combine=cbind) %dopar% {
  create_something  <- c(4, 5, 6)
  rnorm(5)
}
identical(a, b)
# FALSE

versus

a <- foreach(i=1:2,.combine=cbind) %dopar% {
  create_something  <- c(1, 2, 3)
  set.seed(123)
  rnorm(5)
}
b <- foreach(i=1:2,.combine=cbind) %dopar% {
  create_something  <- c(4, 5, 6)
  set.seed(123)
  rnorm(5)
}
identical(a, b)
# TRUE
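
One caveat with a fixed seed inside the loop body is that every iteration restarts the generator at the same point, so all iterations return the same draws. A possible workaround, sketched here with plain lapply() so that it does not depend on any backend, is to derive the seed from the loop index. seed_by_index and base_seed are hypothetical names, and seeds chosen this way make runs repeatable but are not guaranteed to give statistically independent streams.

```r
# Same seed in every iteration: both iterations yield identical vectors.
same_seed <- lapply(1:2, function(i) {
  set.seed(123)
  rnorm(5)
})
identical(same_seed[[1]], same_seed[[2]])

# Hypothetical variant: seed from the loop index, so repeated runs match
# while the iterations within one run differ.
seed_by_index <- function(i, base_seed = 123) {
  set.seed(base_seed + i)
  rnorm(5)
}
run1 <- lapply(1:2, seed_by_index)
run2 <- lapply(1:2, seed_by_index)
identical(run1, run2)
identical(run1[[1]], run1[[2]])
```

The same function body could be placed inside a foreach loop, with i taken from the loop variable.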
K Bro