Unexpected behavior when using R sample function with rpy2?

Question

I need to cross-validate some R code in Python. My code generates a lot of pseudo-random numbers, so to make the comparison easier I decided to use rpy2 to generate those values "from R" in my Python code.

As an example, in R, I have:

set.seed(1234)
runif(4)
[1] 0.1137034 0.6222994 0.6092747 0.6233794

In python, using rpy2, I have:

import rpy2.robjects as robjects

# Look up the R functions once so they can be called like Python functions
set_seed = robjects.r("set.seed")
runif = robjects.r("runif")

set_seed(1234)
print(runif(4))
[1] 0.1137034 0.6222994 0.6092747 0.6233794

as expected (the values are identical). However, I see strange behavior with the R sample function (the counterpart of numpy.random.choice).

As the simplest reproducible example, I have in R:

set.seed(1234)
sample(5)
[1] 1 3 2 4 5

while in python I have:

sample = robjects.r("sample")
set_seed(1234)
print(sample(5))
[1] 4 5 2 3 1

The results are different. Could anyone explain why this happens and/or provide a way to get the same values in R and Python using the R sample function?

score 1 · Accepted Answer · answered Jan 15 '21 at 11:06

If you print the value of the R function RNGkind() in both situations, I suspect you won't get the same answer. The Python result looks like the default output, while your R result looks like the old buggy output.
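A minimal sketch of that comparison from the Python side. The helper names (`rng_kind_from_rpy2`, `rng_kind_mismatches`) and the two example tuples are made up for illustration; the only assumption about R is that `RNGkind()` returns a three-element character vector (kind, normal.kind, sample.kind), as it does in R 3.6.0 and later.

```python
from typing import Dict, Sequence, Tuple

# Position of each component in the character vector returned by R's RNGkind().
RNG_FIELDS = ("kind", "normal.kind", "sample.kind")


def rng_kind_from_rpy2() -> Tuple[str, ...]:
    """Return RNGkind() of the R process embedded by rpy2 (needs rpy2 and R)."""
    # Imported lazily so the comparison helper below also works without rpy2.
    import rpy2.robjects as robjects

    return tuple(robjects.r("RNGkind()"))


def rng_kind_mismatches(
    kind_a: Sequence[str], kind_b: Sequence[str]
) -> Dict[str, Tuple[str, str]]:
    """Map each RNG field name to its (a, b) values wherever the two settings differ."""
    return {
        field: (a, b)
        for field, a, b in zip(RNG_FIELDS, kind_a, kind_b)
        if a != b
    }


# Hypothetical values: what an R session still using the old sampler would
# report, versus the defaults of R >= 3.6.0.
standalone_r = ("Mersenne-Twister", "Inversion", "Rounding")
embedded_r = ("Mersenne-Twister", "Inversion", "Rejection")
print(rng_kind_mismatches(standalone_r, embedded_r))
```

In practice you would paste the output of `RNGkind()` from your R console as `standalone_r` and use `rng_kind_from_rpy2()` for the other side; a non-empty result points at the setting that differs.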

For example, in R:

set.seed(1234, sample.kind = "Rejection")
sample(5)
#> [1] 4 5 2 3 1
set.seed(1234, sample.kind = "Rounding")
#> Warning in set.seed(1234, sample.kind = "Rounding"): non-uniform 'Rounding'
#> sampler used
sample(5)
#> [1] 1 3 2 4 5
set.seed(1234, sample.kind = "default")
sample(5)
#> [1] 4 5 2 3 1

Created on 2021-01-15 by the reprex package (v0.3.0)

So it looks to me as though you are still using the old "Rounding" method in your R session. You probably saved a workspace a long time ago and have been reloading it ever since. Don't do that; start with a clean workspace each session.
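To see why the warning calls the "Rounding" sampler non-uniform, here is a toy sketch in Python. It is not R's actual implementation: it assumes a deliberately tiny generator with only 8 equally likely raw values, so the effect shows up for n = 5, whereas in real R the bias only matters for very large populations. All names here are illustrative.

```python
from collections import Counter

BITS = 3               # toy generator: only 2**3 = 8 equally likely raw values
N = 5                  # we want a uniform draw from {0, ..., 4}
RAW = range(2 ** BITS)


def rounding_draw(raw: int) -> int:
    """Scale the raw value to [0, 1) and truncate, i.e. floor(N * raw / 2**BITS)."""
    return (N * raw) // (2 ** BITS)


def rejection_draw(raw: int):
    """Keep the raw value only if it is already in range; None means 'draw again'."""
    return raw if raw < N else None


# Enumerate every raw value once, which is the same as weighting them equally.
rounding_counts = Counter(rounding_draw(raw) for raw in RAW)
rejection_counts = Counter(
    rejection_draw(raw) for raw in RAW if rejection_draw(raw) is not None
)

print(sorted(rounding_counts.items()))   # some outcomes are hit twice, others once
print(sorted(rejection_counts.items()))  # every outcome is hit exactly once
```

Because the two methods consume the random stream differently, the same seed gives a different permutation under each of them, which matches the two outputs in the question.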

That was it! Thank you so much. – Regis, Jan 15 '21 at 11:29

score 0 · Answer 2 · answered Jan 15 '21 at 10:27


Maybe give this a shot (taken from another Stack Overflow answer). Quoting that answer: "The p argument corresponds to the prob argument in the sample() function"

import numpy as np
np.random.choice(a, size=None, replace=True, p=None)


thehand0

