2

I am trying to generate 1000 numbers using exponential distribution with parameter 1.

After setting the seed value to 1, I tried both rexp(1000, 1) and replicate(1000, rexp(1, 1)), but the medians of the resulting two vectors are different.

I expected the vectors generated by the two expressions to be the same, because they were both sampled from the same exponential distribution under the same seed value.

What is the difference between rexp(1000, 1) and replicate(1000, rexp(1, 1))? Which should I use in practice?

Here is the code that I tried:

> options(digits = 2)  
> set.seed(1)
> 
> a <- rexp(1000, 1)   
> b <- replicate(1000, rexp(1, 1))
> 
> median(a)   
[1] 0.73   
> median(b)   
[1] 0.68
Mikael Jagan
  • 9,012
  • 2
  • 17
  • 48
Kana
  • 21
  • 3

1 Answers1

5

The problem here is that the random seed changes after it is used, so your seed of 1 is different when you generate b. You have to reset the seed before you create b if you want it to be the same as a

set.seed(1)

a <- rexp(1000, 1)

set.seed(1)

b <- replicate(1000, rexp(1, 1))

median(a)
#> [1] 0.7346113

median(b)
#> [1] 0.7346113

As for which you should use, it is definitely rexp(1000, 1), because this generates a single call to the underlying C code as opposed to 1000 calls. Although we can see from above that the two codes generate the same results, a simple benchmark shows that rexp is about 50 times faster.

microbenchmark::microbenchmark(a = rexp(1000, 1), 
                               b = replicate(1000, rexp(1, 1)))
#> Unit: microseconds
#>  expr      min        lq       mean   median       uq       max neval cld
#>     a   32.501   33.5005   34.54794   34.101   34.701    42.301   100  a 
#>     b 1503.402 1539.0010 2043.20113 1569.451 1646.901 10051.202   100   b

Created on 2023-02-27 with reprex v2.0.2

Allan Cameron
  • 147,086
  • 7
  • 49
  • 87