
We might expect the functions in the random module to generate the same results for a given seed, but in fact that is not always exactly the case, e.g. here.

The following results are expected to be reproducible, but on one occasion something different was unexpectedly generated. Why?

Unfortunately, I neither logged the differing output nor have been able to reproduce the problem so far.

import random  # Python 3.8.2

random.seed(0)
rng = range(30)
seq = [random.sample(rng, random.randint(0, 3)) for _ in rng]
print(seq)

There are only a few statements about pseudo-random reproducibility. What are the potential pitfalls of using random.* if absolute reproducibility is required?

sof
  • Is this your entire code which produced the problem, or did you note the problem in a larger program where you were doing threading? Note that reproducibility cannot be guaranteed with threading because the order of calls to the underlying generator can vary based on thread scheduling. – pjs May 13 '20 at 18:16
  • The larger program produced a different result only once. No threading. No IO except *print*. The aforementioned random seq was the sole source of input that led to the result. – sof May 13 '20 at 18:43
  • If you can't reproduce it and we can't reproduce it then we probably should close the question as not reproducible. At least you can enjoy the irony that the question itself concerns reproducibility. Note that you are using the global instance of the Random class. Any other part of your program that touches this will affect the reproducibility of the sequence. If that is a concern then just instantiate your own Random instance. – President James K. Polk May 13 '20 at 18:44
  • The random module had been imported and used nowhere else. Sounds fair to close it. – sof May 13 '20 at 19:13
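
The suggestion in the comments to instantiate a separate Random instance could look roughly like the sketch below. The helper name make_seq is hypothetical and only wraps the code from the question; the point is that a private generator is not affected by other code calling the module-level functions.

```python
import random


def make_seq(seed):
    # A private generator: other code that calls random.* cannot advance its state.
    rng = random.Random(seed)
    population = range(30)
    return [rng.sample(population, rng.randint(0, 3)) for _ in population]


first = make_seq(0)
# Use the shared module-level generator in between; the private one is unaffected.
random.random()
second = make_seq(0)
print(first == second)  # True
```

With the module-level functions, the extra random.random() call between the two runs would only be harmless because random.seed(0) is called again; with a private instance the isolation does not depend on that.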

1 Answer


The questioner on your linked question misunderstood what reproducibility means. That questioner expected that random.sample(population, x+y) would start with the same elements as random.sample(population, x), followed by y additional elements. There is no such requirement.

Reproducibility means that if you perform the same sequence of RNG calls with the same seed, you get the same output. random.sample(population, x+y) is not the same call as random.sample(population, x), so there is no requirement that they produce the same output.
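
As a small illustration of that distinction, here is a sketch (the function name run and the variable names are made up for this example). It checks that an identical sequence of calls with the same seed gives identical output, and then compares two different sample calls without assuming anything about the outcome.

```python
import random


def run(seed):
    # Same seed and same sequence of RNG calls on every invocation.
    rng = random.Random(seed)
    population = range(30)
    return [rng.sample(population, rng.randint(0, 3)) for _ in population]


# Same seed, same calls: the outputs match.
same_calls_match = run(0) == run(0)
print(same_calls_match)  # True

# Different calls with the same seed: sample(..., 3) versus sample(..., 5).
shorter = random.Random(0).sample(range(30), 3)
longer = random.Random(0).sample(range(30), 5)
# Whether `longer` starts with `shorter` is an implementation detail, not a guarantee.
print(longer[:3] == shorter)
```

The second comparison may well print True for some population and sample sizes; the point is only that nothing promises it.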


As for the random module's actual reproducibility guarantees, the docs say the following:

Notes on Reproducibility

Sometimes it is useful to be able to reproduce the sequences given by a pseudo random number generator. By re-using a seed value, the same sequence should be reproducible from run to run as long as multiple threads are not running.

Most of the random module’s algorithms and seeding functions are subject to change across Python versions, but two aspects are guaranteed not to change:

  • If a new seeding method is added, then a backward compatible seeder will be offered.

  • The generator’s random() method will continue to produce the same sequence when the compatible seeder is given the same seed.

Reproducibility is guaranteed from run to run, but not across Python versions, except for the random() call with a compatible seeder.
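
A minimal sketch of sticking to the part covered by that guarantee: seed explicitly and draw only with random(). It assumes that seed(..., version=1), which the docs describe as provided for reproducing sequences from older Python versions, is the compatible seeder meant here; the helper name first_floats and the seed string are made up.

```python
import random


def first_floats(seed, count=3):
    rng = random.Random()
    # version=1 selects the older seeding algorithm for str and bytes seeds
    # (assumed here to be the "compatible seeder" the docs refer to).
    rng.seed(seed, version=1)
    # Only random() is used, since that is the method named in the guarantee.
    return [rng.random() for _ in range(count)]


a = first_floats("experiment-42")
b = first_floats("experiment-42")
print(a == b)  # True
```

Derived methods such as sample() or randint() are outside this guarantee, so values built from them may differ between Python versions even with the same seed.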

user2357112