5

How does Python seed the Mersenne Twister pseudorandom number generator used in the built-in random library if no explicit seed value is provided? Is it based on the clock somehow? If so, is the seed determined when the random module is imported or when it is first called?

Python's documentation does not seem to have the answer.

DanielTuzes

4 Answers

3

In modern versions of Python (cf. http://svn.python.org/projects/python/branches/release32-maint/Lib/random.py), Random.seed first tries to use 32 bytes read from os.urandom (backed by /dev/urandom on Unix-like systems). If that is not available, it falls back to the current time. Here a is the optional argument for seeding the PRNG explicitly:

    if a is None:
        try:
            a = int.from_bytes(_urandom(32), 'big')
        except NotImplementedError:
            import time
            a = int(time.time() * 256) # use fractional seconds
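
The time-based fallback is far coarser than the OS source. Here is a minimal sketch of why, assuming only the expression `int(time.time() * 256)` quoted above. The helper name `time_seed` and the timestamps are hypothetical, chosen for illustration:

```python
def time_seed(timestamp: float) -> int:
    """Mirror the fallback expression int(time.time() * 256) for a given timestamp."""
    return int(timestamp * 256)


# Two hypothetical timestamps one millisecond apart.
t0 = 1_700_000_000.000
t1 = 1_700_000_000.001

# Both fall inside the same 1/256-second window, so the seeds collide.
same_window = time_seed(t0) == time_seed(t1)
print(same_window)

# Moving forward by a full 1/256 second advances the seed by exactly one.
print(time_seed(t0 + 1 / 256) - time_seed(t0))
```

With this fallback, two generators created within about 4 ms of each other could start from the same seed, which is one reason the OS randomness source is tried first.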
bks
2

From this answer, I found the source of random.py. In the Random class, the seed is set when the object is constructed. The module instantiates one Random object and uses it for all of the module-level functions. So if the random number is produced by random.random() or another module-level function, the seed was set when the module was imported. If the random number is produced by a different Random instance, the seed was set when that instance was constructed.

From the source:

    # Create one instance, seeded from current time, and export its methods
    # as module-level functions.  The functions share state across all uses
    #(both in the user's code and in the Python libraries), but that's fine
    # for most programs and is easier for the casual user than making them
    # instantiate their own Random() instance.
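
The shared-instance behaviour can be observed from the outside. This is a small sketch, assuming CPython, where the module-level functions are bound methods that expose their owner through `__self__`. The variable names are illustrative:

```python
import random

# The module-level functions are bound methods of one hidden Random instance.
shared = random.random.__self__
same_owner = shared is random.seed.__self__
print(same_owner, isinstance(shared, random.Random))

# A separate instance is seeded when it is constructed, independently of the
# shared one. Seeding both with the same value makes them agree.
own = random.Random(2024)
random.seed(2024)
values_match = own.random() == random.random()
print(values_match)
```

Note that `random.seed(2024)` reseeds the shared generator for the whole process, so this is only suitable for experiments.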
isaach1000
2

The seed is based on the clock or (if available) an operating system source. The random module creates (and hence seeds) a shared Random instance when it is imported, not when first used.

References

Python docs for random.seed:

random.seed(a=None, version=2)

Initialize the random number generator.

If a is omitted or None, the current system time is used. If randomness sources are provided by the operating system, they are used instead of the system time (see the os.urandom() function for details on availability).

Source of random.py (heavily snipped):

    from os import urandom as _urandom

    class Random(_random.Random):

        def __init__(self, x=None):
            self.seed(x)

        def seed(self, a=None, version=2):
            if a is None:
                try:
                    a = int.from_bytes(_urandom(32), 'big')
                except NotImplementedError:
                    import time
                    a = int(time.time() * 256) # use fractional seconds

    # Create one instance, seeded from current time, and export its methods
    # as module-level functions.  The functions share state across all uses
    #(both in the user's code and in the Python libraries), but that's fine
    # for most programs and is easier for the casual user than making them
    # instantiate their own Random() instance.

    _inst = Random()

The last line is at the top level, so it is executed when the module is loaded.
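
One way to see the load-time seeding in action is to execute random.py a second time as a separate module object. This is a rough sketch, assuming CPython with random.py available as a source module. The helper name `load_fresh_random` is hypothetical:

```python
import importlib.util


def load_fresh_random():
    """Execute random.py again as a separate module object (not placed in sys.modules)."""
    spec = importlib.util.find_spec("random")
    module = importlib.util.module_from_spec(spec)
    spec.loader.exec_module(module)
    return module


first = load_fresh_random()
second = load_fresh_random()

# No random function has been called on either copy, yet each already holds a
# fully populated generator state, and the two states differ.
states_differ = first.getstate() != second.getstate()
print(states_differ)
```

Each load runs the top-level `_inst = Random()` line again, so each copy gets its own freshly seeded generator.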

tom
  • By following the Wikipedia article, the first state variable should be 0 (and the next should be 1) if the seed 0 is used (as a single uint32). However, seeding with 0 and passing version=1 doesn't make the first state variable 0 in Python's random. Do you know why? Try it out: online-python.com/Pxp4KycsC – DanielTuzes Nov 02 '21 at 13:22
  • @DanielTuzes There are two ways to initialise the state: using [init_genrand](https://github.com/python/cpython/blob/e346f196819aeb02a8a94205ce3e1536c4c2f105/Modules/_randommodule.c#L183) (integer seed, corresponding to the initialisation described in Wikipedia with the magic number 1812433253), or using *init_by_array* (array as seed, more complex). Python always uses *init_by_array*, even if the supplied seed is an integer (there are no references to *init_genrand* other than in *init_by_array*), so the initialisation doesn't correspond to Wikipedia. – tom Nov 03 '21 at 15:33
  • I didn't know what `seed` from [`random.py`](https://github.com/python/cpython/blob/main/Lib/random.py#L163) accesses, thanks for helping! How could I see that `seed` calls only that (to help myself next time)? It is strange, though, because numpy.random(uint32) uses Wikipedia's init method. – DanielTuzes Nov 03 '21 at 21:32
  • I saw that the implementation is in a C module – the class Random in *random.py* [inherits](https://github.com/python/cpython/blob/fd0c84dc28d0/Lib/random.py#L103) from a class called Random in the module `_random` (by [convention](https://www.python.org/dev/peps/pep-0008/#package-and-module-names) accompanying C modules have a leading underscore). I located the C module and worked backwards: the init functions were documented at the top; I saw that *init_genrand* matched Wikipedia (the magic number 1812433253 helped), and did a global search for "init_genrand" to find the callers. – tom Nov 04 '21 at 17:17
  • (This was easier than tracing forward from `super().seed(a)` to where [the int is converted to an array](https://github.com/python/cpython/blob/e346f196819a/Modules/_randommodule.c#L311).) – tom Nov 04 '21 at 17:17
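
The observation in these comments can be checked directly. This is a short sketch, assuming CPython's `getstate()` layout of 624 state words followed by a position index. In the reference implementation, init_by_array ends by forcing the first state word to 0x80000000, which is why seeding with 0 never leaves a 0 there:

```python
import random

# Internal state right after seeding with 0: 624 words plus the position index.
state = random.Random(0).getstate()[1]
first_word, position = state[0], state[-1]

print(hex(first_word), position)
```

The first word comes out as 0x80000000 rather than 0, and the position index is 624 because no numbers have been drawn yet.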
0

The other answers are correct, but I want to summarize a point from the comments above that is easy to miss and that I tracked down today:

The typical reference implementations of Mersenne Twister take a seed and then internally (usually in the constructor) call this.init_genrand(seed).

If you do that with a simple integer seed, you will get different results from Python's, and probably wonder why, as I did.

To get the same results in another language (Node.js in my case) as in Python, you need an implementation that supports the init_by_array method, and you must initialize it with init_by_array([seed]).

This applies when the seed is a simple 32-bit integer. If your seed is something else, Python passes it differently (for example, integers wider than 32 bits are split into 32-bit chunks, one per array element), but this should at least point you in the right direction.
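
For anyone porting this, here is a pure-Python sketch of the two initialisation routines, written from the reference mt19937ar algorithm and compared against CPython's actual state. `init_genrand` and `init_by_array` follow the reference names. `seed_to_key` is a hypothetical helper that splits a non-negative integer into 32-bit chunks, least significant first, which is my reading of how CPython builds the array:

```python
import random

N = 624
MASK = 0xFFFFFFFF


def init_genrand(seed):
    """Reference single-integer initialisation (the one described on Wikipedia)."""
    mt = [seed & MASK]
    for i in range(1, N):
        prev = mt[i - 1]
        mt.append((1812433253 * (prev ^ (prev >> 30)) + i) & MASK)
    return mt


def init_by_array(key):
    """Reference array initialisation, which CPython uses even for integer seeds."""
    mt = init_genrand(19650218)
    i, j = 1, 0
    for _ in range(max(N, len(key))):
        prev = mt[i - 1]
        mt[i] = ((mt[i] ^ ((prev ^ (prev >> 30)) * 1664525)) + key[j] + j) & MASK
        i += 1
        j += 1
        if i >= N:
            mt[0] = mt[N - 1]
            i = 1
        if j >= len(key):
            j = 0
    for _ in range(N - 1):
        prev = mt[i - 1]
        mt[i] = ((mt[i] ^ ((prev ^ (prev >> 30)) * 1566083941)) - i) & MASK
        i += 1
        if i >= N:
            mt[0] = mt[N - 1]
            i = 1
    mt[0] = 0x80000000
    return mt


def seed_to_key(seed):
    """Hypothetical helper: split abs(seed) into 32-bit words, low word first."""
    n = abs(seed)
    key = []
    while n:
        key.append(n & MASK)
        n >>= 32
    return key or [0]


def python_state(seed):
    """The 624 state words CPython holds right after seeding with an integer."""
    return list(random.Random(seed).getstate()[1][:N])


print(init_by_array(seed_to_key(12345)) == python_state(12345))
print(init_by_array(seed_to_key(2**40 + 7)) == python_state(2**40 + 7))
print(init_genrand(0)[:2])
```

The first two comparisons cover a 32-bit seed and a wider one. The last line shows the 0, 1 start that init_genrand alone would give for seed 0.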

The Node.js implementation I ended up using was https://gist.github.com/banksean/300494, and it worked beautifully. I could not find one on npm with the support I needed; I might have to add one.

taxilian