
I need to run Monte Carlo simulations in parallel on different machines. The code is in C++, but the program is set up and launched with a Python script that sets up a lot of things, in particular the random seed. The function setseed takes a 4-byte unsigned integer.

Using a simple

import time
setseed(int(time.time()))

is not very good, because I submit the jobs to a queue on a cluster; they stay pending for some minutes and then start, but the start time is unpredictable, so two jobs can easily start at the same time (the same second). So I switched to:

setseed(int(time.time()*100))

but I'm not happy with it. What is the best solution? Maybe I can combine information from the time, the machine id and the process id (see the sketch below), or maybe the best solution is to read from /dev/random (these are Linux machines)?
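Something along these lines is what I have in mind for the combination idea (just a sketch; zlib.crc32 is used here only to fold everything into 32 bits):

import os
import socket
import time
import zlib

# combine start time, host name and process id, then fold the result into 32 bits
tag = "%f-%s-%d" % (time.time(), socket.gethostname(), os.getpid())
setseed(zlib.crc32(tag.encode("ascii")) & 0xFFFFFFFF)  # force into 0 .. 2**32 - 1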

How to read 4 bytes from /dev/random?

f = open("/dev/random","rb")
f.read(4)

gives me a string, but I want an integer!

Ruggero Turra
  • You haven't actually said what constitutes "best". I take it that you are trying to ensure that each instance uses a different seed. But should they be unique between different jobs in a single run, or do you need something approaching (or guaranteed to be) global uniqueness (all runs and all jobs)? Secondly, do you ever need to be able to repeat a run with the same seeds (sometimes helpful when debugging intermittent problems)? And there may be other complications. – dmckee --- ex-moderator kitten Mar 07 '10 at 21:24
  • I want a random seed for every instance, so if the seed is in the range 0 to 2^(8*4)-1 it's very probable that the seeds are different for every instance. I don't force the seeds to be different, even if it might be better if they were; I don't think it's a very big problem. I don't need to repeat runs with the same seed. – Ruggero Turra Mar 08 '10 at 07:50
  • Well, that's the easy case and you have good answers already. Cheers. – dmckee --- ex-moderator kitten Mar 08 '10 at 17:23

4 Answers


Reading from /dev/random is a good idea. Just convert the 4-byte string into an integer:

f = open("/dev/random","rb")
rnd_str = f.read(4)

Either using struct:

import struct
rand_int = struct.unpack('I', rnd_str)[0]

Update: the uppercase I is needed; it is the struct format code for an unsigned int, whereas a lowercase i would give a signed one.

Or multiply and add:

# build the integer byte by byte, most significant byte first;
# ord() is needed because iterating a Python 2 str yields one-character strings
rand_int = 0
for c in rnd_str:
    rand_int <<= 8
    rand_int += ord(c)
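Putting it together (sketch only; setseed stands for the seeding function of your launch script):

import struct

with open("/dev/random", "rb") as f:   # /dev/urandom also works and never blocks
    rnd_str = f.read(4)

setseed(struct.unpack('I', rnd_str)[0])  # unsigned 32-bit integer, 0 .. 2**32 - 1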
ebo

You could simply copy the four bytes over into an integer; that should be the least of your worries.

But parallel pseudo-random number generation is a rather complex topic and very often not done well. Usually you generate seeds on one machine and distribute them to the others.
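A rough sketch of that approach (the job count and file names below are only placeholders): generate distinct 32-bit seeds once on the submit host and hand one to each job, for example through one small file per job.

# generate_seeds.py -- run once on the submit host (sketch only)
import random

N_JOBS = 100                       # placeholder: number of jobs you submit
master = random.Random(20100307)   # fixed master seed makes the seed list reproducible

seeds = set()
while len(seeds) < N_JOBS:         # draw until there are N_JOBS distinct seeds
    seeds.add(master.randrange(2**32))

for i, seed in enumerate(sorted(seeds)):
    with open("seed_%03d.txt" % i, "w") as f:  # job i reads its own file at start-up
        f.write("%d\n" % seed)

Each job then reads its file and passes the value to setseed; since the master seed is fixed, you can also rerun with exactly the same seeds if you ever need to.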

Take a look at SPRNG, which handles exactly your problem.

Joey

If this is Linux or a similar OS, you want /dev/urandom -- it always produces data immediately.

/dev/random may stall waiting for the system to gather randomness. It does produce cryptographic-grade random numbers, but that is overkill for your problem.
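For example (os.urandom reads from the same kernel source, so there is no file handling; the struct format is the one from the answer above):

import os
import struct

# four bytes from the kernel's random pool, as an unsigned 32-bit integer
setseed(struct.unpack('I', os.urandom(4))[0])  # setseed from your launch script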

Daniel Newby

You can use a random number as the seed. This has the advantage of being operating-system agnostic (no /dev/random needed), with no conversion from string to int: why not simply use

random.randrange(-2**31, 2**31)

as the seed of each process? Slightly different starting times give wildly different seeds this way (the random module seeds itself automatically, from os.urandom where available and otherwise from the current time).

Alternatively, you could use the random.jumpahead method, if you know roughly how many random numbers each process is going to use (the documentation of random.WichmannHill.jumpahead is useful).
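A minimal sketch of this (using the range 0..2**32-1, since setseed expects a 4-byte unsigned integer):

import random

# random seeds itself automatically, so each process starts from a different state
setseed(random.randrange(2**32))  # the value fits a 4-byte unsigned integer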

Eric O. Lebigot