Random Number Generation : same C++ code, two different behaviors

Question

My colleague and I are working on a Monte Carlo project together, in C++. She uses Visual Studio, I use Xcode, we shared the code through git. We are computing American option prices thanks to a given method requiring random number generation. We realized we were getting wrong results for a certain parameter K (the higher the parameter, the more wrong the answer), and my colleague found that changing the random source for Mersenne Twister to rand() (though poor generator) made the results good for the whole range of K.

But when I changed the source on my version of the code, it did nothing.

More puzzling for me, I created a new Xcode project, copied inside it her whole source and it still gives me wrong results (while she gets good ones). So it can't stem from the code itself. I cleaned the project, relaunched Xcode, even restarted my computer (...), but nothing changes : our projects behaves consistently but differently, with the same code behind. (EDIT: by differently but consistently, I don't mean that we don't have the same sequence of numbers. I mean that her Monte Carlo estimator converges toward 4. and mine towards 3.)

Do you have any idea of what the cause of this dual behavior could be ?

Here is the random generation code :

double loiuniforme() //uniform law
{
    return (double)((float)rand() / (float)RAND_MAX);
}


vector<double> loinormale() //normal law
{
    vector<double> loinormales(2, 0.);

    double u1 = loiuniforme();
    double v1 = loiuniforme();

    loinormales[0] = sqrt(-2 * log(u1))*cos(2 * M_PI*v1);
    loinormales[1] = sqrt(-2 * log(u1))*sin(2 * M_PI*v1);

    return(loinormales);

}

EDIT : the MT RNG used before was :

double loiuniforme()
{
    mt19937::result_type seed = clock();
    auto real_rand = std::bind(std::uniform_real_distribution<double>(0,1), mt19937(seed));
    return real_rand();
}

that's right. we're clairvoyant. We can just 'see' your code and deduce whats wrong with it without seeing it... — Richard Hodges, Apr 25 '16 at 22:40
I'll take different operating systems and libraries have different random number generators for 98312, Alex. — David Hoelzer, Apr 25 '16 at 22:41
Compare your random generators and report to us. Are the numbers given by `rand()` the same in both places? — Dialecticus, Apr 25 '16 at 22:47
What are you expecting? The same generated sequence of random numbers? — Galik, Apr 25 '16 at 22:52
@Dialecticus I don't know (but I can try to find) but on average they should behave the same, shouldn't they ? I basically always get something between 3.00 and 3.05 and she gets something between 4.05 and 4.10 — user3640096, Apr 25 '16 at 22:52
That does sound like too large a difference to be just a matter of different random number generators, especially if you are finding that `rand` gives "good" results and something fancier doesn't. But using `rand` is a really bad idea -- it may be, and on some systems is, a really terrible RNG. How certain are you that the results with `rand` are actually right? I mean, is there any chance that your idea of what they "should" be is wrong? — Gareth McCaughan, Apr 25 '16 at 22:54
@Galik no, of course, but mathematically the random numbers should behave the same way on average (unless I'm dead wrong, if we all simulated numbers following gaussian(0,1) law and summed them, we would all find 0 on average, not .5 for some and -1.2 for others) — user3640096, Apr 25 '16 at 22:54
@GarethMcCaughan I'm implementing Roger's dual method for pricing options, and in his article he gave the results one should approximately get (and confidence intervals). And being thus "forced" to use rand hurts me deeply, I know it's really a bad generator ;-) — user3640096, Apr 25 '16 at 22:56
One way in which swapping (e.g.) the Mersenne Twister for `rand` could hurt you is if, for instance, `RAND_MAX` is smaller than the largest value the MT can return but you continue to use `RAND_MAX` to normalize your numbers when using the MT. Or, more generally, if there's some other such API difference between `rand` and the MT and your code is assuming `rand`'s API. — Gareth McCaughan, Apr 25 '16 at 22:56
What happens if you swap in some *other* RNG implementations? E.g., on Xcode you could try `random()` and `arc4random`. Do they give "good" or "bad" results? — Gareth McCaughan, Apr 25 '16 at 22:59
The specification for mersenne twister gives the exact algorithm; all implementations should produce the same sequence if you start them off with the same seed. Chances are the problem comes from something else. — Pete Becker, Apr 25 '16 at 23:00
I think all the answers along the lines of "different RNGs give different numbers, duh" are missing the point here, which is that (1) the questioner is seeing bigger differences between implementations than can plausibly be explained by that, and (2) the potentially-really-crappy `rand` is giving plausible results while fancier RNGs like the Mersenne Twister are giving wrong-looking results. To me, that suggests that there's another problem besides the mere fact that different RNGs give different results. — Gareth McCaughan, Apr 25 '16 at 23:02
@GarethMcCaughan bad results for random. Could there be something in Xcode , or a classical code mistake, that has kept traces of my previous computations, and make the program return computations based on these previous results ? (say, I have computed 1+2, now I try 2+3 and it gives me 3 because it's still using 1 and 2, whereas the same variables on the other computer are 2 and 5) — user3640096, Apr 25 '16 at 23:03
I will be astonished if this is actually any part of the problem, but: It's kinda weird to be converting to float and then to double; that will almost certainly be throwing away bits and making your random numbers worse than they need be. I suggest that `loiuniforme` should say something more like `return rand() / (double)RAND_MAX);`. — Gareth McCaughan, Apr 25 '16 at 23:04
Possible duplicate of [Why does the C++ stdlib rand() function give different values for the same seed across platforms?](http://stackoverflow.com/questions/15109427/why-does-the-c-stdlib-rand-function-give-different-values-for-the-same-seed) — Havenard, Apr 25 '16 at 23:05
Well, there *could* be almost anything :-). E.g., if your compiler lets you build "release" and "debug" versions and for some reason you've accidentally built one and run the other, you might be running an old version of the code. But it's hard to diagnose that sort of thing without being able to see more than a tiny fraction of the code... — Gareth McCaughan, Apr 25 '16 at 23:05
@GarethMcCaughan it might be an explanation, but on her computer, the same float <-> double conversion happens. But i'll keep the advice in mind anyway ! — user3640096, Apr 25 '16 at 23:06
Havenard: no, this is obviously not what is bothering the questioner. The question is not of the form "I ran the same program using random numbers on two platforms and one said 20, 23, 1, 68 while the other said 5, 99, 2, 71"; it's of the form "I averaged a hundred uniform(0,1) random numbers on two platforms and got 0.53 on one and 0.88 on the other". Of course it's possible that user3640096 is wrong to think that the discrepancy is too big to be explained by ordinary variation between RNGs, but I wouldn't bet that way. — Gareth McCaughan, Apr 25 '16 at 23:07
If that's of any usefulness, MSVC++ `rand()` is peculiar, since it has a `RAND_MAX` of 32767 (most other libraries nowadays have something like 2^31). I didn't look thoroughly to your code to see if this can matter, but it's a thing that bit me in the past, so hey, maybe it can be useful. — Matteo Italia, Apr 25 '16 at 23:08
So let me repeat my question about whether it's possible that the better RNG is returning random numbers in a larger range but the `loiuniforme` function's denominator isn't being updated to account for that... — Gareth McCaughan, Apr 25 '16 at 23:08
@GarethMcCaughan Yes, of course, but I'm not sure this thread would be happy to digest my entire code ;-) but thanks for your help — user3640096, Apr 25 '16 at 23:08
No, I understand that there are many good reasons for not just dumping your entire codebase into the question. It might be worth considering whether there's a subset of it, larger than the fragment already in the question but small enough to be manageable, that might be informative. (The answer might well be no. Or it might be "yes, but figuring out *what* fragment is exactly as hard as finding whatever bug is causing this".) — Gareth McCaughan, Apr 25 '16 at 23:10
Just to make sure I understand the situation. *Building with Visual Studio*, you get "good" results with `rand` and "bad" results with the Mersenne Twister. *Building with Xcode*, you get "bad" results with both. Is that right? — Gareth McCaughan, Apr 25 '16 at 23:12
@GarethMcCaughan regarding the better RNG, I actually didn't use any denominator, I added the code. And for your last question, yes, indeed. — user3640096, Apr 25 '16 at 23:13
Ooooo. Take a look at this: http://stackoverflow.com/questions/14023880/c11-random-numbers-and-stdbind-interact-in-unexpected-way and see whether it might be relevant. — Gareth McCaughan, Apr 25 '16 at 23:13
(What did your code with `random()` look like? I'm guessing more like the code with `rand()` did.) — Gareth McCaughan, Apr 25 '16 at 23:15
Do you have all compiler warnings turned on? It may be something like an uninitialized variable that on your system had a randomly large value but on your colleagues system is zero initialized or something. — Galik, Apr 25 '16 at 23:17
Wait a minute! You seeded the Mersenne twister *each time* you call the function? How about using the same object and seeding it only once? — Dialecticus, Apr 25 '16 at 23:21
@GarethMcCaughan I have checked the loiuniforme results with MT, they "look" random (unlike the "with bind" results in the post) — user3640096, Apr 25 '16 at 23:22
You shouldn't keep re-seeding the generator, you should create only one `mt19937` and keep using that. — molbdnilo, Apr 25 '16 at 23:23
@Galik I have the warning turned on, yes, but it doesn't mention anything about an uninitialized variable — user3640096, Apr 25 '16 at 23:23
@molbdnilo it might be a silly question, but is it in mt19937::result_type seed = clock(); that I am creating a mt19937 object ? If so, I should do it in the main as a global variable or something like that ? — user3640096, Apr 25 '16 at 23:25

score 3 · Answer 1 · answered Apr 25 '16 at 22:59

The C++ standard does not specify what algorithm is used by rand(). Whoever wrote the compiler is free to use whatever implementation they want, and there is no guarantee that it will behave the same on two different compilers, on two different architectures or even two different versions of the same compiler.

score 2 · Answer 2 · answered Apr 25 '16 at 23:40

You should only create one generator and use that for every number.

mt19937::result_type seed = clock();

and

mt19937(seed)

create a new generator, with a new seed, every time you call the function.
This causes the randomness to get all twisted.

You can use static variables in the function, since these are initialised on the first call:

double loiuniforme() 
{ 
    static std::mt19937 generator(clock()); 
    static std::uniform_real_distribution<double> distribution(0, 1); 
    return distribution(generator); 
}

(When you're comparing results with your colleague, use the same hardcoded seed to verify that you are getting the same results.)

score 0 · Answer 3 · answered Apr 25 '16 at 22:51

You need to seed the rand function with the same number on both computers. And even then I'm not sure that the underlying code across computers and operating systems will return the same value.

More importantly, if you want identical results, don't use a random function.

Random Number Generation : same C++ code, two different behaviors

3 Answers3