10

I'm having trouble determining which variant of Mersenne Twister C++11 provides. Looking at Matsumoto and Nishimura ACM paper at Mersenne twister: A 623 Dimensionally Equidistributed Uniform Pseudorandom Number Generator, the authors provide the algorithm, an implementation of the algorithm, and call it MT19937.

However, when I test C++11's same-named generator with the small program below, I cannot reproduce the stream created by Matsumoto and Nishimura's MT19937. The streams differ from the very first 32-bit word produced.

Which Mersenne Twister does C++11 provide?


The program below was run on Fedora 22 using GCC, -std=c++11 and GNU's stdlibc++.

std::mt19937 prng(102013);
for (unsigned int i = 0; i <= 625; i++)
{
    cout << std::hex << prng();

    if(i+1 != 625)
        cout << ",";

    if(i && i%8 == 0)
        cout << endl;
}
jww
  • 97,681
  • 90
  • 411
  • 885
  • If you look at the [header](http://www.boost.org/doc/libs/release/boost/random/mersenne_twister.hpp) in Boost.Random, they state *The seeding from an integer was changed in April 2005 to address a [weakness](http://www.math.sci.hiroshima-u.ac.jp/~m-mat/MT/MT2002/emt19937ar.html)*. Could it be that you're comparing results from a paper published before that change was made? – Praetorian Oct 08 '15 at 17:05
  • @Praetorian - Well, I'm not sure, but I don't believe so. I'm not using Boost; rather, I am using GNU's implementation via `libstdc++`. – jww Oct 08 '15 at 17:07
  • It uses http://www.math.sci.hiroshima-u.ac.jp/~m-mat/MT/MT2002/CODES/mt19937ar.c. IOW, what @Praetorian linked to. – T.C. Oct 08 '15 at 17:08
  • Well, the Boost implementation was the precursor to `std::tr1::mt19937`, which then became `std::mt19937` in C++11, so the Boost comments are likely very relevant. You should follow the second link in my earlier comment and compare against the output presented in there. – Praetorian Oct 08 '15 at 17:10

3 Answers3

4

Looking at the MT19937 from your the linked to paper and the MT19937 defined by the standard it looks like they are the same but an additional layer of tempering was added and an initialization multiplier

If we look at the values defined by [rand.predef] 26.5.5(3) vs the parameters defined by the paper we have

32,624,397,31,0x9908b0df,11,0xffffffff,7,0x9d2c5680,15,0xefc60000,18,1812433253 <- standard
w ,n  ,m  ,r ,a         ,u ,d         ,s,b         ,t ,c         ,l ,f  
32,624,397,31,0x9908b0df,11,          ,7,0x9d2c5680,15,0xefc60000,18,           <- paper

This is where the difference is coming from. Also according to the standard the 10,000th iteration of std::mt19937 is 399268537

NathanOliver
  • 171,901
  • 28
  • 288
  • 402
  • For the 32-bit version, the `0xffffffff` value for `d` means that no change actually happens. – T.C. Oct 08 '15 at 17:20
  • The original paper used 69069 as the multiplier (1812433253 is part of the AR revision and the standard). – jww Oct 08 '15 at 19:47
  • I can't help but feel C++ should *not* have used the name `MT19937` because its *not* using the original parameters. `EMT19937`, `MT19937ar`, `eMT19937ar`, etc would have been better choices. In fact, it looks like ["AR" (for "Array") is what the authors use to differentiate this change](http://www.math.sci.hiroshima-u.ac.jp/~m-mat/MT/MT2002/emt19937ar.html) from the original. – jww Oct 12 '15 at 00:36
  • @jww I still think it should be called mt19937 as it is the Mersenne Twister and it does have a period of 2^19937 – NathanOliver Oct 12 '15 at 12:04
1

It seems C++11 provides Mersenne Twister with improved initialization

I've just extracted original C implementation, and compared with C++.

#include <iostream>
#include <cstdio>
#include <random>

#define N 624
#define M 397
#define MATRIX_A 0x9908b0dfUL   /* constant vector a */
#define UPPER_MASK 0x80000000UL /* most significant w-r bits */
#define LOWER_MASK 0x7fffffffUL /* least significant r bits */

static unsigned long mt[N]; /* the array for the state vector  */
static int mti=N+1; /* mti==N+1 means mt[N] is not initialized */

void init_genrand(unsigned long s)
{
    mt[0]= s & 0xffffffffUL;
    for (mti=1; mti<N; mti++) {
        mt[mti] =
        (1812433253UL * (mt[mti-1] ^ (mt[mti-1] >> 30)) + mti);
        mt[mti] &= 0xffffffffUL;
    }
}

unsigned long genrand_int32()
{
    unsigned long y;
    static unsigned long mag01[2]={0x0UL, MATRIX_A};

    if (mti >= N) { /* generate N words at one time */
        int kk;

        if (mti == N+1)   /* if init_genrand() has not been called, */
            init_genrand(5489UL); /* a default initial seed is used */

        for (kk=0;kk<N-M;kk++) {
            y = (mt[kk]&UPPER_MASK)|(mt[kk+1]&LOWER_MASK);
            mt[kk] = mt[kk+M] ^ (y >> 1) ^ mag01[y & 0x1UL];
        }
        for (;kk<N-1;kk++) {
            y = (mt[kk]&UPPER_MASK)|(mt[kk+1]&LOWER_MASK);
            mt[kk] = mt[kk+(M-N)] ^ (y >> 1) ^ mag01[y & 0x1UL];
        }
        y = (mt[N-1]&UPPER_MASK)|(mt[0]&LOWER_MASK);
        mt[N-1] = mt[M-1] ^ (y >> 1) ^ mag01[y & 0x1UL];

        mti = 0;
    }

    y = mt[mti++];

    y ^= (y >> 11);
    y ^= (y << 7) & 0x9d2c5680UL;
    y ^= (y << 15) & 0xefc60000UL;
    y ^= (y >> 18);

    return y;
}

int main()
{
    init_genrand(102013);

    std::mt19937 prng(102013);

    for (size_t i = 0; i < 10000; ++i) {
       if (genrand_int32() != prng()) {
          std::cout << "ERROR" << std::endl;
          return 1;
       }
    }

    std::cout << "OK" << std::endl;
    return 0;
}
Stas
  • 11,571
  • 9
  • 40
  • 58
  • So it sounds like they are providing `MT19937ar`, and not `MT19937`. Is that correct? – jww Oct 08 '15 at 19:37
  • @jww No, the correct name is `MT19937`. It seems `MT19937ar` is just a name of the file, where they added array initialization. – Stas Oct 08 '15 at 19:43
  • @jww Also, see the list of C implementations: http://www.math.sci.hiroshima-u.ac.jp/~m-mat/MT/VERSIONS/C-LANG/c-lang.html There were two versions (1998 and 1999) that are outdated now. – Stas Oct 08 '15 at 19:47
1

I should point out that C++11 actually provides many Mersenne twisters via the template class:

template <class UIntType,
          size_t word_size,
          size_t state_size,
          size_t shift_size,
          size_t mask_bits,
          UIntType xor_mask,
          size_t tempering_u,
          UIntType tempering_d,
          size_t tempering_s,
          UIntType tempering_b,
          size_t tempering_t,
          UIntType tempering_c,
          size_t tempering_l,
          UIntType initialization_multiplier>
  class mersenne_twister_engine;

If anyone has the bravery to explore these levers and knobs... Of course there are the two standard instantiations of these:

using mt19937
  = mersenne_twister_engine<uint_fast32_t,
                            32,
                            624,
                            397,
                            31,
                            0x9908b0df,
                            11,
                            0xffffffff,
                            7,
                            0x9d2c5680,
                            15,
                            0xefc60000,
                            18,
                            1812433253>;

and the 64-bit version:

using mt19937_64
  = mersenne_twister_engine<uint_fast64_t,
                            64,
                            312,
                            156,
                            31,
                            0xb5026f5aa96619e9,
                            29,
                            0x5555555555555555,
                            17,
                            0x71d67fffeda60000,
                            37,
                            0xfff7eee000000000,
                            43,
                            6364136223846793005>;

I've thought it would be nice to provide a toolbox for checking the quality of RNGs so people could try new instantiations.

Here is a comparison of template parms:

32,624,397,31,        0x9908b0df,11,        0xffffffff,7 ,        0x9d2c5680,15,        0xefc60000,18,1812433253          <- std::mt19937
64,312,156,31,0xb5026f5aa96619e9,29,0x5555555555555555,17,0x71d67fffeda60000,37,0xfff7eee000000000,43,6364136223846793005 <- std::mt19937_64
w ,n  ,m  ,r ,a                 ,u ,d                 ,s ,b                 ,t ,c                 ,l ,f  
32,624,397,31,        0x9908b0df,11,                  ,7 ,        0x9d2c5680,15,        0xefc60000,18,                    <- paper

With thanks to @NathanOliver.

emsr
  • 15,539
  • 6
  • 49
  • 62