0

I need a PRNG for a simulation project, and found a resource which, given my limited but not nonexistent knowledge of PRNG's seems sound and well-informed. I'm trying to encapsulate the algorithms given in this report in a class, but for some reason I get a lot of repeated values. It might just be that the PRNG isn't as good as the report states it is, but I suspect that it's rather something in my implementation that fails.

Background: PRNG code from the report

The following code sample is given on page 3:

/* Public domain code for JKISS RNG */
// seed variables
static unsigned int x = 123456789,y = 987654321,z = 43219876,c = 6543217;

unsigned int JKISS()
{
    unsigned long long t;
    x = 314527869 * x + 1234567;
    y ^= y << 5; y ^= y >> 7; y ^= y << 22;
    t = 4294584393ULL * z + c; c = t >> 32; z = t;
    return x + y + z;
}

In connection with it, the report claims that

The period of JKISS is ≈2 127 = 1.7x10 38 (2 32 x (2 32 -1) x (1/2 * 4294584393 x 2 32 - 1)) and it passes all of the Dieharder tests and the complete BigCrunch test set in TestU01.

so this definitely seems good enough to me. Later in the report (page 6), it is stated

The following C code generate [sic] a random (double precision) floating point number 0 <= x < 1:

double x;
x = JKISS() / 4294967296.0;

My encapsulation of this in a class

I have the following in a header file:

class JKISS : public IPRNG {
private:
    // Seed variables
    static unsigned int x;
    static unsigned int y;
    static unsigned int z;
    static unsigned int c;


public:
    static unsigned int randui32();
    static double randdouble();
};

with the following implementation file

#include "prng.hpp"

unsigned int JKISS::x = 123456789;
unsigned int JKISS::y = 987654321;
unsigned int JKISS::z = 43219876;
unsigned int JKISS::c = 6543217;

unsigned int JKISS::randui32() {
    unsigned long long t;
    x = 314527869 * x + 1234567;
    y ^= y << 5; y ^= y >> 7; y ^= y << 22;
    t = 4294584393ULL * z + c; c = t >> 32; z = t;
    return x + y + z;
}

double JKISS::randdouble() {
    return randui32() / 4294967296.0;
}

and the following main program

#include <iostream>
#include "prng.hpp"

int main() {
    for (int i = 0; i < 10000; ++i) {
        std::cout << JKISS::randdouble() << std::endl;
    }
}

As you can see, I've copy-pasted most of the code.

However, when I run this, I get 68 repeated values, even though I'm just fetching 10 000 values. This suggests to me that something is wrong with my encapsulation, but I can't figure out what the problem is.

In case it matters, I'm running GCC 4.8.1 on Ubuntu 13.10 with the following specs:

Platform Info:
  System: Linux (x86_64-linux-gnu)
  CPU: Intel(R) Xeon(R) CPU           E5410  @ 2.33GHz
  WORD_SIZE: 64

Any ideas as to what could cause this are most welcome.

Tomas Aschan
  • 58,548
  • 56
  • 243
  • 402
  • After some warm-up, I've been able to decrease the number of repeated values. If I discard 105 values, I get down to 49 repeated - if I discard 109, I get 46 repeated. I don't know if it will saturate or if I can get it all the way down to 0 just by warming up more, but I feel it shouldn't be needed more than this. – Tomas Aschan Apr 24 '14 at 20:16
  • Your class would be reusable if you removed "static" and initialized x, y, z, & c in a ctor. – brian beuning Apr 24 '14 at 20:26
  • Random should have occasional duplicates, or it would not be random. I agree 68 out of 10,000 does seem a little high. – brian beuning Apr 24 '14 at 20:27
  • @brianbeuning: Sure, I'll probably make all of it instance variables and methods rather than class stuff in the final version, but for now I wanted to mimic the original code as closely as possible to see why it's performing poorly. Since the integer distribution *should* converge to a uniform distribution over 4294967295 (approx 10^9) values, it will take *very* long to converge if as much as 0.5% are repeated. – Tomas Aschan Apr 24 '14 at 20:34
  • I put it on ideone and got repeats but then changed return type to double and no repeats: http://ideone.com/hGknvh – 001 Apr 24 '14 at 20:36
  • @JohnnyMopp: Hm. You still got very much fewer repeats than I do (only a handful), when you got anything at all. I'm beginning to think this could be system-dependent, but I can't find the specs of IdeOne anywhere on their site. Could it be that the code in the report has other sizes for `unsigned int` and `unsigned long` than I do? If so, how do I ensure that the algorithm uses the correct sizes of variables? – Tomas Aschan Apr 24 '14 at 21:00
  • What format is the output of the floating-point values? If it's six significant figures (I think the default for %f) then you should expect more repeats because there are fewer possible _display_ values. – sh1 Apr 25 '14 at 16:43

4 Answers4

1

You are returning a float, while the original code generates a double.

danielschemmel
  • 10,885
  • 1
  • 36
  • 58
0

What for? What makes you think that some random page on the 'web will give you a better RNG than the ones built-in to your C++ standard library, which provides all sorts of nifty features packaged up and ready for use? Are you really in position to avaluate if what you found is better for your purposes?

vonbrand
  • 11,412
  • 8
  • 32
  • 52
  • There's no reason to trust a random generator just because it's built-in, especially if you want to ensure portability (which means you can't even trust which *version* of the built-in PRNG will be available). If you don't believe me, take a look at the test results from [TestU01](http://dl.acm.org/citation.cfm?doid=1268776.1268777) (the standard test suite for PRNGs) for a number of common algorithms - surprisingly many of them, including the defaults in Java and Matlab as well as the variants of Unix-random, fail a lot of the tests that the algorithm I'm trying to implement claims to pass. – Tomas Aschan Apr 24 '14 at 21:14
  • I don't know if that paper is available in fulltext everywhere in the world - I'm accessing it from my university network at the moment. However, if you can access it, there's a table on pages 28-29 with the summary of test results. – Tomas Aschan Apr 24 '14 at 21:15
0

I get the same results as you do, but no duplicates ;p

What I mean is that if I run your program as directed I get 68 duplicates, but if I switch randdouble() => randui32() there are no more duplicates, so I'll bet they were an artifact due to the precision of the printing routine.

You could try to collect the doubles in an array and compare to make 100% sure.

loreb
  • 1,327
  • 1
  • 7
  • 6
  • Comparing the values by means of a `std::map`, incrementing the value for each randomly generated key by one, indeed gave 1 duplicate for 1e5 values, rather than 68. Much more reasonable. Good catch! – Tomas Aschan Apr 29 '14 at 09:11
0

It looks like the printed output is only six significant figures, which means that you have something like 1000000 unique display values (this is oversimplified -- there are more possibilities but the distribution is skewed). In effect you print 10000 values from a pool of 1000000. I think 68 duplicates is reasonable in that situation.

sh1
  • 4,324
  • 17
  • 30
  • I'm not entirely certain I linked the right function, by the way. It gave me a predicted 49.83 collisions, but I don't know the correct value for d. – sh1 Apr 25 '14 at 17:23