Why is it allowed to keep the bottom word of nextLong() signed?

Question

The java.util.Random class has a method nextLong() which itself calls next(32), returning a random signed integer.

public long nextLong() {
    // it's okay that the bottom word remains signed.
    return ((long)(next(32)) << 32) + next(32);
}

How come keeping the bottom word signed does not impact the quality of the randomly generated numbers? When I construct an example, a bit in the middle of the generated long value is zeroed if the bottom word is negative.

final long INTEGER_MASK = 0xFFFFFFFFL;

int upper = Integer.MAX_VALUE;
int bottom = -1;

System.out.printf("%14s %64s%n","Upper:",Long.toBinaryString(((long)upper << 32)));
System.out.printf("%14s %64s%n","Lower:",Long.toBinaryString((long)bottom));

System.out.printf("%14s %64s%n"," Lower Masked:",Long.toBinaryString(((long)bottom)& INTEGER_MASK));

long result = ((long)upper << 32) + bottom;
System.out.printf("%14s %64s%n","Result:",Long.toBinaryString(result));

//Proper
long resultMasked = ((long)upper << 32) + (((long)bottom & INTEGER_MASK));
System.out.printf("%14s %64s%n%n","Masked",Long.toBinaryString(resultMasked));


Upper:  111_1111_1111_1111_1111_1111_1111_1111_0000_0000_0000_0000_0000_0000_0000_0000
Lower: 1111_1111_1111_1111_1111_1111_1111_1111_1111_1111_1111_1111_1111_1111_1111_1111
Lower Mask:                                    1111_1111_1111_1111_1111_1111_1111_1111    
Result: 111_1111_1111_1111_1111_1111_1111_1110_1111_1111_1111_1111_1111_1111_1111_1111
Masked  111_1111_1111_1111_1111_1111_1111_1111_1111_1111_1111_1111_1111_1111_1111_1111

Currently the lower word contributes 33 bits while the upper only has 32 bits. Even if the upper word is negative, it won't wrap around because of the 32-bit shift. I am aware that the Javadoc states:

Because class {@code Random} uses a seed with only 48 bits, this algorithm will not return all possible {@code long} values.

In this case it might not be detrimental but e.g. gmu's MersenneTwister implementation uses exactly this function call. Doesn't this impact the quality of random numbers generated? What am I missing here?

"Currently the lower word contributes 33 bits" - How did you come to this conclusion? The 32nd bit is the sign bit; no 33rd bit exists. — Jacob G., Sep 16 '18 at 01:42
@JacobG. Because the 33rd to 63rd bits are represented by 1's in the upper word example, and yet the 33rd bit of the result is a 0, as seen in the output (Result). — Kilian, Sep 16 '18 at 06:54

score 1 · Answer 1 · edited Jun 20 '20 at 09:12

Currently the lower word contributes 33 bits while the upper only has 32 bits.

...

What am I missing here?

As Jacob G points out, the lower word contributes 32 bits. Not 33 bits. A 32-bit integer is (roughly speaking) 31 bits of precision plus a sign bit.

In this case we are treating those 31 + 1 bits as just bits. So what the code does is take two sequences of 32 uniformly distributed "random" bits and join them together to give one sequence of 64 uniformly distributed "random" bits ... which it then returns as a long.
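
To make the zeroed bit from the question concrete, here is a minimal sketch. The class and method names are made up for illustration and are not part of java.util.Random. When the bottom word is negative, widening it to a long sign-extends it, so the addition effectively subtracts 1 from the upper word (wrapping modulo 2^32) while leaving the low 32 bits untouched. For a fixed bottom word, "subtract 1" is a one-to-one mapping on the upper word, so a uniformly distributed upper word stays uniformly distributed.

```java
public class SignedLowWordDemo {

    // Same combination as in the quoted nextLong(): the bottom word is
    // sign-extended to a long before the addition.
    static long combineSigned(int hi, int lo) {
        return ((long) hi << 32) + lo;
    }

    // Variant from the question that treats the bottom word as unsigned.
    static long combineMasked(int hi, int lo) {
        return ((long) hi << 32) + (lo & 0xFFFFFFFFL);
    }

    public static void main(String[] args) {
        int hi = Integer.MAX_VALUE;
        int lo = -1;

        long signed = combineSigned(hi, lo);

        // The low 32 bits of the result are exactly the bottom word.
        System.out.println((int) signed == lo);

        // The high 32 bits are the upper word minus 1 (a borrow), not a lost bit.
        System.out.println((int) (signed >>> 32) == hi - 1);

        // Equivalently: the signed variant with upper word hi equals the
        // masked variant with upper word hi - 1.
        System.out.println(signed == combineMasked(hi - 1, lo));
    }
}
```

All three checks should print true for the values from the question. The same relation holds for any negative bottom word, including the case where the upper word is Integer.MIN_VALUE, because both the int subtraction and the long addition wrap around.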
