Your algorithm does not generate a uniform distribution.
expandedRands[(100000 * i) + j] = rands[i] * (j / 100000);
First, for each initial random value rands[i] you generate 100,000 values in the range [0, rands[i]). This clearly skews the distribution toward lower values.
Furthermore, every value in the final data is generated from just one of the initial 10 values, and those are all evenly spaced. This leaks quite a bit of information to observers and means they'll be able to guess more values in the final array with a pretty good likelihood of making correct guesses.
Presumably you need to stretch 10 calls to rand() into 1,000,000 quality random numbers because rand() is very slow (and hopefully generates very good random data in return). What I would do under these circumstances is use the results of rand() as nothing more than a seed for a good, deterministic pRNG.
Some code, including C++ facilities for implementing this idea:
// initialize a vector of 10 quality pseudorands [0,RAND_MAX]
int rands[10];
for(int i = 0; i < 10; ++i) { rands[i] = rand(); }
std::seed_seq seeds(begin(rands), end(rands));
// seed_seq is from C++ and performs a standard RNG 'warm-up' sequence
// In other languages you'll simply implement a warm-up sequence yourself.
std::mt19937 eng(seeds);
// mt19937 is an implementation of a standard RNG.
// the seed_seq ensures a good initial state for producing random bits
// You can use whatever standard pRNG algorithm meets your quality/performance/size needs
// For example, if you need something faster and with a smaller state you could use a linear congruential engine such as minstd_rand0
std::uniform_real_distribution<double> dist(0.0, 1.0);
// a C++ object which takes random bits and produces random values with a good distribution.
// there are many different algorithms for doing this
double expandedRands[1'000'000]; // note: 8 MB; in a real program prefer std::vector to avoid overflowing the stack
for (int i = 0; i < 1'000'000; ++i) {
    expandedRands[i] = dist(eng);
}
expandedRands now contains one million values uniformly distributed in the range [0.0, 1.0). Given the same initial 10 random values you will get the same million output values, and any difference in the input should produce quite different output.
If you're stretching rand()'s results because you need something more parallelizable than serialized calls to rand(), then what you can do is use the ten rand() calls to generate seed material, and then use that to seed several independent pRNG engines that could be run on different cores or in independent instances of a GPGPU kernel (if you can implement the pRNG and distribution in CUDA or whatever).
int rands[10];
for (int i = 0; i < 10; ++i) { rands[i] = rand(); }
std::mt19937 eng[10];
for (int i = 0; i < 10; ++i) {
    // mix the engine's index into its seed material; seeding every engine
    // from one reused seed_seq would give them all the exact same state
    std::seed_seq seeds{rands[0], rands[1], rands[2], rands[3], rands[4],
                        rands[5], rands[6], rands[7], rands[8], rands[9], i};
    eng[i].seed(seeds);
}
// now the engines can be used on independent threads.
P.S. I know your code is only pseudo-code, but I've seen a certain mistake in C a fair bit, so just in case you wrote your code this way due to the same misconception about C:
double rands[10] = {rand()};
An initializer list in C does not execute that expression 10 times to initialize each element with a different value. When there are fewer initializers than array elements, each initializer is assigned to its corresponding element (the first initializer to the first element, the second to the second, and so on) and the remaining elements are zero-initialized. So for example:
int x[10] = {0};
will initialize the whole array to zeros, but:
int x[10] = {1};
will initialize the first element to one and then the rest to zero.