There have been a large number of parallel RNG questions here, but I couldn't find one that addresses my variant.
I'm writing a function that, given a seed, fills a long array with random numbers derived from that seed. I currently do this serially, but I find that the RNG takes up a significant fraction of my program's running time. I therefore want to speed up the function by using multiple threads, in a way that is transparent to the user: given a seed, one should get the same random number sequence out regardless of how many threads the function uses internally.
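For reference, what I do now is roughly the following (sketched in C++ with std::mt19937; the uniform distribution and the function name are just for illustration):

#include <cstdint>
#include <random>
#include <vector>

// Current serial version: one RNG seeded once, filling the whole array in order.
void fill_random(std::vector<double>& array, std::uint32_t seed)
{
    std::mt19937 rng(seed);                                  // Mersenne Twister
    std::uniform_real_distribution<double> dist(0.0, 1.0);   // placeholder distribution
    for (double& x : array)
        x = dist(rng);
}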
My current idea is to divide the array into chunks (independently of the number of threads) and create a separate RNG for every chunk, for example by seeding each one with seed + chunk_id. The chunks can then be processed independently, and it doesn't matter which thread handles which chunk. But I'm worried that this might reduce the quality of the random numbers. Is this a safe approach for a high-quality RNG like the Mersenne Twister?
To illustrate, here is some pseudo-code for the process:
function random(array, seed, blocksize=100000)
    for each block i of size blocksize in array
        rng[i] = create_rng(seed + i)
    parallel for each block i in array
        for each sample in block i
            array[sample] = call_rng(rng[i])
This should produce the same values for a given (seed, blocksize) combination, no matter how many threads are used. But is this the best way of doing it?
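In case it helps, here is a more concrete sketch of the same idea in C++ (std::mt19937 for the Mersenne Twister, std::thread for the workers; the uniform distribution and function names are just placeholders):

#include <algorithm>
#include <cstdint>
#include <random>
#include <thread>
#include <vector>

// One RNG per block, seeded with seed + block index, so the output depends only
// on (seed, blocksize) and not on the number of threads or their scheduling.
void fill_random_parallel(std::vector<double>& array, std::uint32_t seed,
                          std::size_t blocksize = 100000,
                          unsigned num_threads = std::thread::hardware_concurrency())
{
    if (num_threads == 0) num_threads = 1;
    const std::size_t num_blocks = (array.size() + blocksize - 1) / blocksize;

    // Each worker handles blocks t, t + num_threads, t + 2*num_threads, ...
    // Any other assignment of blocks to threads would give the same result.
    auto worker = [&](unsigned t) {
        for (std::size_t b = t; b < num_blocks; b += num_threads) {
            std::mt19937 rng(seed + static_cast<std::uint32_t>(b));  // per-block RNG
            std::uniform_real_distribution<double> dist(0.0, 1.0);   // placeholder distribution
            const std::size_t begin = b * blocksize;
            const std::size_t end = std::min(begin + blocksize, array.size());
            for (std::size_t i = begin; i < end; ++i)
                array[i] = dist(rng);
        }
    };

    std::vector<std::thread> threads;
    for (unsigned t = 0; t < num_threads; ++t)
        threads.emplace_back(worker, t);
    for (auto& th : threads)
        th.join();
}

As far as I can tell this gives identical output for a fixed (seed, blocksize) regardless of num_threads; my concern is only whether seeding separate Mersenne Twister instances with seed, seed+1, seed+2, ... is statistically sound.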