For debugging purposes, I need to generate the same random sequence in every thread of a given block using the CUDA random number library, cuRAND.
I tried a zero seed and a zero sequence number, with both the Mersenne Twister and XORWOW generators, but I still get different sequences when the block has a different number of threads.
For example, with curand_init(0, 0, 0, &state) and one thread, I get two numbers:
0.442526 0.809567
With the same initialization code but two threads, I get:
0.446065 0.730273
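For reference, here is a minimal sketch of the kind of setup I mean, assuming a per-thread XORWOW state (curandState) and the device API from curand_kernel.h. The kernel name draw_two and the output buffer are illustrative only, not my exact code.

```cuda
#include <cstdio>
#include <curand_kernel.h>

// Hypothetical kernel: every thread initializes its own state with the
// same seed, subsequence and offset, then draws two uniform floats.
__global__ void draw_two(float *out)
{
    curandState state;             // default XORWOW state, private to this thread
    curand_init(0, 0, 0, &state);  // seed = 0, subsequence = 0, offset = 0
    out[2 * threadIdx.x]     = curand_uniform(&state);
    out[2 * threadIdx.x + 1] = curand_uniform(&state);
}

int main()
{
    const int max_threads = 2;     // assumption: compare 1 thread against 2 threads
    float h_out[2 * max_threads];
    float *d_out = nullptr;
    cudaMalloc(&d_out, sizeof(h_out));

    for (int n = 1; n <= max_threads; ++n) {
        draw_two<<<1, n>>>(d_out);  // one block of n threads
        // cudaMemcpy on the default stream waits for the kernel to finish.
        cudaMemcpy(h_out, d_out, 2 * n * sizeof(float), cudaMemcpyDeviceToHost);
        for (int t = 0; t < n; ++t) {
            printf("threads=%d thread=%d: %f %f\n",
                   n, t, h_out[2 * t], h_out[2 * t + 1]);
        }
    }

    cudaFree(d_out);
    return 0;
}
```

It launches the same kernel with one thread and then with two, so the values drawn by each thread can be compared side by side.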
Given that I do not really care which engine is used for now, how can I get the same random sequence for given seed parameters, regardless of the number of threads in the block?