I'm looking for a highly efficient way to generate a random `std::bitset` of a set length. I'd also like to be able to influence the probability of `1`s appearing in the result, so that if the probability value is set low enough, only a small percentage of the results will contain a `1` at all, while it remains possible (though very unlikely) to get a result of all `1`s. It's going to be used in a very computation-heavy application, so every possible optimization is welcome.
7

4pie0

Kuba Orlik
-
You might want to look into the new [pseudo-random capabilities in C++11](http://en.cppreference.com/w/cpp/numeric/random). Perhaps create your own distribution if none of the standard ones fits your requirements. – Some programmer dude Aug 07 '14 at 07:23
-
cpu "tick tock" register counter. just query and get a random odd or even number – Tuğrul Aug 07 '14 at 07:37
-
You can use some of the techniques from this java question http://stackoverflow.com/questions/2075912/generate-a-random-binary-number-with-a-variable-proportion-of-1-bits/ – Michael Anderson Aug 07 '14 at 08:36
-
You could first use the [Box-Muller transform](http://en.wikipedia.org/wiki/Box%E2%80%93Muller_transform), which calls `rand()` on average once per random number and generates Gaussian-distributed random numbers, to determine how many bits will be set in your result. You then track the total bits remaining and the bits still to set (to 1), iteratively. On each iteration you call `rand()` (or your PRNG of choice) and `mod` it by the total bits that remain. If the value is less than the number of bits yet to set (as 1s), you set the bit; otherwise you leave it clear (0). – Apriori Aug 07 '14 at 15:18
-
To optimize this you could try to skim multiple random numbers from the result of your PRNG, and/or subdivide the range into independent pieces, maybe have separate pools for `toset` and `totalremaining`, one for each bit in a byte or so. Then compute 8 weighted random bits at once and distribute them into the byte, or something similar. Or that is one idea that occurs to me anyway. – Apriori Aug 07 '14 at 15:24
-
The optimal method will likely depend on the density of zeros. If the density of zeros is very high or very low, it will be easier to optimize than for distributions that are closer to 50% zeros. – Z boson Aug 08 '14 at 07:47
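The two-step idea from Apriori's comments above (first decide how many bits are set, then decide where they go) can be sketched roughly as follows. This is only an illustration under some assumptions: it swaps the suggested Box-Muller step for `std::binomial_distribution`, which draws the count directly, and the function name `random_bitset_two_step` is made up for this example.

```cpp
#include <bitset>
#include <cstddef>
#include <random>

// Hypothetical helper (not from the thread): draw the number of ones from a
// binomial distribution, then place exactly that many ones with selection
// sampling, as described in the comments.
template <std::size_t N, class Gen>
std::bitset<N> random_bitset_two_step(double p, Gen& gen) {
    std::bitset<N> bits;

    // Step 1: how many of the N bits will be 1?
    std::binomial_distribution<long long> count(static_cast<long long>(N), p);
    std::size_t to_set = static_cast<std::size_t>(count(gen));

    // Step 2: walk the positions; set bit i with probability
    // (ones still to place) / (positions still remaining).
    for (std::size_t i = 0; i < N && to_set > 0; ++i) {
        const std::size_t remaining = N - i;
        std::uniform_int_distribution<std::size_t> pick(0, remaining - 1);
        if (pick(gen) < to_set) {
            bits.set(i);
            --to_set;
        }
    }
    return bits;
}
```

It could be called as `std::mt19937 gen(seed); auto b = random_bitset_two_step<64>(0.25, gen);`. It is not obviously faster than drawing each bit independently; it mainly shows the structure that the follow-up comment proposes to optimize.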
1 Answer
8
The Bernoulli distribution is the probability distribution of a single experiment whose outcome is 1 or 0. The sum of n such independent variables, each equal to 1 with probability p, follows a binomial distribution with mean n*p. So by taking n Bernoulli-distributed bits, each with probability p of being 1, we get a bitset of size n with np bits set to 1 on average. Of course, this is just a starting point to optimize further if the efficiency it offers is not enough.
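
    #include <bitset>
    #include <cstddef>
    #include <iostream>
    #include <random>

    // Returns a bitset in which each bit is 1 with probability p, independently.
    template< std::size_t size>
    std::bitset<size> random_bitset( double p = 0.5) {
        // Seed the engine once per thread instead of on every call;
        // constructing std::random_device and std::mt19937 is comparatively costly.
        static thread_local std::mt19937 gen( std::random_device{}());
        std::bernoulli_distribution d( p);

        std::bitset<size> bits;
        for( std::size_t n = 0; n < size; ++n) {
            bits[ n] = d( gen);
        }
        return bits;
    }

    int main()
    {
        for( int n = 0; n < 10; ++n) {
            std::cout << random_bitset<10>( 0.25) << '\n';
        }
    }
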
Result (the output varies from run to run):
1010101001
0001000000
1000000000
0110010000
1000000000
0000110100
0001000000
0000000000
1000010000
0101010000
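When p is small, most of the per-bit draws in the loop above produce 0, so one possible optimization is to draw the gaps between consecutive ones instead of drawing every bit. The sketch below assumes that approach: the length of a run of zeros before the next 1 follows a geometric distribution, which is available as `std::geometric_distribution`. The name `random_bitset_sparse` is hypothetical, and this has not been benchmarked against the version above.

```cpp
#include <bitset>
#include <cstddef>
#include <random>

// Hypothetical helper: each bit is still 1 with probability p independently,
// but the generator is consulted roughly once per set bit rather than once
// per bit, which helps when p is small.
template <std::size_t N, class Gen>
std::bitset<N> random_bitset_sparse(double p, Gen& gen) {
    std::bitset<N> bits;
    if (p <= 0.0) return bits;        // no ones at all
    if (p >= 1.0) return bits.set();  // all ones

    // Number of zeros before the next one (0, 1, 2, ...).
    std::geometric_distribution<long long> gap(p);

    std::size_t i = 0;  // next position that has not been decided yet
    while (true) {
        const unsigned long long skip =
            static_cast<unsigned long long>(gap(gen));
        if (skip >= N - i) break;  // the next one would land past the end
        i += static_cast<std::size_t>(skip);
        bits.set(i);
        if (++i == N) break;
    }
    return bits;
}
```

The expected number of draws is about N*p + 1 instead of N, so the benefit shrinks as p approaches 0.5; for dense bitsets (p close to 1) the same trick could be applied to the zeros and the result inverted.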

4pie0
-
The default C++ Mersenne Twister (`std::mt19937`) uses an internal state of 624 32-bit words. The way you seed it provides 4 bytes of entropy, so you can only reach 2^32 of its roughly 2^19937 possible starting states. This makes it easier to predict the sequence. More problematically, some numbers (1,580,024,992 of them) will never appear at all as the first number. ([Source](http://www.pcg-random.org/posts/cpp-seeding-surprises.html)). The bottom line is that you need as many bytes of entropy as you have state to seed it correctly. – Richard Aug 12 '16 at 15:43
-
@Richard This doesn't mean that the solution presented doesn't work, as your comment suggests. It still works, giving enough accuracy in almost all practical use cases. Your comment should be "To improve entropy, seed the mt this way..." – 4pie0 Aug 12 '16 at 15:55
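For reference, one common way to give the engine more seed material than a single `rd()` call is to fill a buffer of `std::mt19937::state_size` values from `std::random_device` and pass it through `std::seed_seq`. The helper name `make_seeded_mt19937` below is made up for this sketch, and note the caveats: `std::seed_seq` scrambles its input, so this is a widely used idiom rather than a guarantee that every engine state is equally reachable, and the quality of `std::random_device` is implementation-defined.

```cpp
#include <algorithm>
#include <array>
#include <functional>
#include <random>

// Hypothetical helper: seed std::mt19937 with as many random_device values
// as the engine has state words, instead of a single 32-bit value.
inline std::mt19937 make_seeded_mt19937() {
    std::random_device rd;
    std::array<std::random_device::result_type, std::mt19937::state_size> seed_data;
    std::generate(seed_data.begin(), seed_data.end(), std::ref(rd));

    std::seed_seq seq(seed_data.begin(), seed_data.end());
    return std::mt19937(seq);
}
```

The engine would then be created once, for example `static thread_local std::mt19937 gen = make_seeded_mt19937();`, and reused for every bitset, since this seeding is noticeably more expensive than a single `rd()` call.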