
I am a physicist writing a program that generates a large number (on the order of a few billion) of random numbers drawn from a Gaussian distribution, using C++11. The generation of these numbers is interleaved with an operation that should take very little time. My biggest worry is that generating so many random numbers with such a small time gap between them could lead to sub-optimal performance. I am testing certain statistical properties that rely heavily on the independence of the numbers, so my results are particularly sensitive to such issues. My question is: with the kinds of numbers I use in the code below (a simplified version of my actual code), am I doing something obviously (or even subtly) wrong?

#include <cmath>
#include <iostream>
#include <random>
#include <vector>

using namespace std;

// Several other includes, etc.

int main () {

  int dim_vec(400), nStats(1e8);
  vector<double> vec1(dim_vec), vec2(dim_vec);

  // Initialize the above vectors, which are order 1 numbers.

  random_device rd;
  mt19937 generator(rd());
  double y(0.0);
  double l(0.0);

  for (int i(0);i<nStats;i++)
    {
      for (int j(0);j<dim_vec;j++)
        {
          normal_distribution<double> distribution(0.0,1/sqrt(vec1[j]));
          l=distribution(generator);
          y+=l*vec2[j];
        }
      cout << y << endl;
      y=0.0;
    }
}
Chris Drew
contrariwise

2 Answers

The normal_distribution is allowed to have state. And with this particular distribution, it is common to generate numbers in pairs with every other call, and on the odd calls, return the second cached number. By constructing a new distribution on each call you are throwing away that cache.

Fortunately you can "shape" a single distribution by calling it with different normal_distribution::param_type arguments:

normal_distribution<double> distribution;
using P = normal_distribution<double>::param_type;
for (int i(0); i < nStats; i++)
  {
    for (int j(0); j < dim_vec; j++)
      {
        l = distribution(generator, P(0.0, 1/sqrt(vec1[j])));
        y += l*vec2[j];
      }
    cout << y << endl;
    y = 0.0;
  }

I'm not familiar with all implementations of std::normal_distribution. However I wrote the one for libc++. So I can tell you with some amount of certainty that my slight rewrite of your code will have a positive performance impact. I am unsure what impact it will have on the quality, except to say that I know it won't degrade it.

Update

Regarding Severin Pappadeux's comment below about the legality of generating pairs of numbers at a time within a distribution: See N1452 where this very technique is discussed and allowed for:

Distributions sometimes store values from their associated source of random numbers across calls to their operator(). For example, a common method for generating normally distributed random numbers is to retrieve two uniformly distributed random numbers and compute two normally distributed random numbers out of them. In order to reset the distribution's random number cache to a defined state, each distribution has a reset member function. It should be called on a distribution whenever its associated engine is exchanged or restored.

Howard Hinnant
  • I understand your argument. But taking every other number from a distribution shouldn't affect the quality of the output. If it does, that is a poor implementation in my opinion. – user515430 Dec 01 '14 at 19:15
  • @user515430: Ok. I was admittedly speculating. I'll amend my answer to just the facts I know. – Howard Hinnant Dec 01 '14 at 19:28
  • `And with this particular distribution, it is common to generate numbers in pairs with every other call, and on the odd calls, return the second cached number` Are you sure it is allowed? In case when you use even gaussian, then save generator seed, exit, restore seed and go back to sampling, you'll get a different answer at the end vs the case when you sample all events in one go – Severin Pappadeux Dec 01 '14 at 19:50
  • Read Update and still failed to see how computation continuation problem could be solved. If you stop, save, restore and continue, it would be basically the same as calling reset, and result would be always different from the case where you sample all events in one go. You have to have capability to save distribution (together with generator) state. Anyway, it is probably beyond the problem under discussion... – Severin Pappadeux Dec 02 '14 at 00:18
  • @SeverinPappadeux: You can save the state of the distribution by inserting it into a stream, and restore it by extracting it from a stream. This process is required to save/restore the state of the cache if it exists. It works the same way for all distributions. – Howard Hinnant Dec 02 '14 at 02:02
Some thoughts on top of the excellent HH answer

  1. A normal distribution N(mu, sigma) is generated from N(0, 1) by shift and scale:

N(mu, sigma) = mu + N(0,1)*sigma

If your mean (mu) is always zero, you can simplify and speed up your code (by not adding 0.0) with something like

normal_distribution<double> distribution;
for (int i(0);i<nStats;i++)
{
  for (int j(0);j<dim_vec;j++)
    {
      l  = distribution(generator);
      y += l*vec2[j]/sqrt(vec1[j]);
    }
  cout << y << endl;
  y=0.0;
}
  2. If speed is of utmost importance, I would try to precompute everything I can outside the main 10^8 loop. Is it possible to precompute sqrt(vec1[j]) so you save on the sqrt() call? Is it possible to store vec2[j]/sqrt(vec1[j]) as a single vector?

  3. If it is not possible to precompute those vectors, I would try to save on memory access. Keeping the corresponding elements of vec2 and vec1 together might help by fetching one cache line instead of two. So declare vector<pair<double,double>> vec12(dim_vec); and sample with y += l*vec12[j].first/sqrt(vec12[j].second).

Severin Pappadeux