I would like to loop over several random combinations. Currently, I define a vector v with the numbers 0 to n-1 outside the loop, shuffle v inside the loop, and define a new vector combination from the first k elements of v inside the loop.

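#include <algorithm>  // std::shuffle
#include <numeric>    // std::iota
#include <random>     // std::default_random_engine
#include <vector>

int k = 50;
int n = 100;
int sampleSize = 100;
std::vector<int> v(n);
//std::vector<int> combination(k); //Would it be better to declare here?
std::iota(v.begin(), v.end(), 0);  // fills v with 0, 1, ..., n-1
unsigned seed = 42;
// Seed the engine once; re-creating it from the same seed in every
// iteration would apply the same permutation each time.
std::default_random_engine engine(seed);

for (int i = 0; i < sampleSize; i++) {
    std::shuffle(v.begin(), v.end(), engine);
    std::vector<int> combination(v.begin(), v.begin() + k);
}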

It seems weird to me that I define combination again in every iteration of the for loop. Would it make sense to declare combination outside of the for loop and then assign new values to it in every iteration? If so, what would be a good way to assign those new values to combination? So far, I have only used push_back() to append new values to a vector.

Alice Schwarze

1 Answer

There are multiple ways of assigning values in a vector, besides push_back:

  • The [] operator gives you read/write access to an individual element of the vector, so you can write v[5] = 10 (provided that element already exists). Inside a for loop, you can use the loop index to access each element in turn.
  • The = operator copies all the elements from one vector to another.
  • std::copy copies a range of elements.

There are probably many more; these are just the ones I could think of.
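As a rough illustration of the three options above, here is a minimal sketch. The function names (assign_by_index, assign_whole, assign_range) are made up for this example and are not part of the standard library; only operator[], operator= and std::copy are.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// 1. operator[]: overwrite existing elements one index at a time.
//    dst must already hold at least src.size() elements.
void assign_by_index(const std::vector<int>& src, std::vector<int>& dst) {
    assert(dst.size() >= src.size());
    for (std::size_t i = 0; i < src.size(); ++i) {
        dst[i] = src[i];
    }
}

// 2. operator=: replace the contents of dst with a copy of src.
//    dst ends up with the same size as src.
void assign_whole(const std::vector<int>& src, std::vector<int>& dst) {
    dst = src;
}

// 3. std::copy: copy the first count elements of src over the start of dst.
//    Both vectors must hold at least count elements.
void assign_range(const std::vector<int>& src, std::size_t count,
                  std::vector<int>& dst) {
    assert(count <= src.size() && count <= dst.size());
    std::copy(src.begin(), src.begin() + count, dst.begin());
}
```

Note that options 1 and 3 write into elements that already exist, so the destination has to be sized beforehand (for example with std::vector<int> dst(k)), whereas operator= resizes the destination for you.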

Going back to the initial question, what your loop does now is:

  • Create a new vector, which involves allocating memory for it and copying the elements
  • Release the memory of the vector

This happens at each iteration. Even if you declare the vector outside the loop, you still have to copy the elements (probably with something like std::copy). So the penalty of declaring it inside the loop is allocating and releasing the memory at each iteration.
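If you do move the vector outside the loop, it could look roughly like the sketch below. Wrapping the loop in a function called last_combination is purely an assumption for this example (it returns the final combination so the result can be inspected); the point is that combination is allocated once and then overwritten with std::copy in each iteration.

```cpp
#include <algorithm>
#include <cassert>
#include <numeric>
#include <random>
#include <vector>

// Hypothetical helper: draws sampleSize random k-element combinations of
// {0, ..., n-1} and returns the last one drawn.
std::vector<int> last_combination(int n, int k, int sampleSize, unsigned seed) {
    assert(0 <= k && k <= n);

    std::vector<int> v(n);
    std::iota(v.begin(), v.end(), 0);          // 0, 1, ..., n-1

    std::vector<int> combination(k);           // allocated once, reused below
    std::default_random_engine engine(seed);   // seeded once, advanced by each shuffle

    for (int i = 0; i < sampleSize; ++i) {
        std::shuffle(v.begin(), v.end(), engine);
        // Overwrite the k existing elements; no allocation happens here.
        std::copy(v.begin(), v.begin() + k, combination.begin());
        // ... use combination here ...
    }
    return combination;
}
```

Calling combination.assign(v.begin(), v.begin() + k) inside the loop should have a similar effect, since assign typically reuses the vector's existing capacity when it is large enough. Whether either variant is measurably faster than constructing the vector in the loop is something to benchmark in your own setting.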

Technically, it would be more efficient to define it outside the loop. However, the decision on whether to actually place it outside the loop has to consider the tradeoff between the performance improvement you get and the readability penalty you get by defining it outside the loop.

In general, you want the scope of a variable (i.e. the part of the program where the variable can be accessed) to be as small as possible. In this specific case, I would leave the vector in the loop unless this is a performance-critical section and moving it out makes sense for what you are doing (from your snippet, it is not very clear what you want to do with that std::vector inside the loop). That holds as long as the vectors are reasonably small, so that allocating and releasing the memory is not very slow.

Paul92
  • Thanks Paul! I will need to increase `sampleSize` to a few billion, so performance is critical and trumps readability. So I will give std::copy a try. – Alice Schwarze Apr 01 '18 at 06:24