The algorithm I implemented below is Robert Floyd's well-known sampling algorithm, which returns M random elements out of an array of N elements in total. The algorithm returns a set of elements, but while it runs it needs to search this result set to check whether a newly drawn element has already been added.
It is not possible to loop over the output iterator for this, because the documentation states that an output iterator should be dereferenced only once at each position.
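As background, the sampling step described above can be sketched on plain indices, without any iterators involved. This is only a minimal sketch and not the code from this post: the name floyd_sample_indices is hypothetical, and a std::set is assumed as the container for the "already chosen" check.

```cpp
#include <cassert>
#include <cstddef>
#include <random>
#include <set>

// Hypothetical helper (not part of the original post): choose m distinct
// indices from [0, n) with Floyd's sampling method.
template <typename RandomGenerator>
std::set<std::size_t> floyd_sample_indices(std::size_t n, std::size_t m,
                                           RandomGenerator& g) {
    assert(m <= n);
    std::set<std::size_t> chosen;
    for (std::size_t j = n - m; j < n; ++j) {
        // Draw t uniformly from the closed range [0, j].
        std::uniform_int_distribution<std::size_t> dis(0, j);
        const std::size_t t = dis(g);
        // If t was already chosen, take j instead. All earlier draws were
        // smaller than j, so j cannot be in the set yet.
        if (!chosen.insert(t).second) {
            chosen.insert(j);
        }
    }
    return chosen;
}
```

Each loop iteration adds exactly one new index, so the set holds m distinct indices at the end. The membership test is the step that needs read access to the results collected so far.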
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <iterator>
#include <random>
#include <vector>

template<typename Iter, typename RandomGenerator>
Iter random_element(Iter start, Iter end, RandomGenerator& g) {
    if (start == end) return start;
    std::uniform_int_distribution<> dis(0, std::distance(start, end) - 1);
    std::advance(start, dis(g));
    return start;
}
template<typename Iter>
Iter random_element(Iter start, Iter end) {
    static std::random_device rd;
    static std::mt19937 gen(rd());
    return random_element(start, end, gen);
}
//! @brief Algorithm of Robert Floyd.
template<typename InputIterator, typename OutputIterator>
OutputIterator random_n(InputIterator first, InputIterator last, OutputIterator result, size_t number) {
    // "misuse" the libstdc++ concept-check macros to enforce the iterator requirements stated in the documentation
    typedef typename std::iterator_traits<InputIterator>::value_type ValueType;
    __glibcxx_function_requires(_InputIteratorConcept<InputIterator>);
    __glibcxx_function_requires(_OutputIteratorConcept<OutputIterator, ValueType>);
    __glibcxx_requires_valid_range(first, last);
    if (first == last) return result;
    if (number == 0) return result;
    // note: last - first (and the iterator arithmetic further down) requires random-access iterators
    assert (number <= (last - first));
    // create container to store distances, neither the values themselves nor the iterators
    std::vector<size_t> distance;
    InputIterator j = last - number + 1;
    // in the case of number=1, j will need to be the end of the array, so the full array is searched
    while (j <= last) {
        InputIterator rand_index = random_element(first,j);
        size_t rand = std::distance(first, rand_index);
        if (std::find(distance.begin(), distance.end(), rand) != distance.end()) {
            distance.push_back(std::distance(first,j) - 1);
        } else {
            distance.push_back(rand);
        }
        ++j;
    }
    // fill result container
    for (size_t i = 0; i < distance.size(); ++i) {
        *result = *(first+distance[i]);
        ++result;
    }
    return result;
}
The current solution creates a temporary vector that stores the distances relative to the iterator first and finally fills the result array in one go, using these distances. It looks ugly to me, though. Is there some special iterator construct for coping with the fact that you cannot loop over an output iterator multiple times?
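One possible direction, offered only as a sketch and not as an established idiom: keep the bookkeeping of chosen positions in a separate set and write every element to the output iterator at the moment it is chosen, so the output is never read back. The names random_n_direct and example_pick_three are hypothetical, and the sketch assumes random-access input iterators and a caller-supplied generator.

```cpp
#include <cassert>
#include <cstddef>
#include <iterator>
#include <random>
#include <unordered_set>
#include <vector>

// Hypothetical variant of random_n (name made up for this sketch).
// Assumes RandomIt is a random-access iterator.
template <typename RandomIt, typename OutputIt, typename RandomGenerator>
OutputIt random_n_direct(RandomIt first, RandomIt last, OutputIt result,
                         std::size_t number, RandomGenerator& g) {
    typedef typename std::iterator_traits<RandomIt>::difference_type Diff;
    const std::size_t n = static_cast<std::size_t>(std::distance(first, last));
    assert(number <= n);
    std::unordered_set<std::size_t> chosen;  // indices already written out
    for (std::size_t j = n - number; j < n; ++j) {
        std::uniform_int_distribution<std::size_t> dis(0, j);
        std::size_t pick = dis(g);
        if (!chosen.insert(pick).second) {
            // pick was taken before; index j is guaranteed to be unused.
            pick = j;
            chosen.insert(pick);
        }
        // Each output position is assigned exactly once and never read.
        *result = *(first + static_cast<Diff>(pick));
        ++result;
    }
    return result;
}

// Example use: pick 3 of 5 values into a vector through std::back_inserter.
inline std::vector<int> example_pick_three() {
    std::vector<int> values{10, 20, 30, 40, 50};
    std::vector<int> picked;
    std::mt19937 gen(12345);
    random_n_direct(values.begin(), values.end(),
                    std::back_inserter(picked), 3, gen);
    return picked;
}
```

The set replaces the linear std::find over the distance vector, and the second pass that fills the result disappears. The extra storage for the chosen indices remains, so this trades the temporary vector for a temporary set rather than removing the temporary altogether.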