How to circumvent iteration over an output iterator?

Question

The algorithm I implemented below is Robert Floyd's well-known sampling algorithm, which selects M distinct random elements from an array of N elements. The algorithm returns a set of elements, but while it runs it has to search the partial result set to check whether a newly drawn element has already been added.

Looping over the output iterator is not possible, because the documentation states that each position of an output iterator should be dereferenced at most once.

template<typename Iter, typename RandomGenerator>
Iter random_element(Iter start, Iter end, RandomGenerator& g) {
    if (start == end) return start;
    std::uniform_int_distribution<> dis(0, std::distance(start, end) - 1);
    std::advance(start, dis(g));
    return start;
}

template<typename Iter>
Iter random_element(Iter start, Iter end) {
    static std::random_device rd;
    static std::mt19937 gen(rd());
    return random_element(start, end, gen);
}

//! @brief Algorithm of Robert Floyd.
template<typename InputIterator, typename OutputIterator>
OutputIterator random_n(InputIterator first, InputIterator last, OutputIterator result, size_t number) {
    // "misuse" the libstdc++ concept-check macros to enforce the iterator requirements stated in the documentation
    typedef typename std::iterator_traits<InputIterator>::value_type ValueType;
    __glibcxx_function_requires(_InputIteratorConcept<InputIterator>);
    __glibcxx_function_requires(_OutputIteratorConcept<OutputIterator, ValueType>);
    __glibcxx_requires_valid_range(first, last);

    if (first == last) return result;
    if (number == 0) return result;
    assert (number <= (last - first));

    // store offsets relative to first, rather than the values or the iterators themselves
    std::vector<size_t> distance;
    // note: the iterator arithmetic below needs a random-access iterator, despite the InputIterator name
    InputIterator j = last - number + 1;

    // in the case of number=1, j will need to be the end of the array, so full array is searched
    while (j <= last) {
        InputIterator rand_index = random_element(first,j);
        size_t rand = std::distance(first, rand_index);
        if (std::find(distance.begin(), distance.end(), rand) != distance.end()) {
            distance.push_back(std::distance(first,j) - 1);
        } else {
            distance.push_back(rand);
        }
        ++j;
    }
    // fill result container
    for (size_t i = 0; i < distance.size(); ++i) {
        *result = *(first+distance[i]);
        ++result;
    }
    return result;
}

The current solution creates a temporary vector that stores the offsets relative to the iterator `first`, and only at the end fills the result array in one go using these offsets. It looks ugly to me, though. Is there some special iterator construct for coping with the fact that you cannot make multiple passes over an output iterator?

In hindsight this might actually not be _such_ an ugly solution. `std::find` now iterates over a list of offsets, **not** over a list of **container elements**, so the `==` operator does not need to be implemented for the elements in the container. It would need to be if `std::find` were used on `result` directly. – Anne van Rossum, May 10 '13 at 12:53

score 3 · Accepted Answer · answered May 10 '13 at 12:00

You can tighten the requirements of your algorithm and require a ForwardIterator to point to the output.
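
As a rough illustration of this suggestion, here is a minimal sketch of Floyd's algorithm that takes a ForwardIterator for the output and searches the part of the output range written so far. The name `random_n_forward` and the explicit generator parameter are hypothetical and not part of the question's code. The sketch also assumes that the input is random-access, that the input values are distinct, and that the value type supports `==`.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <iterator>
#include <random>
#include <vector>

// Hypothetical variant of random_n: because the output is a ForwardIterator,
// the range [result_first, out) that has already been written may be re-read.
// Assumptions: RandomIt is random-access, input values are distinct,
// and the value type is comparable with ==.
template<typename RandomIt, typename ForwardIt, typename Rng>
ForwardIt random_n_forward(RandomIt first, RandomIt last,
                           ForwardIt result_first, std::size_t number, Rng& g) {
    typedef typename std::iterator_traits<RandomIt>::difference_type Diff;
    const std::size_t n = static_cast<std::size_t>(std::distance(first, last));
    assert(number <= n);

    ForwardIt out = result_first;
    // Floyd: for j = n - number .. n - 1, draw an index in [0, j].
    for (std::size_t j = n - number; j < n; ++j) {
        std::uniform_int_distribution<std::size_t> dis(0, j);
        RandomIt candidate = first + static_cast<Diff>(dis(g));
        // Second pass over the output written so far: only valid because
        // ForwardIt gives the multi-pass guarantee.
        if (std::find(result_first, out, *candidate) != out) {
            candidate = first + static_cast<Diff>(j);  // already chosen, take element j instead
        }
        *out = *candidate;
        ++out;
    }
    return out;
}
```

With this signature the caller has to provide an output range that already has room for `number` elements, for example a `std::vector<int> out(3)`; a `std::back_inserter` would no longer be an acceptable argument.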

Angew is no longer proud of SO

Does this mean that all algorithms that require an `OutputIterator` will also work fine with a `ForwardIterator`? And in that case, why do we have an `OutputIterator` at all? Never mind, that has apparently been asked before on [stackoverflow](http://stackoverflow.com/questions/14062297/canonical-way-to-define-forward-output-iterator) – Anne van Rossum May 10 '13 at 12:23
There are 5 iterator categories: Input, Output, Forward, Bidirectional, Random-Access. Any Random-Access iterator is also a Bidirectional iterator. Any Bidirectional iterator is also a Forward iterator. Any Forward iterator is also an Input iterator and, if it is mutable, an Output iterator as well. They are simply increasingly demanding concepts. [`std::ostream_iterator`](http://en.cppreference.com/w/cpp/iterator/ostream_iterator) is an Output iterator which is not a Forward iterator. – Angew is no longer proud of SO May 10 '13 at 12:27
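
The hierarchy described in this comment can be inspected at compile time through `std::iterator_traits`. The following is a small sketch; the helpers `category_of` and `has_category` are hypothetical names introduced only for this illustration.

```cpp
#include <forward_list>
#include <iterator>
#include <list>
#include <type_traits>
#include <vector>

// Hypothetical helper: the category tag an iterator type reports.
template<typename It>
using category_of = typename std::iterator_traits<It>::iterator_category;

// Hypothetical helper: true if It's tag is Tag or derives from Tag.
template<typename It, typename Tag>
constexpr bool has_category() {
    return std::is_base_of<Tag, category_of<It>>::value;
}

// The tag types form an inheritance chain, so a random-access iterator
// also counts as bidirectional, forward and input.
static_assert(has_category<std::vector<int>::iterator, std::random_access_iterator_tag>(), "random access");
static_assert(has_category<std::vector<int>::iterator, std::bidirectional_iterator_tag>(), "also bidirectional");
static_assert(has_category<std::list<int>::iterator, std::forward_iterator_tag>(), "bidirectional is also forward");
static_assert(has_category<std::forward_list<int>::iterator, std::input_iterator_tag>(), "forward is also input");

// std::ostream_iterator reports the output tag and nothing stronger.
static_assert(std::is_same<category_of<std::ostream_iterator<int>>, std::output_iterator_tag>::value, "output only");
static_assert(!has_category<std::ostream_iterator<int>, std::forward_iterator_tag>(), "not forward");
```

Note that `std::forward_iterator_tag` derives from `std::input_iterator_tag` only. Whether a forward iterator can also serve as an output iterator depends on it being mutable, which the tags do not encode.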

score 1 · Answer 2 · answered May 10 '13 at 12:08

Your function can do whatever it wants. You then need to document for the user how the template arguments are used.

The name OutputIterator is just an identifier; it doesn't introduce any restrictions or capabilities from the standard. It is a form of documentation, so if you make a second pass and use the iterator as an input, OutputIterator would be a misleading name.

According to the Standard, ForwardIterator requires the multi-pass guarantee: you can keep a previous value of the iterator, read its referenced object multiple times and keep getting the same value, and the underlying sequence still exists. All this seems necessary and sufficient for your purpose, so you might call the template parameter ForwardIterator. But it's still just a name. Until a more stringent system is implemented, C++ templates use duck typing.
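
Since the parameter name alone enforces nothing, one possible way to turn the documented requirement into a compile-time check is a `static_assert` on the iterator category. The sketch below is only an illustration: the function `same_on_second_pass` is a hypothetical name, and it simply walks the range twice to show what the multi-pass guarantee permits.

```cpp
#include <algorithm>
#include <forward_list>
#include <iterator>
#include <type_traits>
#include <vector>

// Hypothetical example: traverses [first, last) twice and reports whether
// both passes saw the same values. The static_assert rejects iterators that
// only promise a single pass, such as std::istream_iterator.
template<typename ForwardIt>
bool same_on_second_pass(ForwardIt first, ForwardIt last) {
    static_assert(std::is_base_of<std::forward_iterator_tag,
                      typename std::iterator_traits<ForwardIt>::iterator_category>::value,
                  "same_on_second_pass needs at least a ForwardIterator");
    typedef typename std::iterator_traits<ForwardIt>::value_type T;
    const std::vector<T> first_pass(first, last);                    // pass 1: copy the values
    return std::equal(first_pass.begin(), first_pass.end(), first);  // pass 2: re-read from a saved iterator
}
```

Calling it with the iterators of a `std::forward_list<int>` compiles, while passing a `std::istream_iterator<int>` pair fails at compile time with the message above.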

The Standard suggests and names certain common interfaces, but anything goes.

Okay, I will add the restrictions explicitly. My source is [cplusplus.com](http://www.cplusplus.com/reference/iterator/OutputIterator/), I quote: "Algorithms requiring output iterators should be single-pass output algorithms: each iterator position is dereferenced once at most." — Anne van Rossum, May 10 '13 at 12:13
@ondervloei cplusplus.com is *not* a good source. That's backwards; a single-pass output algorithm requires an OutputIterator and a multi-pass algorithm requires (at least) a ForwardIterator. Unless you have a particular iterator type which needs to plug into the function, it's not a requirement at all. — Potatoswatter, May 10 '13 at 12:44
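
For contrast, a single-pass output algorithm in the sense of this comment writes each position once and never reads it back, so it works with any OutputIterator. The following is a minimal sketch; `write_once` and `join_with_spaces` are hypothetical names used only here.

```cpp
#include <iterator>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical single-pass algorithm: every output position is assigned
// exactly once and then incremented past, which is all an OutputIterator allows.
template<typename InputIt, typename OutputIt>
OutputIt write_once(InputIt first, InputIt last, OutputIt out) {
    for (; first != last; ++first) {
        *out = *first;
        ++out;
    }
    return out;
}

// Usage with a pure output iterator: std::ostream_iterator cannot be re-read.
std::string join_with_spaces(const std::vector<int>& values) {
    std::ostringstream oss;
    write_once(values.begin(), values.end(), std::ostream_iterator<int>(oss, " "));
    return oss.str();
}
```

The same `write_once` also accepts `std::back_inserter(some_vector)` or a plain `std::vector<int>::iterator`, since the function only relies on operations that every output iterator supports.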
