When creating a new iterator pre-C++20 without the help of libraries like boost.iterator, it's necessary to specify the type aliases difference_type, value_type, pointer, reference and iterator_category. According to cppreference, with C++20, it's only necessary to specify difference_type and value_type, which I think is great! But why are there defaults for exactly these 3 aliases?

There are 2 things I don't understand about this (and one thing that seems to me like an oversight):

  1. Why are there no default values for value_type and difference_type? Wouldn't it make sense to use something like std::remove_reference_t<reference> as a default for value_type? As a default for difference_type for random access iterators, it could arguably make sense to use the result type of the - operator taking two iterators.
  2. C++20 adds the contiguous_iterator_tag. Just like with input_iterator_tag versus forward_iterator_tag, I don't see how the compiler could correctly distinguish between a contiguous iterator and a random access iterator, which I guess is why it apparently never selects contiguous_iterator_tag. Is this intended? It also seems somewhat dangerous to misclassify an input iterator as a forward iterator, so why not require programmers to specify this alias themselves?
  3. On a somewhat unrelated note, I'm not sure if it's a good idea to silently generate a value for iterator_category even if the programmer has explicitly stated another category, and generating a value for iterator_category that's completely different from the concept seems strange as well. Consider this unrealistic example:
#include <iostream>
#include <iterator>

// With the == operator, this is an input iterator, but nothing else.
struct WeirdIterator {
    // Not an output iterator because you can't assign to a const reference
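    // Return a reference to a static object rather than to a temporary
    const int& operator*() const { static const int value = 42; return value; }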
    WeirdIterator& operator++() { return *this; } // unimportant
    WeirdIterator operator++(int) { return *this; } // unimportant
    // bool operator==(const WeirdIterator&) const = default;
    using iterator_category = std::random_access_iterator_tag;
    using value_type = int;
    using difference_type = int;
};


void iteratorConcept(std::input_iterator auto) {
    std::cout << "input iterator concept" << std::endl;
}
void iteratorConcept(std::random_access_iterator auto) {
    std::cout << "random access iterator concept" << std::endl;
}

void iteratorTag(std::output_iterator_tag) {
    std::cout << "output iterator tag" << std::endl;
}
void iteratorTag(std::input_iterator_tag) {
    std::cout << "input iterator tag" << std::endl;
}
void iteratorTag(std::random_access_iterator_tag) {
    std::cout << "random access iterator tag" << std::endl;
}

int main() {
    WeirdIterator iter;
    iteratorConcept(iter);
    iteratorTag(std::iterator_traits<WeirdIterator>::iterator_category{});
    return 0;
}

This prints "input iterator concept" and "output iterator tag" because it's missing the comparison operator (which isn't required for the concept). If I add the commented line, this now prints "input iterator concept" and "random access iterator tag", even though it clearly isn't a random access iterator. To be fair, writing the wrong iterator_category (i.e. random_access_iterator_tag) like this is a pretty stupid example, but I still think it would make sense to check if the concept is satisfied, especially in the case of the "fall-back" output_iterator_tag: Forgetting to write the == operator shouldn't turn an input iterator into an unusable output iterator. Would it be possible and make sense to check that the corresponding concepts are satisfied?

Edit: A few points in my question seem to be unclear, or maybe I've made some incorrect but unstated assumptions. I'll try to be more explicit about them and rephrase my current understanding (after reading the answer by Nicol Bolas):

  1. Regarding Point 3: As I understand it, it's possible that a type T may have some std::iterator_traits<T>::iterator_category alias even if it doesn't model the corresponding C++20 concept or the C++17 named requirement. This is intended. So, let's forget about this, because it's probably a better fit for a separate question.
  2. I think that the std::iterator_traits aliases that get defined when I don't explicitly write them down (e.g. reference when I only write value_type) can be incorrect for some iterators and are meant as sensible default values. Is this correct? If this is incorrect, my question is pretty much answered.
  3. If T::reference isn't defined for an input iterator T, then std::iterator_traits<T>::reference is defined as decltype(*std::declval<T&>()). Is this correct?
  4. If reference can be defined based on operator*, wouldn't it make sense to then also define value_type based on *? Assuming that 5. is correct, the only input iterator I can think of where this would go wrong is the iterator from std::vector<bool>, and there were several proposals to deprecate it because of this difference. So most input iterators would work with this definition, and those that didn't could simply specify value_type. Am I missing something?
  5. Regarding Point 2: It's not in general decidable which category an iterator falls into. Using e.g. an input iterator as if it were a more capable forward iterator would be a bug. It can happen that std::iterator_traits<T>::iterator_category is incorrect for a valid iterator whose author did not specify the iterator_category. This doesn't affect the concept or named requirement (they take semantics into account), but in practical terms, it's possible that standard library functions don't work correctly with this iterator without generating a (run- or compile-time) error. Therefore, I think it would be a good idea to require the programmer to explicitly state the category. Is there a problem in this reasoning, or did I miss something?
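
A small compile-time check of the idea in point 4, as a sketch that assumes a C++20 compiler. The alias names `FromReference`, `ConstIt`, and `BoolIt` are made up for this illustration and are not standard facilities. It suggests that stripping only the reference leaves a `const` behind for a const iterator, and that `std::vector<bool>` yields a proxy type rather than `bool`:

```cpp
#include <iterator>
#include <type_traits>
#include <vector>

// Hypothetical default for value_type derived from reference (not part of the standard).
template <class It>
using FromReference =
    std::remove_reference_t<typename std::iterator_traits<It>::reference>;

using ConstIt = std::vector<int>::const_iterator;
using BoolIt  = std::vector<bool>::iterator;

// For a const iterator the candidate default is `const int`, but value_type is `int`.
static_assert(std::is_same_v<FromReference<ConstIt>, const int>);
static_assert(std::is_same_v<std::iterator_traits<ConstIt>::value_type, int>);

// Removing cv-qualifiers as well makes the two agree for ordinary iterators.
static_assert(std::is_same_v<std::remove_cv_t<FromReference<ConstIt>>, int>);

// std::vector<bool> dereferences to a proxy object, so no stripping recovers `bool`.
static_assert(!std::is_same_v<std::remove_cv_t<FromReference<BoolIt>>, bool>);
static_assert(std::is_same_v<std::iterator_traits<BoolIt>::value_type, bool>);
```

So a default along these lines would probably have to be `std::remove_cvref_t<reference>`, with proxy iterators still specifying value_type by hand.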

I hope I don't come across as overly pedantic or as insisting on my personal opinion, but I genuinely don't know if and where there's an error in the points above, and I'm guessing that this isn't just confusing to me.

Irgendwhy
  • Given an arbitrary iterator `T` that pops into existence, how do you expect to have its `value_type` defined by default? – Sam Varshavchik Feb 25 '21 at 23:40
  • @Sam Varshavchik Maybe I'm missing something obvious here, but wouldn't `std::remove_reference_t<decltype(*std::declval<T&>())>` be a good candidate? – Irgendwhy Feb 26 '21 at 00:07
  • `value_type` of a `std::ostreambuf_iterator` is `void`. Yet, its `operator*()` does not return a void. – Sam Varshavchik Feb 26 '21 at 00:38
  • I should've specified I was talking about input iterators (i.e. all iterators that aren't only output iterators) with my default for `value_type`: I think that the default for `reference` is `void` for iterators that are only output iterators but based on `operator*` for all others, so to me, it seems somewhat reasonable to do the same for `value_type`. – Irgendwhy Feb 26 '21 at 13:06

1 Answer

Several different things are being conflated here, so it's important to separate them first.

In C++20, there are two classifications of iterators: the old C++17 named requirements, and the new C++20 concept-based iterators. Most of the old requirements map to the latter, but the concept requirements allow for more things to be considered iterators than what the C++17 requirements allowed.

std::iterator_traits however is used for both of them, since they do use many of the same moving parts. The point of this is that it should be possible to write an iterator that fulfills both the C++17 named requirement and the similar C++20 concept. That is, you can write a type that satisfies Cpp17RandomAccessIterator and std::random_access_iterator without too much trouble.

I bring this up because many of the things under discussion will matter a lot more to one set of requirements than the other.

Why are there no default values for value_type and difference_type? Wouldn't it make sense to use something like std::remove_reference_t<reference> as a default for value_type?

Obviously, that would require you to specify reference. So you'd still have to specify two things. value_type is the one that the creator of the iterator is thinking in terms of anyway. And if they're thinking of it, it's probably because reference needs to be something other than a value_type&, so they'll need to specify both anyway.

C++20 adds the contiguous_iterator_tag. Just like with input_iterator_tag versus forward_iterator_tag, I don't see how it should be possible for the compiler to correctly distinguish between a contiguous iterator and a random access iterator, which I guess is why it apparently never selects contiguous_iterator_tag. Is this intended?

In C++17, there was no such thing as a "contiguous iterator". Not in the same sense as a RandomAccessIterator. There's a whole section in the standard that explains the requirements of a RandomAccessIterator, while "contiguous iterator" gets a one paragraph statement with no additional information about it and very few actual uses.

And of course, "contiguous iterator" gets no iterator tag. This was done deliberately: adding another tag could have broken a lot of otherwise working code that checks for exactly random_access_iterator_tag, because a contiguous iterator would no longer have advertised itself as random access.

C++20 changes things. It adds a std::contiguous_iterator_tag, but it does so because std::contiguous_iterator now has syntactic differences from std::random_access_iterator. Namely, a contiguous iterator must permit conversion into a pointer to its value_type via std::to_address. This allows you to turn an iterator pair into a pointer pair without having to dereference a potentially non-dereferenceable iterator (such as a past-the-end iterator).

Note also that automatic assignment of iterator categories is based on satisfying the C++17 named requirements, not on the C++20 concepts. Since there is no "contiguous iterator" named requirement (and even if there were, it wouldn't be syntactically determinable), there can be no automatic assignment of it.

The reason automatic assignment only works for the C++17 requirements is that the C++20 concepts are defined in terms of std::iterator_traits. So it cannot use the concepts without creating a circular definition.

On a somewhat unrelated note, I'm not sure if it's a good idea to silently generate a value for iterator_category even if the programmer has explicitly stated another category

That's not what the standard does. It only provides one if you don't specify one (outside of one odd quirk mentioned below).

This prints "input iterator concept" and "output iterator tag" because it's missing the comparison operator (which isn't required for the concept).

This is an odd quirk of the new definition of iterator_category, but the quirk does ultimately correctly represent the incoherence of your type.

The primary template of iterator_traits has 3 possible versions, depending on how you defined your iterator type. If your iterator provides all of the member type aliases except pointer, then it just uses them. If it only provides some of them, then it does a concept check against an exposition-only version of Cpp17InputIterator. If your type fits that, then it uses your type's iterator_category (and if you don't provide one, then it computes one).

However, if your iterator isn't an input iterator, then it checks against the basic Cpp17Iterator. If that fits, then iterator_traits::iterator_category is fixed to be output_iterator_tag. That is certainly a strange choice.

If I add the commented line, this now prints "input iterator concept" and "random access iterator tag", even though it clearly isn't a random access iterator.

But you said it was a random access iterator. The system isn't supposed to override what you said; that was just a quirk of what happens if your type doesn't match input-iterator but still happens to be some kind of iterator.

In any case, if you lie, you lied. Garbage in, garbage out.

I still think it would make sense to check if the concept is satisfied, especially in the case of the "fall-back" output_iterator_tag: Forgetting to write the == operator shouldn't turn an input iterator into an unusable output iterator.

But... that's what it is. Equality testing isn't optional for input iterators. If you can't test it for equality, then it is not an input iterator. Indeed, if the system did as you suggested, that's exactly the tag you would get: an output iterator.

So what's your problem? If you accidentally failed to make your type an input iterator, do you want the system to correctly categorize it as what it is in accord with its behavior or do you want it to forward your mistaken category onward?

Nicol Bolas
  • Thanks for your helpful answer. There are still a few points I don't understand, so I've updated my question. Specifically, I don't understand two things mentioned in this answer: "Obviously, that would require you to specify `reference`." I think the default for `reference` isn't implemented in terms of `value_type`, so it should be possible (but maybe not a good idea?) to add a default for `value_type`. "That's not what the standard does. It only provides one if you don't specify one." I think that `WeirdIterator` is a counter example, but this isn't really relevant to my main question. – Irgendwhy Feb 26 '21 at 13:16
  • @Irgendwhy: You can't just add a half-dozen new questions after receiving an answer. It's one thing to ask for clarification, but most of those are entirely new with little relation to the existing one. Asking multiple questions like this is already frowned upon, so doubling-down on it by asking more is not helping. So I'm going to limit my answer to the initial questions. – Nicol Bolas Feb 26 '21 at 14:33
  • Ah sorry, I'll try to rephrase them as separate questions then – Irgendwhy Feb 26 '21 at 15:19
  • I really disagree with "there's no reason to pick `reference` as the more fundamental one" - `reference` is definitely the fundamental one. It's what `*it` gives you, it's your primary interaction with the iterator. Many algorithms don't even use `value_type` at all. – Barry Feb 26 '21 at 16:01
  • @Barry: Usage and definition are different things. Most algorithms care about `reference`. But the *creator of an iterator* doesn't. More than likely, the thing they care about is `value_type`. And the defaults here are about what the *creator* has to specify, not what the user can use. – Nicol Bolas Feb 26 '21 at 16:13
  • @NicolBolas Could not disagree more with that claim. Creator of an iterator absolutely cares about `reference`! That's what the iterator yields, you cannot _not_ care about that. That doesn't even make sense. `reference` is the sole difference between `iterator` and `const_iterator`. – Barry Feb 26 '21 at 16:16
  • @Barry: When I say "care about", I mean "independently from the one function that *returns it*". The only meaningful interaction that the writer of non-proxy iterator types have with `reference` is when they decide what the return value of `operator*` will be. All other thinking is in terms of `value_type`. And a `const_iterator` has a `const value_type`. – Nicol Bolas Feb 26 '21 at 16:25
  • @NicolBolas No it does not. `value_type` is never `const`. – Barry Feb 26 '21 at 16:29
  • @NicolBolas You mention std::to_pointer in your answer, but since this does not exist, I guess you meant std::to_address. – Philippe Jul 14 '22 at 07:50