Why does the standard define borrowed_subrange_t as a common_range?

Question

C++20 introduced the ranges::borrowed_range concept, which defines the requirements on a range such that a function can take it by value and return iterators obtained from it without danger of dangling. In short (quoting P2017R1):

A range is a borrowed_range when you can hold onto its iterators after the range goes out of scope.

At the same time, a type helper, borrowed_subrange_t, was also introduced:

template<ranges::range R>
using borrowed_subrange_t = std::conditional_t<
    ranges::borrowed_range<R>,
    ranges::subrange<ranges::iterator_t<R>>, 
    ranges::dangling
>;

which is an alias template that is used by some constrained algorithms such as ranges::unique and ranges::find_end to avoid returning potentially dangling iterators or views.

When a type R models borrowed_range, borrowed_subrange_t<R> is simply subrange<ranges::iterator_t<R>>, which means it is also a ranges::common_range: only one template argument is supplied, and the second defaults to the same type as the first.

But this seems somewhat misleading, since there are subrange types that are borrowed but are not a common_range. Consider the following code:

auto r = views::iota(0);
auto s1 = ranges::subrange{r.begin(),     r.begin() + 5};
auto s2 = ranges::subrange{r.begin() + 5, r.end()};

I create two subranges from ranges::iota_view, which is a borrowed_range: one contains the first 5 elements, and the other contains all the remaining elements of the iota_view after the first five. Both are subranges of the iota_view, and both are obviously borrowed:

static_assert(ranges::borrowed_range<decltype(s1)>);
static_assert(ranges::borrowed_range<decltype(s2)>);

So to some extent, both of their types can be regarded as the borrowed_subrange_t of the iota_view type, but according to the definition, only the type of s1 is the borrowed_subrange_t of the type of r. This also means that the following code is ill-formed, since the iota_view r is not a common_range:

auto bsr = ranges::borrowed_subrange_t<decltype(r)>{r}; // ill-formed

Why does the standard need to ensure that the borrowed_subrange_t of some range R is a common_range, that is, that begin() and end() return the same type? What is the reason behind this? Why not define it more generally, like:

template <ranges::range R>
using borrowed_subrange_t = std::conditional_t<
    ranges::borrowed_range<R>,
    ranges::subrange<
      ranges::iterator_t<R>, 
      std::common_iterator<
        ranges::iterator_t<R>,
        ranges::sentinel_t<R>
      >
    >,
    ranges::dangling
>;

Will there be any potential defects and dangers in doing so?

Barry · Accepted Answer · 2021-03-26T15:42:37.113

To quote Alexander Stepanov, in "From Mathematics to Generic Programming":

When writing code, it’s often the case that you end up computing a value that the calling function doesn’t currently need. Later, however, this value may be important when the code is called in a different situation. In this situation, you should obey the law of useful return: A procedure should return all the potentially useful information it computed.

borrowed_subrange_t is used in algorithms that will necessarily traverse that entire subrange. So we necessarily compute the end iterator of this range as a side effect of performing the rest of the algorithm. This is useful to the user, so we should return it!

For several of these algorithms, it's not actually even possible to return a sentinel. For instance, ranges::search has to return a subrange that matches - but that subrange need not be at the very end of the initial range, so returning the original sentinel simply isn't an option.

For other algorithms, it might be an option to return a sentinel, but it's a bad one. Consider unique. There are basically three choices here:

1. Return just the iterator (I) denoting the start of this range (as std::unique does)
2. Return subrange<I, S> denoting the full range (i.e. just passing through the provided last)
3. Return subrange<I> denoting the full range, including the computed I referring to last.

But we're already doing the work to be able to do (3), so that's strictly more valuable. There is no reason to do (2).

Consider a less abstract case where we actually have a sentinel. Let's say, we have a null-terminated string:

struct null_terminated_string {
    char const* p;

    struct sentinel {
        auto operator==(char const* p) const { return *p == '\0'; }
    };

    auto begin() const -> char const* { return p; }
    auto end() const -> sentinel { return {}; }
};

Now, what would be a more useful return from unique: one which just gives you back this null_terminated_string::sentinel type or one which gives you back a char const* which points to the null terminator? The latter gives you far more useful information (including, for instance, the size!).

Lastly, this:

template <ranges::range R>
using borrowed_subrange_t = std::conditional_t<
    ranges::borrowed_range<R>,
    ranges::subrange<
      ranges::iterator_t<R>, 
      std::common_iterator<
        ranges::iterator_t<R>,
        ranges::sentinel_t<R>
      >
    >,
    ranges::dangling
>;

Doesn't make sense, since common_iterator<iterator_t<R>, sentinel_t<R>> is not a sentinel for iterator_t<R>. It would be this:

template <ranges::range R>
using borrowed_subrange_t = std::conditional_t<
    ranges::borrowed_range<R>,
    ranges::subrange<ranges::iterator_t<R>, ranges::sentinel_t<R>>,
    ranges::dangling
>;

And that could make sense. Consider ranges::find. Right now, it simply returns an iterator_t<R> (or, more accurately, either an iterator_t<R> or dangling). But a different design of ranges::find could do something different: it could return a subrange starting from that iterator and including the whole rest of the range (arguably this would be more useful). If we wanted to do that for ranges::find, we would definitely want to return a subrange<iterator_t<R>, sentinel_t<R>>. In this case, we haven't traversed the whole range and we don't want to pay the extra cost of doing so; we would simply forward through the sentinel.

It's just that there aren't any algorithms in <algorithm> that look like this; the ones that could simply return the iterator instead of a subrange to the end. Had we had such an algorithm, we would definitely have a version of borrowed_subrange_t that used sentinel_t<R>. But with the algorithms that we have, there's no need for such a thing.

Caleth · Answer 2 · 2021-03-26T15:03:07.887

Why does the standard need to ensure that borrowed_subrange_t of some range R is a common_range, that is, the return type of begin() and end() are the same?

Not all subranges end at the sentinel value of the underlying range.

Will there be any potential defects and dangers in doing so?

If the underlying range has an empty type as its sentinel, all subranges would end at the sentinel, not at their desired end.

edited Mar 26 '21 at 15:03

answered Mar 26 '21 at 13:14

"*If the underlying range had a sentinel type, all subranges would end at the sentinel, not at their desired end.*" - why? What's stopping some ranges from having their own sentinel that first checks the underlying range's sentinel in addition to providing it's own means of signaling the end of the range? – Fureeish Mar 26 '21 at 14:40
@Fureeish I mean if the result of `end` is a type distinct to the result of `begin`, those tend to be empty types – Caleth Mar 26 '21 at 14:55
Well, "*tend"* is very much different from "*have to*". This distinction is precisely what I am asking about, since it seems to be a fundamental argument to half of your answer. – Fureeish Mar 26 '21 at 15:01
@Fureeish I don't know of any sentinel type that is not the iterator type but has any data members, but such a thing is not invalid – Caleth Mar 26 '21 at 15:02
