
Recently I was trying to fix a pretty difficult const-correctness compiler error. It initially manifested as a multi-paragraph template vomit error deep within Boost.Python.

But that's irrelevant; it all boiled down to the following fact: the C++11 std::begin and std::end iterator functions are not overloaded to take R-values.

The definition(s) of std::begin are:

template< class C >
auto begin( C& c ) -> decltype(c.begin());

template< class C >
auto begin( const C& c ) -> decltype(c.begin());

Since there is no R-value/Universal Reference overload, an R-value argument can only bind to the const C& overload, so if you pass it an R-value you get a const iterator.
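A compile-time check makes the overload selection visible. The following is a minimal sketch, assuming a C++11 compiler and using std::vector<int> as a stand-in container; the alias names are illustrative and do not come from the question.

```cpp
#include <iterator>
#include <type_traits>
#include <utility>
#include <vector>

using Vec = std::vector<int>;

// Type returned by std::begin when the argument is an lvalue Vec.
using FromLvalue = decltype(std::begin(std::declval<Vec&>()));

// Type returned by std::begin when the argument is an rvalue Vec
// (for example a temporary). Only the const C& overload is viable here.
using FromRvalue = decltype(std::begin(std::declval<Vec>()));

static_assert(std::is_same<FromLvalue, Vec::iterator>::value,
              "lvalue argument: non-const iterator");
static_assert(std::is_same<FromRvalue, Vec::const_iterator>::value,
              "rvalue argument: const_iterator");
```

If either assumption about the overload set were wrong, the translation unit would fail to compile.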

So why do I care? Well, if you ever have some kind of "range" container type, e.g. a "view", "proxy" or "slice", or any container type that presents a sub-range of another container's iterators, it is often very convenient to use R-value semantics and get non-const iterators from temporary slice/range objects. But with std::begin you're out of luck, because std::begin will always return a const-iterator for R-values. This is an old problem which C++03 programmers were often frustrated with back in the day before C++11 gave us R-values, i.e. the problem of temporaries always binding as const.

So, why isn't std::begin defined as:

template <class C>
auto begin(C&& c) -> decltype(c.begin());

This way, if c is constant we get a C::const_iterator and a C::iterator otherwise.
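To check that claim, here is a minimal sketch of such a forwarding-reference version. The name fwd_begin is hypothetical (it is not a standard library function), std::vector<int> is only a stand-in container, and, like the declaration above, the sketch does not use std::forward.

```cpp
#include <type_traits>
#include <utility>
#include <vector>

// Hypothetical single-overload alternative to std::begin.
// C deduces to "const T&" or "T&" for lvalues and to "T" for rvalues.
template <class C>
auto fwd_begin(C&& c) -> decltype(c.begin()) {
    return c.begin();  // c is a named variable, so this is an lvalue call
}

using Vec = std::vector<int>;

// Non-const rvalue: non-const iterator.
static_assert(std::is_same<decltype(fwd_begin(std::declval<Vec>())),
                           Vec::iterator>::value, "rvalue");
// Non-const lvalue: non-const iterator.
static_assert(std::is_same<decltype(fwd_begin(std::declval<Vec&>())),
                           Vec::iterator>::value, "lvalue");
// Const lvalue: const_iterator.
static_assert(std::is_same<decltype(fwd_begin(std::declval<const Vec&>())),
                           Vec::const_iterator>::value, "const lvalue");
```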

At first, I thought the reason was for safety. If you passed a temporary to std::begin, like so:

auto it = std::begin(std::string("temporary string")); // never do this

...you'd get an invalid iterator. But then I realized this problem still exists with the current implementation. The above code would simply return an invalid const-iterator, which would probably segfault when dereferenced.

So, why is std::begin not defined to take an R-value (or more accurately, a Universal Reference)? Why have two overloads (one for const and one for non-const)?

Siler
  • You forgot an `std::forward(c)` there. – Columbo Nov 16 '14 at 17:41
  • Not sure why that would matter in this case - in this case all that matters is that `c` is `const` or not, an issue which wouldn't be affected after `C&&` degrades to `C&` – Siler Nov 16 '14 at 17:42
  • A container might overload `begin` with ref-qualifiers, making the returned iterator type dependent on the value category of the object argument. But yeah, for demonstrative purposes irrelevant. – Columbo Nov 16 '14 at 17:46
  • @Columbo, true - good point. – Siler Nov 16 '14 at 17:46
  • Apparently they're not called universal references anymore, but [forwarding references](http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2014/n4164.pdf). – Nov 16 '14 at 17:58
  • I am trying to wrap my head around why you are trying to call `begin` on a temporary. The temporary will only last until the semicolon, invalidating the iterator effectively at that point anyway. What is the use case for this? Do slices return iterators to the original object? – Tim Seguine Nov 16 '14 at 18:44
  • @TimSeguine, as I said - if the temporary object is a proxy or "slice" object, that is a totally valid (and often convenient) thing to do. Many libraries provide "proxy" containers, like Boost.Python or Boost.UBLAS. It would be nice to be able to write one-liners that operate on iterator ranges over proxy objects – Siler Nov 17 '14 at 20:56
  • @Siler okay, I haven't seen anything like that before, so I was a little stumped. Seems like a reasonable thing to do, although it still seems somehow morally wrong to take an iterator to a temporary (even if it is a proxy) – Tim Seguine Nov 17 '14 at 20:58

1 Answer


> The above code would simply return an invalid const-iterator

Not quite. The iterator remains valid until the end of the full-expression in which the temporary it refers to was created. So something like

std::copy_n( std::begin(std::string("Hallo")), 2,
             std::ostreambuf_iterator<char>(std::cout) );

is still valid code. Of course, in your example, it is invalidated at the end of the statement.

What point would there be in modifying a temporary or xvalue? That is probably one of the questions the designers of the range accessors had in mind when proposing the declarations. They didn't consider "proxy" ranges whose iterators, as returned by .begin() and .end(), remain valid past the range's own lifetime, perhaps for the very reason that, in template code, such ranges cannot be distinguished from normal ranges. And we certainly don't want to modify temporary non-proxy ranges, since that is pointless and might lead to confusion.

However, you don't need to call std::begin with qualification in the first place; you could instead bring the names into scope with using-declarations:

using std::begin;
using std::end;

and use ADL. You can then declare namespace-scope begin and end overloads for the types that Boost.Python (or a similar library) uses, in the same namespace as those types so that ADL can find them, and so circumvent the restrictions of std::begin. E.g.

iterator begin(boost_slice&& s) { return s.begin(); }
iterator end  (boost_slice&& s) { return s.end()  ; }

// […]

begin(some_slice) // With some_slice an rvalue expression: ADL picks the overload above, returns non-const iterator
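A fuller version of this idea might look like the sketch below. Everything in it is hypothetical and only stands in for a real library: the namespace demo, the non-owning slice type, make_slice, and double_range are illustrative names, not Boost APIs. The point is only that, with the using-declarations in scope, an unqualified call on a temporary slice selects the slice&& overload through ADL.

```cpp
#include <cassert>
#include <cstddef>
#include <iterator>
#include <vector>

namespace demo {  // hypothetical namespace standing in for the library's

// Hypothetical proxy: a non-owning view of v[first, last).
struct slice {
    std::vector<int>* owner;
    std::ptrdiff_t first, last;

    using iterator = std::vector<int>::iterator;
    using const_iterator = std::vector<int>::const_iterator;

    iterator begin() { return owner->begin() + first; }
    iterator end() { return owner->begin() + last; }
    const_iterator begin() const { return owner->cbegin() + first; }
    const_iterator end() const { return owner->cbegin() + last; }
};

// Rvalue overloads in the type's own namespace, so ADL can find them.
inline slice::iterator begin(slice&& s) { return s.begin(); }
inline slice::iterator end(slice&& s) { return s.end(); }

inline slice make_slice(std::vector<int>& v, std::ptrdiff_t first,
                        std::ptrdiff_t last) {
    return slice{&v, first, last};
}

}  // namespace demo

// Doubles v[first, last) through iterators taken from temporary slices.
inline void double_range(std::vector<int>& v, std::ptrdiff_t first,
                         std::ptrdiff_t last) {
    assert(0 <= first && first <= last &&
           last <= static_cast<std::ptrdiff_t>(v.size()));
    using std::begin;
    using std::end;
    // Each temporary slice dies at the end of the init-statement, but the
    // iterators point into v, which outlives them.
    for (auto it = begin(demo::make_slice(v, first, last)),
              e = end(demo::make_slice(v, first, last));
         it != e; ++it) {
        *it *= 2;  // compiles only because the iterator is non-const
    }
}
```

With only std::begin available, the assignment through `*it` would not compile, because the temporary would bind to the const C& overload.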

> Why have two overloads (one for const and one for non-const)?

Because we still want rvalue objects to be supported, and they cannot be taken by a function parameter of the form T&; it is the const overload that accepts them.
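The binding rule can be checked at compile time. This is a small sketch with hypothetical helper names (takes_ref, takes_const_ref and the two detection traits are illustrative only); it mirrors the two parameter forms of std::begin rather than calling std::begin itself.

```cpp
#include <string>
#include <type_traits>
#include <utility>

// Hypothetical functions with the same parameter forms as std::begin.
template <class C> void takes_ref(C&) {}
template <class C> void takes_const_ref(const C&) {}

// Detects whether takes_ref(arg) is well-formed for an argument of type Arg.
template <class Arg, class = void>
struct can_call_ref : std::false_type {};
template <class Arg>
struct can_call_ref<Arg, decltype(takes_ref(std::declval<Arg>()))>
    : std::true_type {};

// Same detection for takes_const_ref.
template <class Arg, class = void>
struct can_call_const_ref : std::false_type {};
template <class Arg>
struct can_call_const_ref<Arg, decltype(takes_const_ref(std::declval<Arg>()))>
    : std::true_type {};

// An lvalue binds to either form.
static_assert(can_call_ref<std::string&>::value, "lvalue binds to C&");
static_assert(can_call_const_ref<std::string&>::value, "lvalue binds to const C&");
// An rvalue binds only to the const form.
static_assert(!can_call_ref<std::string>::value, "rvalue does not bind to C&");
static_assert(can_call_const_ref<std::string>::value, "rvalue binds to const C&");
```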

Columbo