5

According to the C++ specification (23.2.4.3), vector::erase() only invalidates "all the iterators and references after the point of the erase"

As such, when using reverse_iterators to pass over all vector members, an erase on the current iterator should not cause the rend() member to be invalidated.

This code will run under G++ but will provide a runtime exception on Windows (VS2010):

#include <vector>

using namespace std;

int main()
{
    vector<int> x;
    x.push_back(1);
    x.push_back(2);
    x.push_back(3);

    //Print
    for(vector<int>::const_iterator i = x.begin(); i != x.end(); ++i)
        printf("%d\n", *i);

    //Delete second node
    for(vector<int>::reverse_iterator r = x.rbegin(); r != x.rend(); ++r)
        if(*r == 2)
            x.erase((r+1).base());

    //Print
    for(vector<int>::const_iterator i = x.begin(); i != x.end(); ++i)
        printf("%d\n", *i);

    return 0;
}

The error is interesting:

Expression: vector iterator not decrementable

Given on the line of the second for loop upon the second run. The decrement refers to the internal "current" iterator member of the reverse_iterator, which is decremented whenever reverse_iterator is incremented.

Can anyone explain this behavior, please?

Thanks.

EDIT

I think this code sample better shows that it's not a problem with r, but rather with rend():

//Delete second node
for(vector<int>::reverse_iterator r = x.rbegin(); r != x.rend();)
{
    vector<int>::reverse_iterator temp = r++;

    if(*temp == 2)
        x.erase((temp+1).base());
}

And errors out with vector iterators incompatible on the for loop upon entry after erase.

Mahmoud Al-Qudsi
  • 28,357
  • 12
  • 85
  • 125

5 Answers5

4

Your program invokes Undefined Behavior. Therefore, neither compiler is incorrect.

According to the Standard, std::vector<int>::reverse_iterator is a typedef for std::reverse_iterator<std::vector<int>::iterator>. The implementation of std::reverse_iterator<Iter> is specified to have a protected member Iter current;, and all other members and functions of reverse_iterator are specified based on the behavior of the member current.

So suppose we have r == reverse_iterator(i), where i is a valid iterator into std::vector<int> x;. Each of these is then guaranteed by the Standard.

r.current == i
(r+1).current == (i-1)
(r+1).base() == (i-1)

On calling x.erase((r+1).base());, all iterators after i-1 are invalidated. Of course, this includes i, and therefore also r.current.

The next thing your program attempts to evaluate is ++r. This expression is specified as having an effect --r.current;. But since r.current was invalidated, this expression is Undefined Behavior; and so is ++r.

aschepler
  • 70,891
  • 9
  • 107
  • 161
  • Thanks for your answer. Just wrt your last paragraph, if you refer to the second code example I posted, ++r is not the problem - it's just the comparison of a valid iterator (r) with an invalidated iterator (rend). – Mahmoud Al-Qudsi Mar 08 '11 at 20:45
  • @Computer Guru: My description applies to the original post. In the edited version, it is `r` which was invalidated. `x.rend()` can't possibly return an invalidated iterator. – aschepler Mar 08 '11 at 21:46
  • OK, thanks. I confirmed this by incrementing r by 2 instead of by 1 each round, and it was *not* invalidated after an erasure in this case. But I can't really use it like that :) – Mahmoud Al-Qudsi Mar 09 '11 at 08:35
3

A reverse_iterator is often just a wrapper around a normal iterator, and so incrementing the reverse iterator probably decrements one underneath. Similarly, rbegin will return an iterator that is after all of the elements in the container, and so it would be invalidated in the same way.

Jeremiah Willcock
  • 30,161
  • 7
  • 76
  • 78
  • Right. But isn't this an incorrect implementation? After all, the C++ spec specifies an iterator that will point to the position before the first element and called rend(). And it specifies that an iterator BEFORE the element being erased shall NOT be invalidated. Thus, rend() must be preserved.... – Mahmoud Al-Qudsi Mar 08 '11 at 17:37
  • You just cannot have an iterator before begin(). There is nothing there! – Bo Persson Mar 08 '11 at 17:40
  • The iterator it is complaining about came from `rbegin`, though; that will start out pointing at the end of the container. – Jeremiah Willcock Mar 08 '11 at 17:40
  • @Bo: rend() is before begin() and end() is after rbegin() for a vector. – Mahmoud Al-Qudsi Mar 08 '11 at 17:41
  • @Jeremiah Willcock: I don't have a problem with the first iteration. It's only the second after the erase that gives me a problem. To better explain, even in a code sample where r is first incremented and then a temporary variable pointing to the old r is deleted, the same issue occurs. – Mahmoud Al-Qudsi Mar 08 '11 at 17:43
  • Yes, as we are talking about reverse_iterators, everything is reversed. :-) When you use `r + 1` on a reverse iterator, its base() actually moves backwards, in the **reverse** direction, and ends up before your position. When you then erase that position, you go boom. A debugging iterator detects that, gcc just doesn't notice. – Bo Persson Mar 08 '11 at 18:01
  • @Computer Guru: `.rend()` is **not** before `begin()`. Reverse iterators are special in that the internal data that refers to the element is `.base()+1`. When you erase `(r+1).base()` you are erasing the element **before** the internal iterator held by `r`. – David Rodríguez - dribeas Mar 08 '11 at 18:06
  • @Jeremiah: The `reverse_iterator` is actually required to be a wrapper around a normal iterator (at least for `std::vector`). See my answer. – aschepler Mar 08 '11 at 18:34
2

Here is a much better way to do what you're trying to do:

struct CustomRemove
{
    bool operator()(int i)
    {
        return (i == 2);
    }
};

int main()
{
    std::vector<int> x;
    x.push_back(1);
    x.push_back(2);
    x.push_back(3);

    CustomRemove custom_remove;

    std::copy(x.begin(), x.end(), std::ostream_iterator<int>(std::cout, "\n"));
    x.erase(std::remove_if(x.begin(), x.end(), custom_remove), x.end());
    std::copy(x.begin(), x.end(), std::ostream_iterator<int>(std::cout, "\n"));

    return 0;
}
Zac Howland
  • 15,777
  • 1
  • 26
  • 42
  • +1 You can make even reduce more code by not using `std::remove_if` but rather `std::remove`: `x.erase( std::remove( x.begin(), x.end(), 2 ), x.end() );` – David Rodríguez - dribeas Mar 08 '11 at 19:09
  • Not really. This is a very stupid example that I posted, in reality I'm iterating over a vector, doing things based on the contents, and only if certain conditions are met, I'm marking the entry for deletion. I need to know the order they're being iterated over in, amongst other things. – Mahmoud Al-Qudsi Mar 08 '11 at 20:47
  • @Computer Guru: The functor can contain state information if you really need that as part of your criteria for deletion. Basically what `remove` and `remove_if` do is maintain the order for the non-deleted items, while pushing the to-be-deleted ones to the back of the container. Then the delete happens at once. Thus, your iterators are never invalidated during the loop. – Zac Howland Mar 09 '11 at 15:31
1

A reverse_iterator internally stores a normal iterator to the position after its current position. It has to do that, because rend() would otherwise have to return something before begin(), which isn't possible. So you end up accidentally invalidating your base() iterator.

Bo Persson
  • 90,663
  • 31
  • 146
  • 203
  • Well, rend() is by definition "something before begin()" so I don't see the problem? – Mahmoud Al-Qudsi Mar 08 '11 at 17:40
  • @Computer Guru: Iterators were modeled on pointers. Given a C or C++ chunk of allocated memory, you are explicitly allowed to take the address of every byte/char in the memory, and one beyond, although you aren't allowed to dereference it. Therefore, rend() is not actually going to be something before begin(), because that would be undefined behavior. – David Thornley Mar 08 '11 at 17:44
  • 1
    The problem is that rend() stores begin() internally, and rbegin() stores end(). When you use dereference the reverse_iterator, it actually adds a -1 offset to get the element. This works fine, because you can't dereference rend() and rbegin() ends up being end()-1. – Bo Persson Mar 08 '11 at 17:46
  • @ComputerGuru: At the implementation level, he means. – Lightness Races in Orbit Mar 08 '11 at 17:46
  • @David: Right, but I don't want to dereference the iterator/pointer. I'm just comparing to it. I get what's happening, I just don't think it's allowed because nowhere in the spec does it say that rend() can be invalidated after an erase! – Mahmoud Al-Qudsi Mar 08 '11 at 17:50
  • @Computer Guru: You aren't allowed to refer to the address of anything before the start of a block of allocated memory, so you can't compare to it. You are allowed to refer to the address one beyond the end. There's an asymmetry here. – David Thornley Mar 08 '11 at 17:54
  • rend() if formally invalidated when you erase begin(). If the vector is non-empty, you will immediately get another begin() so might not notice. – Bo Persson Mar 08 '11 at 18:05
  • 1
    @David: so would it be fair then to say that `rend()` logically refers to an imaginary point before the start of the vector, just as `end()` refers to an imaginary point after. Arrays provide for the latter being a valid pointer, but not the former. Hence this implementation of reverse iterator is referring to another position in the array, offset from its logical position. The question then is whether the standard anywhere implies that reverse iterators can hold this extra offset-by-1 reference (in which case the code is wrong) or not (the implementation is wrong). – Steve Jessop Mar 08 '11 at 18:09
  • @Steve: The standard says exactly that the relation between an iterator i and a reverse iterator is `&*(reverse_iterator(i)) == &*(i - 1)` – Bo Persson Mar 08 '11 at 18:20
  • 1
    @Bo: that's not what I asked for, though. `operator*` on the reverse iterator could in principle be written so that the reference to the other position is only created when you use it. What I want to know is whether the standard explicitly permits the reverse iterator to have that other reference all along, and hence for the reverse iterator to be invalidated when that other reference is invalidated. Just because the *obvious* implementation of reverse iterators is for them to contain a normal iterator offset by 1, doesn't mean the standard has successfully been worded to permit that. – Steve Jessop Mar 08 '11 at 18:23
  • 1
    I think answering it may require an argument what "corresponding iterator" means in 24.4.1/1 that Bo quotes. It's used in one sentence as though it's a type ("same signatures"), and the next sentence as though each instance of a reverse iterator type has a corresponding normal iterator object (the obvious implementation). If we accept that the latter is the intended meaning, then the existence of a reverse iterator pointing at the end implies the existence of a normal iterator pointing at the beginning, and hence the invalidation is correct and the program is wrong. I think. – Steve Jessop Mar 08 '11 at 18:30
  • 3
    @Steve: It does require that. :-) The value returned by base() is stored in a protected member 'current'. This means that you can derive from reverse_iterator and verify what it has stored, so all implementations must behave the same. – Bo Persson Mar 08 '11 at 18:31
  • @Bo: ah, yes, I should have read further. The implementation of `reverse_iterator` looks to be wholly dictated by the standard, they might as well have filled in the member function bodies in 24.4.1.1. So the programmer can and hence should anticipate the invalidation of reverse iterators pointing one before an erased element. – Steve Jessop Mar 08 '11 at 18:34
0

Since (r+1).base() and r effectively point to the same element, erasing (r+1).base() does in fact invalidate r. Most likely g++ just uses a pointer under the hood so it all appears to work correctly.

If you're trying to remove matching elements from a vector a much better approach is to use remove or remove_if.

Mark B
  • 95,107
  • 10
  • 109
  • 188