4

I'm curious about the rationale behind the following code. For a given map, I can delete a range up to, but not including, end() (obviously,) using the following code:

map<string, int> myMap;
myMap["one"] = 1;
myMap["two"] = 2;
myMap["three"] = 3;

map<string, int>::iterator it = myMap.find("two");

myMap.erase( it, myMap.end() );

This erases the last two items using the range. However, if I used the single iterator version of erase, I half expected passing myMap.end() to result in no action as the iterator was clearly at the end of the collection. This is as distinct from a corrupt or invalid iterator which would clearly lead to undefined behaviour.

However, when I do this:

myMap.erase( myMap.end() );

I simply get a segmentation fault. I wouldn't have thought it difficult for map to check whether the iterator equalled end() and not take action in that case. Is there some subtle reason for this that I'm missing? I noticed that even this works:

myMap.erase( myMap.end(), myMap.end() );

(i.e. does nothing)

The reason I ask is that I have some code which receives a valid iterator to the collection (but which could be end()) and I wanted to simply pass this into erase rather than having to check first like this:

if ( it != myMap.end() )
    myMap.erase( it );

which seems a bit clunky to me. The alternative is to re code so I can use the by-key-type erase overload but I'd rather not re-write too much if I can help it.

Component 10
  • 10,247
  • 7
  • 47
  • 64
  • @JoachimPileborg: Thanks, that's logical. Passing a valid - non `end()` - iterator twice results in no erase for this reason. Single iterator version still puzzles me though. – Component 10 Sep 14 '12 at 12:20
  • The single-iterator version requires a dereferencable iterator, that is, one that actually refers to an element that can be erased. The past-the-end iterator is not dereferencable, so you get undefined behaviour if you try to erase that non-existent element. – Mike Seymour Sep 14 '12 at 12:23
  • @MikeSeymour: Thanks. Is there a specific reason that it has to be dereferencable? Also, could you tell me where you found this out? None of the online references that I've seen mention this fact. – Component 10 Sep 14 '12 at 12:28
  • 1
    In general, it might be expensive or impossible for the implementation of `erase` to tell whether any particular iterator is dereferencable, so that burden is placed on the caller. It's specified in the C++11 standard, Table 100; it defines the behaviour of `a.erase(q)`, where the preceding paragraph 23.2.3/3 specifies that "`q` denotes a valid dereferenceable const iterator to `a`". – Mike Seymour Sep 14 '12 at 12:33
  • Although, testing whether an iterator *in general* is dereferenceable is a harder problem that testing whether it's equal to `end()`. For most (all?) containers the end iterator is the only valid iterator that's not dereferenceable, the hard part is keeping track of whether the iterator is valid. So the questioner's proposal is less of a burden than a general checked iterator. – Steve Jessop Sep 14 '12 at 13:10
  • "The alternative is to re code so I can use the by-key-type erase overload" - and write what? Not `myMap.erase(it->first)`, because if `it` might be an end iterator then you can't do `->first` on it any more than you can erase it. To avoid the test you'd have to re-code so that the iterator you're passed isn't allowed to point at nothing. – Steve Jessop Sep 14 '12 at 13:14
  • @SteveJessop: Yes, it didn't seem a big deal to me for `map` to check if the iterator equals `end()` but I can see that in context it's a fair call to increase efficiency by requiring that callers do that. To use the by-key-type overload would require a bit of a design change (an ultimately would just move the point at which `end()` was checked for) so I'm going to just stick with checking before I call - life's too short. – Component 10 Sep 14 '12 at 13:23

3 Answers3

5

The key is that in the standard library ranges determined by two iterators are half-opened ranges. In math notation [a,b) They include the first but not the last iterator (if both are the same, the range is empty). At the same time, end() returns an iterator that is one beyond the last element, which perfectly matches the half-open range notation.

When you use the range version of erase it will never try to delete the element referenced by the last iterator. Consider a modified example:

map<int,int> m;
for (int i = 0; i < 5; ++i)
   m[i] = i;
m.erase( m.find(1), m.find(4) );

At the end of the execution the map will hold two keys 0 and 4. Note that the element referred by the second iterator was not erased from the container.

On the other hand, the single iterator operation will erase the element referenced by the iterator. If the code above was changed to:

for (int i = 1; i <= 4; ++i ) 
   m.erase( m.find(i) );

The element with key 4 will be deleted. In your case you will attempt to delete the end iterator that does not refer to a valid object.

I wouldn't have thought it difficult for map to check whether the iterator equalled end() and not take action in that case.

No, it is not hard to do, but the function was designed with a different contract in mind: the caller must pass in an iterator into an element in the container. Part of the reason for this is that in C++ most of the features are designed so that the incur the minimum cost possible, allowing the user to balance the safety/performance on their side. The user can test the iterator before calling erase, but if that test was inside the library then the user would not be able to opt out of testing when she knows that the iterator is valid.

David Rodríguez - dribeas
  • 204,818
  • 23
  • 294
  • 489
2

n3337 23.2.4 Table 102

a.erase( q1, q2)

erases all the elements in the range [q1,q2). Returns q2.

So, iterator returning from map::end() is not in range in case of myMap.erase(myMap.end(), myMap.end());

a.erase(q)

erases the element pointed to by q. Returns an iterator pointing to the element immediately following q prior to the element being erased. If no such element exists, returns a.end().

I wouldn't have thought it difficult for map to check whether the iterator equalled end() and not take action in that case. Is there some subtle reason for this that I'm missing?

Reason is same, that std::vector::operator[] can don't check, that index is in range, of course.

ForEveR
  • 55,233
  • 2
  • 119
  • 133
1

When you use two iterators to specify a range, the range consists of the elements from the element that the first iterator points to up to but not including the element that the second iterator points to. So erase(it, myMap.end()) says to erase everything from it up to but not including end(). You could equally well pass an iterator that points to a "real" element as the second one, and the element that that iterator points to would not be erased.

When you use erase(it) it says to erase the element that it points to. The end() iterator does not point to a valid element, so erase(end()) doesn't do anything sensible. It would be possible for the library to diagnose this situation, and a debugging library will do that, but it imposes a cost on every call to erase to check what the iterator points to. The standard library doesn't impose that cost on users. You're on your own. <g>

Pete Becker
  • 74,985
  • 8
  • 76
  • 165