6

Note: This is not a question of whether I should "use list or deque". It's a question about the validity of iterators in the face of insert().


This may be a simple question and I'm just too dense to see the right way to do this. I'm implementing (for better or worse) a network traffic buffer as a std::list<char> buf, and I'm maintaining my current read position as an iterator readpos.

When I add data, I do something like

buf.insert(buf.end(), newdata.begin(), newdata.end());

My question is now, how do I keep the readpos iterator valid? If it points to the middle of the old buf, then it should be fine (by the iterator guarantees for std::list), but typically I may have read and processed all data and I have readpos == buf.end(). After the insertion, I want readpos always to point to the next unread character, which in case of the insertion should be the first inserted one.

Any suggestions? (Short of changing the buffer to a std::deque<char>, which appears to be much better suited to the task, as suggested below.)

Update: From a quick test with GCC 4.4 I observe that deque and list behave differently with respect to readpos = buf.end(): after inserting at the end, readpos is broken in a list, but points to the next element in a deque. Is this a standard guarantee?

(According to cplusplus, any deque::insert() invalidates all iterators. That's no good. Maybe using a counter is better than an iterator to track a position in a deque?)

Charles
Kerrek SB
  • With the deque, you would always read from the beginning and insert at the end... No need to keep track of a specific readpos. And no, it is not a standard guarantee; in particular, if the deque happens to reallocate, the former `end()` will almost certainly refer to freed memory. – Nemo Jun 03 '11 at 17:52
  • Just to mention that with a `deque`, you could keep numerical index of current position instead of iterator. – sbk Jun 03 '11 at 17:54
  • It's best not to rely on testing behavior; it can demonstrate what doesn't work, but doesn't guarantee anything that appears to work. For example an iterator might still be valid if the storage wasn't reallocated, but you can't rely on that. – Mark Ransom Jun 03 '11 at 17:55
  • Thanks all -- yes, I will indeed simply store a numerical read position. Nemo: I'm not in a position to always read at the front. I suppose I could immediately delete the data I read from the front, but there are reasons why I might only want to delete after some processing. – Kerrek SB Jun 03 '11 at 17:59
  • I think your question title should be _"Keeping __std::deque__ iterators valid through insertion"_, because `std::list` iterators remain valid through any insertion. Also if you want to know if you should use a `list`, or `deque`, I say use `vector` and integer indices (as long as the data size is relatively small). Nice dopefish btw. – bobobobo Sep 05 '13 at 20:52
  • @bobobobo: (Thanks.) The question is about the `end()` iterator, for which I couldn't find a standard guarantee that it remains valid. – Kerrek SB Sep 05 '13 at 21:33

4 Answers

6

From http://www.sgi.com/tech/stl/List.html

"Lists have the important property that insertion and splicing do not invalidate iterators to list elements, and that even removal invalidates only the iterators that point to the elements that are removed."

Therefore, readpos should still be valid after the insert.

However...

std::list< char > is a very inefficient way to solve this problem. Each byte you store in a std::list lives in its own node, which usually carries two pointers (to the previous and the next node) in addition to the byte itself, padded for alignment. That is typically at least 12 or 24 bytes (on 32- or 64-bit platforms) of memory used to keep track of a single byte of data, before any allocator overhead.
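To make the arithmetic concrete, here is a minimal sketch of a node layout. `CharNode` is a hypothetical struct for illustration only; real standard library implementations lay out their list nodes differently in detail, and the exact size depends on the platform.

```cpp
#include <cstddef>

// Hypothetical node layout (an assumption for illustration, not the actual
// std::list node type): two links plus the one-byte payload.
struct CharNode {
    CharNode* prev;   // link to the previous node
    CharNode* next;   // link to the next node
    char      value;  // the single byte of payload
};

// Memory used per stored character, including alignment padding but
// excluding any per-allocation overhead added by the allocator.
constexpr std::size_t bytes_per_char = sizeof(CharNode);
```

On common platforms the padding rounds the size up to a multiple of the pointer size, which is where figures like 12 and 24 bytes come from.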

std::deque< char> is probably a better container for this. Like std::vector, it provides constant-time insertion at the back, but it also provides constant-time removal at the front. Finally, like std::vector, std::deque is a random-access container, so you can use offsets/indexes instead of iterators. These three features make it an efficient choice.
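A minimal sketch of the offset-based approach, assuming C++11. The type `TrafficBuffer` and its member names are hypothetical, chosen only for this illustration; the point is that a numeric index stays meaningful across insertions at the back, whereas deque iterators do not.

```cpp
#include <cassert>
#include <cstddef>
#include <deque>
#include <string>

// Hypothetical buffer type: a std::deque<char> plus a numeric read offset.
struct TrafficBuffer {
    std::deque<char> buf;
    std::size_t readpos = 0;  // index of the next unread character

    // Append newly received bytes at the back; readpos is unaffected.
    void append(const std::string& newdata) {
        buf.insert(buf.end(), newdata.begin(), newdata.end());
    }

    // Return all unread characters and advance the read position.
    std::string read_all() {
        std::string out(buf.begin() + static_cast<std::ptrdiff_t>(readpos),
                        buf.end());
        readpos = buf.size();
        return out;
    }

    // Drop everything already read from the front and rebase the index.
    void discard_processed() {
        assert(readpos <= buf.size());
        buf.erase(buf.begin(),
                  buf.begin() + static_cast<std::ptrdiff_t>(readpos));
        readpos = 0;
    }
};
```

Keeping the discard step separate from reading allows processed data to stay in the buffer for a while, which matters if deletion can only happen after some later processing.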

  • Ignoring the stuff about deque, readpos turns out NOT to be valid if `readpos == buf.end()`, so after insertion readpos does not point to the first newly inserted element. – Kerrek SB Jun 03 '11 at 18:05
5
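if (readpos == buf.begin())
{
    // readpos is at the front; this includes an empty buf, where begin() == end().
    // After the insertion, the next unread character is the first element.
    buf.insert(buf.end(), newdata.begin(), newdata.end());
    readpos = buf.begin();
}
else
{
    // Step back to an existing element, insert, then step forward again.
    // If readpos was end(), it now points to the first inserted element.
    --readpos;
    buf.insert(buf.end(), newdata.begin(), newdata.end());
    ++readpos;
}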

Not elegant, but it should work.

Mark Ransom
  • Wouldn't that explode if buf is empty?? (I currently employ a hack that essentially amounts to what you write, with proper emptiness checks.) – Kerrek SB Jun 03 '11 at 17:41
  • @Kerrek, if buf is empty then `readpos==begin()` should be true. I suppose `begin()` might change after inserting into an empty list (don't know for sure) so I'll fix the code for that case. – Mark Ransom Jun 03 '11 at 17:50
  • Oh, good point! That might actually be a moderately clean way of doing it. – Kerrek SB Jun 03 '11 at 18:07
  • You're talking about insertion into `std::deque` right? Because insertion into `std::list` doesn't invalidate iterators. – bobobobo Sep 05 '13 at 20:50
  • @bobobobo, but what happens to `end`? Even if it's not invalidated, does it still point to the end of the container or is it now pointing to the newly inserted item? With this code there's no ambiguity. – Mark Ransom Sep 05 '13 at 21:19
0

I was indeed being dense. The standard gives us all the tools we need. Specifically, the sequence container requirements 23.2.3/9 say:

The iterator returned from a.insert(p, i, j) points to the copy of the first element inserted into a, or p if i == j.

Next, the description of list::insert says (23.3.5.4/1):

Does not affect the validity of iterators and references.

So in fact if pos is my current iterator inside the list which is being consumed, I can say:

auto it = buf.insert(buf.end(), newdata.begin(), newdata.end());

if (pos == buf.end()) { pos = it; }

The range of new elements in my list is [it, buf.end()), and the range of yet unprocessed elements is [pos, buf.end()). This works because if pos was equal to buf.end() before the insertion, then it still is after the insertion, since insertion does not invalidate any iterators, not even the end.
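Wrapped up as a function, the idea might look like the following sketch. The helper name `append_keep_readpos` is hypothetical, and the code assumes a C++11 standard library, where the range overload of `list::insert` returns an iterator.

```cpp
#include <cassert>
#include <list>
#include <string>

// Hypothetical helper: append newdata to buf and keep pos pointing at the
// next unread character (or at end() if there is nothing left to read).
inline void append_keep_readpos(std::list<char>& buf,
                                std::list<char>::iterator& pos,
                                const std::string& newdata) {
    // In C++11, range insert returns an iterator to the first inserted
    // element, or its position argument (here end()) if the range is empty.
    auto it = buf.insert(buf.end(), newdata.begin(), newdata.end());

    // list::insert does not invalidate iterators, so a pos that was end()
    // before the insertion still compares equal to end() afterwards.
    if (pos == buf.end()) { pos = it; }

    // After appending non-empty data there is always something unread.
    assert(newdata.empty() || pos != buf.end());
}
```

If pos pointed into the middle of the list, it is left untouched; only the "everything consumed" case needs the reassignment.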

Kerrek SB
-1

list<char> is a very inefficient way to store a string. It is probably 10-20 times larger than the string itself, plus you are chasing a pointer for every character...

Have you considered using std::deque<char> instead?

[edit]

To answer your actual question, adding and removing elements does not invalidate iterators in a list... But end() is still going to be end(). So you would need to check for that as a special case at the point where you insert the new element in order to update your readpos iterator.

Nemo
  • As I said, for better or worse, it's not my project and I'm sure there are better ways to do this. A deque is probably good -- I was worried about deletion from the middle of the queue being slow, but I believe that I will only ever need to delete ranges from the front and insert at the back. Is deque efficient for that? Anyway, it'd be a single typedef's change :-) But my question is about iterator validity in the face of insert()! – Kerrek SB Jun 03 '11 at 17:24
  • @Kerrek SB: deque is meant to be used that way (pop and push from the extremes), so I'd say it would be your best choice. At least better than a list – dario_ramos Jun 03 '11 at 17:26
  • Yes, deque is extremely efficient for inserting and removing elements from the beginning or end. (Definitely not in the middle...) – Nemo Jun 03 '11 at 17:29
  • As for your edit: Yes, I noticed that end() gets invalidated, but the question is _how to obtain the correct iterator_ to the next element. – Kerrek SB Jun 03 '11 at 17:44
  • Yeah, I see the problem now. Unlike single-element `insert`, range `insert` does not return an iterator. (That seems like an oversight, but oh well.) You could roll your own version that does, or just stick with your current "hack". – Nemo Jun 03 '11 at 18:00