Avoiding iterator invalidation using indices, maintaining clean interface

Question

I have created a MemoryManager<T> class, which is basically a wrapper around two vectors of pointers that manage the lifetime of heap-allocated objects.

One vector stores the "alive" objects; the other stores the objects that will be added on the next MemoryManager<T>::refresh.

This design was chosen to avoid iterator invalidation when looping over the MemoryManager<T>, as adding a new object directly to the MemoryManager<T>::alive vector can invalidate existing iterators (if the vector has to reallocate as it grows).

template<typename T> struct MemoryManager {
    std::vector<std::unique_ptr<T>> alive;
    std::vector<T*> toAdd;

    T& create() { 
        auto r(new T); 
        toAdd.push_back(r); 
        return *r; 
    }

    void refresh() { 
         // Use erase-remove idiom on dead objects
         eraseRemoveIf(alive, [](const std::unique_ptr<T>& p){ return !p->alive; });

         // Add all "toAdd" objects and clear the "toAdd" vector
         for(auto i : toAdd) alive.emplace_back(i); 
         toAdd.clear(); 
    }  

    void kill(T& mItem)  { mItem.alive = false; }

    IteratorType begin() { return alive.begin(); }
    IteratorType end()   { return alive.end(); }
};

I use it in my game engine to store entities, and update every "alive" entity every frame:

void game() {
    MemoryManager<Entity> mm;

    while(gameLoop) {
        mm.refresh();
        for(auto& e : mm) processEntity(e);
        auto& newEntity = mm.create();
        // do something with newEntity
    }
}

This has allowed me to constantly create/kill entities without having to worry about their lifetime too much.

However, I've recently come to the conclusion that using two std::vector is unnecessary. I could simply use a single vector and store an iterator to the "last alive object", adding the newly created objects immediately after the aforementioned iterator:

Diagram of intended single-vector behavior

The idea, in my mind, works fine... but I cannot actually use an iterator type for end (as shown in the diagram), as it could get invalidated after the addition of some new elements to the vector. I've tested it, and this happens often, causing a crash.

The other solution I can think of is using an index instead of an iterator. This would solve the crashing, but I wouldn't be able to use the cool C++11 for(x : y) foreach loop because MemoryManager<T>::begin and MemoryManager<T>::end need to return an iterator.
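
To illustrate the index idea in isolation, here is a minimal sketch. The helper name sumWhileAppending is hypothetical and not part of MemoryManager; it only shows that an index used as the "end" marker stays meaningful when the vector reallocates during the loop.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper (not from the question): visits the elements that
// existed when the loop started, appending one new element per visit.
// Returns the sum of the visited elements.
int sumWhileAppending(std::vector<int>& v) {
    const std::size_t aliveCount = v.size();  // the "end" marker is an index
    int sum = 0;
    for (std::size_t i = 0; i < aliveCount; ++i) {
        assert(i < v.size());     // the vector only grows, so i stays in range
        const int value = v[i];   // copy out before the vector can reallocate
        sum += value;
        v.push_back(value * 10);  // may reallocate; i and aliveCount are unaffected
    }
    return sum;
}
```

Elements appended during the loop sit at positions aliveCount and beyond, so they are not visited until the next pass.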

Is there a way to achieve the current behavior with a single vector and still maintain a clear interface that can be used with C++11 for-each loops?

@Casey: Whoops, you're right. It should be a vector of `std::unique_ptr`, as destroying the `MemoryManager` does not manually free the memory allocated by `toAdd`'s items. — Vittorio Romeo, Nov 23 '13 at 15:42
Why don't just write `IteratorType end() { return alive.begin() + aliveCount; }` — zch, Nov 28 '13 at 13:41
@zch: Because during a single iteration the `alive.begin()` iterator may get invalidated, when a new entity gets added during an update. — Vittorio Romeo, Nov 28 '13 at 13:46
Hmm but you *do* need a pair of brackets `[]`, even if they're empty. The parentheses `()` are optional, if they're empty, the brackets are not. — dyp, Nov 28 '13 at 22:06

score 9 · Answer 1 · edited Jan 10 '15 at 21:43

One of the simplest ways to get stable iterators (and references) is to use std::list<T>. And unless you need T to be a pointer to a polymorphic base class, it is better to use std::list<T>, as opposed to std::list<std::unique_ptr<T>>.

If, on the other hand, your Entity is a polymorphic base, then consider using std::vector<std::unique_ptr<T>>. Although you cannot depend upon iterators remaining valid, you can depend upon pointers and references to Entity remaining valid with std::vector<std::unique_ptr<T>>.
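
A small sketch of that guarantee, assuming a stand-in Entity type and a hypothetical function name: the vector moves the unique_ptr objects when it reallocates, but the pointed-to Entity never moves.

```cpp
#include <cassert>
#include <memory>
#include <vector>

struct Entity {  // hypothetical stand-in for the question's Entity
    int id = 0;
};

// Returns true if a raw pointer to the first entity still refers to the same
// object after the vector has grown (and almost certainly reallocated).
bool pointerSurvivesReallocation() {
    std::vector<std::unique_ptr<Entity>> entities;
    entities.push_back(std::unique_ptr<Entity>(new Entity));
    entities.front()->id = 42;
    Entity* first = entities.front().get();

    // Growing the vector moves the unique_ptrs, not the Entity objects.
    for (int i = 0; i < 1000; ++i)
        entities.push_back(std::unique_ptr<Entity>(new Entity));

    assert(entities.size() == 1001);
    return first == entities.front().get() && first->id == 42;
}
```

An iterator taken before the growth would not be safe to use here; only the pointer to the Entity is.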

In your game() example, you never take advantage of stable iterators or pointers. You could just as easily (and more simply) do:

void game() {
    std::vector<Entity> mm;

    while(gameLoop) {
        mm.erase(std::remove_if(mm.begin(), mm.end(), [](const Entity& e)
                                                      { return !e.alive; }),
                                                      mm.end());
        for(auto& e : mm) processEntity(e);
        mm.push_back(create());
        auto& newEntity = mm.back();
        // do something with newEntity
    }
}

During the processEntity loop, there is no way to invalidate iterators. If there were, you had better not use a range-based for, as the end iterator is evaluated only once, at the beginning of the loop.
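
To make the "evaluated only once" point concrete, a range-based for can be read as roughly the hand-written loop below. The names forEachLikeRangeFor and sumAll are hypothetical, and this is a simplified sketch of the expansion rather than the exact wording of the standard.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Roughly what `for (auto& e : c) f(e);` does: begin() and end() are each
// called once, before the first iteration, and the results are reused.
template <typename Container, typename F>
void forEachLikeRangeFor(Container& c, F f) {
    auto first = c.begin();
    auto last = c.end();  // cached; never refreshed while looping
    for (; first != last; ++first) f(*first);
}

// Usage: sums a vector without modifying it, so both cached iterators
// remain valid for the whole loop.
int sumAll(std::vector<int>& v) {
    int sum = 0;
    std::size_t visits = 0;
    forEachLikeRangeFor(v, [&](int x) { sum += x; ++visits; });
    assert(visits == v.size());  // nothing was added or removed meanwhile
    return sum;
}
```

If the callback pushed into the vector, both cached iterators could be invalidated, which is why such a loop must not grow the container it is walking.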

But if you really do need stable iterators/references, substituting in std::list<Entity> would be very easy. I would change the erase/remove to use list's member remove_if instead. It will be more efficient.
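
As a sketch of that substitution, assuming a simplified Entity with a public alive flag (the struct and the function name are hypothetical):

```cpp
#include <cassert>
#include <cstddef>
#include <iterator>
#include <list>

struct Entity {  // hypothetical stand-in for the question's Entity
    int id;
    bool alive;
};

// Builds a small list, kills one entity, and removes it with list's member
// remove_if. Returns the number of entities left.
std::size_t removeDeadFromList() {
    std::list<Entity> entities;
    entities.push_back(Entity{1, true});
    entities.push_back(Entity{2, true});
    entities.push_back(Entity{3, true});

    Entity& first = entities.front();  // stays valid across the calls below
    std::list<Entity>::iterator second = std::next(entities.begin());
    second->alive = false;             // "kill" entity 2

    entities.push_back(Entity{4, true});  // invalidates nothing in a list
    entities.remove_if([](const Entity& e) { return !e.alive; });

    assert(&first == &entities.front());  // same node, same address
    assert(first.id == 1);
    return entities.size();
}
```

The member remove_if unlinks nodes instead of shifting elements, so only iterators and references to the removed entities become invalid.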

If you do this, and performance testing (not guessing) indicates you've suffered a performance hit over your existing MemoryManager, you can optimize list by using a "stack allocator" such as the one demonstrated here:

http://howardhinnant.github.io/stack_alloc.html

This allows you to preallocate space (could be on the stack, could be on the heap), and have your container allocate from that. This will be both high performance and cache-friendly until the pre-allocated space is exhausted. And you've still got your iterator/pointer/reference stability.

In summary:

1. Find out / tell us if unique_ptr<Entity> is actually necessary because Entity is a base class. Prefer container<Entity> over container<unique_ptr<Entity>>.
2. Do you actually need iterator/pointer/reference stability? Your sample code does not. If you don't actually need it, don't pay for it. Use vector<Entity> (or vector<unique_ptr<Entity>> if you must).
3. If you actually need container<unique_ptr<Entity>>, can you get away with pointer/reference stability while sacrificing iterator stability? If yes, vector<unique_ptr<Entity>> is the way to go.
4. If you actually need iterator stability, strongly consider using std::list.
5. If you use std::list and discover via testing it has performance problems, optimize it with an allocator tuned to your needs.
6. If all of the above fails, then start designing your own data structure. If you get this far, know that this is the most difficult route, and everything will need to be backed up by both correctness and performance tests.

Jarod42 · Accepted Answer · 2013-11-29T11:17:52.613

You can implement your own iterator class.

Something like the following may help.

#include <cassert>
#include <cstddef>
#include <iterator>
#include <vector>

template <typename T, typename... Ts>
class IndexIterator : public std::iterator<std::random_access_iterator_tag, T>
{
public:
    IndexIterator(std::vector<T, Ts...>& v, std::size_t index) : v(&v), index(index) {}

    // if needed.
    typename std::vector<T, Ts...>::iterator getRegularIterator() const { return v->begin() + index; }

    T& operator *() const { return v->at(index); }
    T* operator ->() const { return &v->at(index); }

    IndexIterator& operator ++() { ++index; return *this;}
    IndexIterator operator ++(int) { IndexIterator old(*this); ++*this; return old;}
    IndexIterator& operator +=(std::ptrdiff_t offset) { index += offset; return *this;}
    IndexIterator operator +(std::ptrdiff_t offset) const { IndexIterator res (*this); res += offset; return res;}

    IndexIterator& operator --() { --index; return *this;}
    IndexIterator operator --(int) { IndexIterator old(*this); --*this; return old;}
    IndexIterator& operator -=(std::ptrdiff_t offset) { index -= offset; return *this;}
    IndexIterator operator -(std::ptrdiff_t offset) const { IndexIterator res (*this); res -= offset; return res;}

    std::ptrdiff_t operator -(const IndexIterator& rhs) const { assert(v == rhs.v); return static_cast<std::ptrdiff_t>(index) - static_cast<std::ptrdiff_t>(rhs.index); }

    bool operator == (const IndexIterator& rhs) const { assert(v == rhs.v); return index == rhs.index; }
    bool operator != (const IndexIterator& rhs) const { return !(*this == rhs); }

private:
    std::vector<T, Ts...>* v;
    std::size_t index;
};

template <typename T, typename... Ts>
IndexIterator<T, Ts...> IndexIteratorBegin(std::vector<T, Ts...>& v)
{
    return IndexIterator<T, Ts...>(v, 0);
}

template <typename T, typename... Ts>
IndexIterator<T, Ts...> IndexIteratorEnd(std::vector<T, Ts...>& v)
{
    return IndexIterator<T, Ts...>(v, v.size());
}

score 2 · Answer 3 · 2013-11-29T07:53:22.490

You could avoid moving elements of the container by maintaining a free-list (see http://www.memorymanagement.org/glossary/f.html#free.list).
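
One way a free-list could look on top of a vector is sketched below. SlotPool and all of its members are hypothetical names; this is an illustration under the assumption that callers refer to elements by slot index, not an implementation taken from the linked glossary.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical pool: elements never move to fill gaps; a released slot is
// remembered in a free-list and reused by the next acquire().
template <typename T>
class SlotPool {
public:
    // Stores value and returns the index of the slot it landed in.
    std::size_t acquire(const T& value) {
        if (!freeSlots.empty()) {
            const std::size_t i = freeSlots.back();  // reuse a released slot
            freeSlots.pop_back();
            slots[i] = value;
            return i;
        }
        slots.push_back(value);  // no free slot: grow the storage
        return slots.size() - 1;
    }

    // Marks slot i as reusable; the stored value is left in place.
    void release(std::size_t i) {
        assert(i < slots.size());
        freeSlots.push_back(i);
    }

    T& at(std::size_t i) {
        assert(i < slots.size());
        return slots[i];
    }

    std::size_t slotCount() const { return slots.size(); }

private:
    std::vector<T> slots;
    std::vector<std::size_t> freeSlots;  // indices of released slots
};
```

Slot indices stay valid for as long as the slot is held, even when the underlying vector reallocates; references obtained from at() do not.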

To avoid invalidation of references to elements you can use a std::deque if you do not insert or erase in the middle. To avoid invalidation of iterators you can use a std::list.
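
A short sketch of the std::deque guarantee, using a hypothetical function name: insertion at either end keeps references to existing elements valid, although iterators are still invalidated.

```cpp
#include <cassert>
#include <deque>

// Returns true if a reference to the first element is still good after many
// push_back calls on the deque.
bool dequeReferenceSurvivesPushBack() {
    std::deque<int> d;
    d.push_back(7);
    int& first = d.front();  // reference, not an iterator

    for (int i = 0; i < 10000; ++i)
        d.push_back(i);      // inserts at the end only

    assert(d.size() == 10001);
    return &first == &d.front() && first == 7;
}
```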

(Thanks Howard Hinnant)

edited Nov 29 '13 at 07:53

answered Nov 28 '13 at 13:54

As the `MemoryManager` is used for the development of some games, I believe losing `std::vector`'s cache-friendliness would be a bad performance loss – Vittorio Romeo Nov 28 '13 at 14:56
@VittorioRomeo: You should probably measure the performance. `vector` *is* cache friendly. But you don't have that. You have `vector<unique_ptr<T>>`, which is not so cache-friendly. Have you tried `list` or `forward_list`? These are containers that excel at keeping iterators valid. `deque` is good at maintaining stable references, but only if you don't insert into the middle. But `deque` invalidates iterators more often than even `vector`. – Howard Hinnant Nov 28 '13 at 22:28
@VittorioRomeo A good `deque` (like `boost::container::deque`) is cache friendly: it stores elements contiguously within chunks whose sizes are multiples of the cache line size. Sequential traversal of a `deque` is cache friendly and only a little slower than with `vector` (because it has to check chunk boundaries). The main performance difference is in random access cost: `deque` requires an additional memory load for each indexed access. Another performance difference is in the total number of allocations (typically `deque` requires O(N) allocations, while `vector` requires O(log N)). – Evgeny Panasyuk Nov 30 '13 at 09:02

score 0 · Answer 4 · answered Nov 23 '13 at 15:41

You can implement your own iterator class which handles things the way you prefer. Then your begin() and end() can return instances of that class. For example, your custom iterator could store an integer index and a pointer to the vector itself, thereby making the iterator remain valid even in the face of reallocation.
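
A minimal sketch of that idea follows. IndexIter, Manager and sumAndSpawn are hypothetical names, and the iterator implements only what a range-based for needs (operator*, prefix ++ and !=). Removing dead elements is left out, so this is not a full replacement for MemoryManager.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical iterator: a vector pointer plus an index, so it stays usable
// after the vector reallocates.
template <typename T>
class IndexIter {
public:
    IndexIter(std::vector<T>& v, std::size_t i) : vec(&v), index(i) {}
    T& operator*() const { assert(index < vec->size()); return (*vec)[index]; }
    IndexIter& operator++() { ++index; return *this; }
    bool operator!=(const IndexIter& rhs) const { return index != rhs.index; }
private:
    std::vector<T>* vec;
    std::size_t index;
};

// Hypothetical single-vector manager: elements in [0, aliveCount) are
// "alive"; anything created later waits for the next refresh().
template <typename T>
class Manager {
public:
    T& create(const T& value) { items.push_back(value); return items.back(); }
    void refresh() { aliveCount = items.size(); }
    IndexIter<T> begin() { return IndexIter<T>(items, 0); }
    IndexIter<T> end() { return IndexIter<T>(items, aliveCount); }
private:
    std::vector<T> items;
    std::size_t aliveCount = 0;
};

// Visits every alive element and creates a new one per visit.
int sumAndSpawn(Manager<int>& m) {
    int sum = 0;
    for (int x : m) {      // x is a copy, taken before create() can reallocate
        sum += x;
        m.create(x * 10);  // safe: the loop's iterators hold indices
    }
    return sum;
}
```

Only the iterators are stable here. A reference obtained through operator* or returned by create() is still invalidated when the vector reallocates, so it should not be kept across a later create().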

answered Nov 23 '13 at 15:41

John Zwinck

Is there any full example of custom iterators working with C++11? I'm having a lot of trouble getting my tests to work, as I use `erase` and `remove_if` and the template errors become very confusing – Vittorio Romeo Nov 23 '13 at 16:22
