C++ Marking objects for removal in STD list via nullptrs

Question

I was wondering if this is an accaptable practice:

struct Item { };
std::list<std::shared_ptr<Item>> Items;
std::list<std::shared_ptr<Item>> RemovedItems;

void Update()
{
    Items.push_back(std::make_shared<Item>()); // sample item

    for (auto ItemIterator=Items.begin();ItemIterator!=Items.end();ItemIterator++)
    {
        if (true) { // a complex condition, (true) is for demo purposes
            RemovedItems.push_back(std::move(*ItemIterator)); // move ownership
            *ItemIterator=nullptr; // set current item to nullptr
        }

        // One of the downsides, is that we have to always check if
        // the current iterator value is not a nullptr
        if (*ItemIterator!=nullptr) {
            // A complex loop where Items collection could be modified
        }
    }

    // After the loop is done, we can now safely remove our objects

    RemovedItems.clear(); // calls destructors on objects

    //finally clear the items that are nullptr
    Items.erase( std::remove_if( Items.begin(), Items.end(),
        [](const std::shared_ptr<Item>& ItemToCheck){
            return ItemToCheck==nullptr;
    }), Items.end() );
}

The idea here is that we're marking Items container could be effected by outside sources. When an item is removed from the container, it's simply set to nullptr but moved to RemovedItems before that.

Something like an event might effect the Items and add/remove items, so I had to come up with this solution.

Does this seem like a good idea?

should this be something like a garbage collector, which is removing the items later on? what is the advantage of moving it to another list and remove it later on, than remove it directly? — user1810087, Oct 07 '13 at 09:30
@itwasntpete essentially, I always know when to remove my objects (after the loop). — Grapes, Oct 07 '13 at 09:31
When you say "could be effected by outside sources" what do you mean? Is you r aim to remove things *after* the loop, rather than during the loop? Why? — doctorlove, Oct 07 '13 at 09:32
@doctorlove Think of a `Root` -> `Parent` -> `Child` relationship. An event might trigger in a `Child` that could remove `Parent` from `Root`. So the loop might break in the middle and iterator will be invalid. — Grapes, Oct 07 '13 at 09:33
I still don't understand what you're protecting against. Do you mean that destroying one of the `Item` objects could cause other items to be removed from the list? Or do you mean another thread might modify the list whlie you're iterating through it? Your solution protects against the first, but doesn't help if other threads are modifying the list. — Jonathan Wakely, Oct 07 '13 at 12:41

TemplateRex · Accepted Answer · 2013-10-07T13:46:18.277

I think you are complicating things too much. If you are a in multi-threaded situation (you didn't mention it in your question), you would certainly need some locks guarding reads from other threads that access your modified lists. Since there are no concurrent data structures in the Standard Library, you would need to add such stuff yourself.

For single-threaded code, you can simply call the std:list member remove_if with your predicate. There is no need to set pointers to null, store them and do multiple passes over your data.

#include <algorithm>
#include <list>
#include <memory>
#include <iostream>

using Item = int;

int main()
{
    auto lst = std::list< std::shared_ptr<Item> > 
    { 
        std::make_shared<int>(0), 
        std::make_shared<int>(1), 
        std::make_shared<int>(2), 
        std::make_shared<int>(3),         
    };    

    // shared_ptrs to even elements
    auto x0 = *std::next(begin(lst), 0);
    auto x2 = *std::next(begin(lst), 2);

    // erase even numbers
    lst.remove_if([](std::shared_ptr<int> p){
        return *p % 2 == 0;    
    });

    // even numbers have been erased
    for (auto it = begin(lst); it != end(lst); ++it)
        std::cout << **it << ",";    
    std::cout << "\n";

    // shared pointers to even members are still valid
    std::cout << *x0 << "," << *x2;
}

Live Example.

Note that the elements have been really erased from the list, not just put at the end of the list. The latter effect is what the standard algorithm std::remove_if would do, and after which you would have to call the std::list member function erase. This two-step erase-remove idiom looks like this

// move even numbers to the end of the list in an unspecified state
auto res = std::remove_if(begin(lst), end(lst), [](std::shared_ptr<int> p){
    return *p % 2 == 0;    
});

// erase even numbers
lst.erase(res, end(lst));

Live Example.

However, in both cases, the underlying Item elements have not been deleted, since they each still have a shared pointer associated to them. Only if the refence counts would drop to zero, would those former list elements actually be deleted.

Jonathan Wakely · Answer 2 · 2013-10-07T14:31:51.827

If I was reviewing this code I would say it's not acceptable.

What is the purpose of the two-stage removal? An unusual decision like that needs comments explaining its purpose. Despite repeated requests you have failed to explain the point of it.

The idea here is that we're marking Items container could be effected by outside sources.

Do you mean "The idea here is that while we're marking Items container could be effected by outside sources." ? Otherwise that sentence doesn't make sense.

How could it be affected? Your explanation isn't clear:

Think of a Root -> Parent -> Child relationship. An event might trigger in a Child that could remove Parent from Root. So the loop might break in the middle and iterator will be invalid.

That doesn't explain anything, it's far too vague, using very broad terms. Explain what you mean.

A "parent-child relationship" could mean lots of different things. Do you mean the types are related, by inheritance? Objects are related, by ownership? What?

What kind of "event"? Event can mean lots of things, I wish people on StackOverflow would stop using the word "event" to mean specific things and assuming everyone else knows what meaning they intend. Do you mean an asynchronous event, e.g. in another thread? Or do you mean destroying an Item could cause the removal of other elements from the Items list?

If you mean an asynchronous event, your solution completely fails to address the problem. You cannot safely iterate over any standard container if that container can be modidifed at the same time. To make that safe you must do something (e.g. lock a mutex) to ensure exclusive access to the container while modifying it.

Based on this comment:

// A complex loop where Items collection could be modified

I assume you don't mean an asynchronous event (but then why do you say "outside sources" could alter the container) in which case your solution does ensure that iterators remain valid while the "complex loop" iterates over the list, but why do need the actual Item objects to remain valid, rather than just keeping iterators valid? Couldn't you just set the element to nullptr without putting it in RemovedItems, then do Items.remove_if([](shared_ptr<Item> const& p) { return !p; } at the end? You need to explain a bit more about what your "complex loop" can do to the container or to the items.

Why is RemovedItems not a local variable in the Update() function? It doesn't seem to be needed outside that function. Why not use the new C++11 range-based for loop to iterate over the list?

Finally, why is everything named with a capital letter?! Naming local variables and functions with a capital letter is just weird, and if everything is named that way then it's pointless because the capitalisation doesn't help distinguish different types of names (e.g. using a capital letter just for types makes it clear which names are types and which are not ... using it for everything is useless.)

PeterT · Answer 3 · 2013-10-07T14:17:41.150

-2

I feel like this only complicates things a lot by having to check for nullptr everywhere. ~~Also, moving a shared_ptr is a little bit silly.~~

edit:

I think I understand the problem now and this is how I would solve it:

struct Item {
    std::list<std::shared_ptr<Item>> Children;
    std::set < std::shared_ptr<Item>, std::owner_less < std::shared_ptr<Item >> > RemovedItems;
    void Update();
    void Remove(std::shared_ptr<Item>);
};

void Item::Update()
{
    for (auto child : Children){
        if (true) { // a complex condition, (true) is for demo purposes
            RemovedItems.insert(child);
        }
        // A complex loop where children collection could be modified but
        // only by calling Item::remove, Item::add or similar
    }
    auto oless = std::owner_less < std::shared_ptr < Item >>();
    std::sort(Children.begin(), Children.end(), oless ); //to avoid use a set

    auto newEnd = std::set_difference(Children.begin(),
        Children.end(),
        RemovedItems.begin(),
        RemovedItems.end(),
        Children.begin(),
        oless);
    Children.erase(newEnd, Children.end());

    RemovedItems.clear(); // may call destructors on objects

}

void Item::Remove(std::shared_ptr<Item> element){
    RemovedItems.insert(element);
}

edited Oct 07 '13 at 14:17

answered Oct 07 '13 at 09:42

PeterT

7,981
1
26
34

1

this would fail if a complex loop does something like `Items.clear();` Also, what's wrong with std::moving a shared_ptr? If I don't move it, another copy would be created, seems like a waste. – Grapes Oct 07 '13 at 09:56
Well then you could move the loop after the `remove_if`, but there seems to be some constraint which you don't mention. – PeterT Oct 07 '13 at 09:56
I posted this above in comments: Think of a `Root` -> `Parent` -> `Child` relationship. An event might trigger in a `Child` that could remove `Parent` from `Root`. So the loop might break in the middle and iterator will be invalid. – – Grapes Oct 07 '13 at 09:56
When would this event trigger and how is the hierarchy modeled? Also, yes it would copy the shared_ptr but that's the thing, it's basically a pointer. Just like it's not wrong to pass a reference to a pointer but it just seems silly to complicate the code when the advantage of doing so is somewhere between non-existent to minuscule. But that's just my opinion. – PeterT Oct 07 '13 at 10:02
Here is an example: You're loop through all children of `Root`. For each child of `Root`, you call an `Update()` function. Inside the `Update()` function, the child will check if it has any children. And then those children, will also call `Update()`. This keeps going until one of the children clears `Root` of all children. Since `Root` is still in a loop, and now the iterator is invalid there would be an error. – Grapes Oct 07 '13 at 10:07
So, does the `Items` list store all item or just the children? – PeterT Oct 07 '13 at 10:12
Just the children. That's why those children might be effected by outside events. For example: A `Window` contains a `CloseButton` as a child. An event occurs on `CloseButton` to remove `CloseButton` from a `Window`. This event occured while iterating through `Window.Items` – Grapes Oct 07 '13 at 10:14
1

@Grapes I just don't understand how your current version is safe then. Imagine you check for `!=null` immediately after that an even sets it to `=null` and then you try to modify it. If this is multi-threaded then you need locks – PeterT Oct 07 '13 at 10:34
@Grapes I think I understand now what you mean, if elements can only be modified in the complex loop (i.e. if that's where the events are handled) than I'd do it like the update to my answer. Otherwise you're really going to need locks to make this function 100% correctly. – PeterT Oct 07 '13 at 11:25
@PeterT, a `shared_ptr` is *not* "basically a pointer", copying rather than moving performs reference-count updates which need to use atomic operations in a multithreaded program. The cost of memory synchronisation can be significant, so when a move is appropriate using `std::move` instead of copying a `shared_ptr` is good practice. See [GOTW #91](http://herbsutter.com/2013/06/05/gotw-91-solution-smart-pointer-parameters/) – Jonathan Wakely Oct 07 '13 at 13:41
I don't think the solution in this answer is an improvement on the OP's original (even ignoring the use of the non-existent `std::erase` function!) – Jonathan Wakely Oct 07 '13 at 13:59
@JonathanWakely yeah, should've bothered to compile. Anyway, my point was just that it can be done without the marking with a `nullptr` which I disliked style wise. But I agree that it may not be the best alternative – PeterT Oct 07 '13 at 14:04
It can be done without marking as `nullptr` by using `list::splice()` to move elements from one list to another, creating copies of the elements then using `set_difference` does even more unnecessary work. – Jonathan Wakely Oct 07 '13 at 14:18
@JonathanWakely well I thought his requirement was that elements should be able to be marked as removed by functions called in "complex loop here" splicing would invalidate the iterator as far as I know – PeterT Oct 07 '13 at 14:26

C++ Marking objects for removal in STD list via nullptrs

3 Answers3