Remove arbitrary list of items from std::vector<std::vector<T> >

Question

I have a vector of vectors, representing an array. I would like to remove rows efficiently, i.e. with minimal complexity and allocations.

I have thought about building a new vector of vectors, copying only non-deleted rows, using move semantics, like this:

    //std::vector<std::vector<T> > values is the array to remove rows from
    //std::vector<bool> toBeDeleted contains "marked for deletion" flags for each row

    //Count the new number of remaining rows
    unsigned int newNumRows = 0;
    for(unsigned int i=0;i<numRows();i++)
    {
        if(!toBeDeleted[i])
        {
            newNumRows++;
        }
    }


    //Create a new array already sized in rows
    std::vector<std::vector<T> > newValues(newNumRows);

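    //Move rows
    unsigned int newIndex = 0;
    for(unsigned int i=0;i<numRows();i++)
    {
        if(!toBeDeleted[i])
        {
            //Index the destination separately: newValues only has newNumRows rows
            newValues[newIndex++] = std::move(values[i]);
        }
    }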

    //Set the new array and clear the old one efficiently
    values = std::move(newValues);

Is this the most effective way?

Edit: I just realized that I could avoid allocating a new array by moving rows down in place. This could be slightly more efficient, and the code is much simpler:

    unsigned int newIndex = 0;
    for(unsigned int oldIndex=0;oldIndex<values.size();oldIndex++)
    {
        if(!toBeDeleted[oldIndex])
        {
            if(oldIndex!=newIndex)
            {
                values[newIndex] = std::move(values[oldIndex]);
            }

            newIndex++;
        }
    }
    values.resize(newIndex);

Thanks!

why don't you just use `std::remove_if`? I seriously doubt that your implementation is faster or uses less memory, just profile before rolling your own implementation. If you don't measure, you're just guessing. — PeterT, Apr 10 '14 at 16:40
Well, remove_if takes as argument a function which tells if the item is to be removed based on the item value only. I cannot mark the items themselves, I only have a bool table of the indexes to be deleted. Not so trivial to use remove_if here — galinette, Apr 10 '14 at 17:24
you can do `std::vector<int> vec; std::vector<bool> remVec; auto begin = std::begin(vec); auto end = std::end(vec); size_t idx = 0; std::remove_if(begin,end,[&idx,&remVec](const int& ){return remVec[idx++];});`. Although I would suggest not having the flag array in the first place. Instead of setting the flag, consider swapping the element with the last still good element at the end and maintaining an index after which all elements need to be removed. Then all you have to do is to call `resize` to do any actual removing. — PeterT, Apr 10 '14 at 18:48
oh and if it has a specific order you could also use a `std::vector<std::pair<std::vector<T>,bool>> rows;` to keep track of the flags with the element (I don't know why I keep repeating the `std::`, saying `vector<pair<vector<T>,bool>> rows;` wouldn't really confuse anybody). — PeterT, Apr 10 '14 at 18:55
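
The swap-to-the-end idea from the comment above could look roughly like this. It is only a sketch: `retire_row` and `drop_retired` are hypothetical names, and it assumes the order of the remaining rows does not need to be preserved.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical helper: retire row `i` by swapping it behind the live range.
// Rows in [0, live_count) are kept; rows in [live_count, size) are dead.
template <class T>
void retire_row(std::vector<std::vector<T>>& rows, std::size_t& live_count, std::size_t i)
{
    assert(i < live_count && live_count <= rows.size());
    --live_count;
    if (i != live_count)
        std::swap(rows[i], rows[live_count]); // O(1): only the rows' buffers are exchanged
}

// Hypothetical helper: drop all retired rows with a single resize.
template <class T>
void drop_retired(std::vector<std::vector<T>>& rows, std::size_t live_count)
{
    assert(live_count <= rows.size());
    rows.resize(live_count);
}
```

Start with `live_count = rows.size()`, call `retire_row` whenever a row would otherwise be flagged, and call `drop_retired` once at the end. Note that a retired row's slot is taken over by a different live row, so any indices the caller holds on to must be updated accordingly.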

score 2 · Accepted Answer · answered Apr 11 '14 at 20:55

This can be solved using a variation on the usual erase-remove idiom, with a lambda inside the std::remove_if that looks up the index of the current row in an iterator range of indices to be removed:

    #include <algorithm>    // find, remove_if
    #include <iostream>
    #include <vector>

    template<class T>
    using M = std::vector<std::vector<T>>; // matrix

    template<class T>
    std::ostream& operator<<(std::ostream& os, M<T> const& m)
    {
        for (auto const& row : m) {
            for (auto const& elem : row)
                os << elem << " ";
            os << "\n";
        }
        return os;
    }

    template<class T, class IdxIt>
    void erase_rows(M<T>& m, IdxIt first, IdxIt last)
    {
        m.erase(
            std::remove_if(
                begin(m), end(m), [&](auto& row) {
                auto const row_idx = &row - &m[0];
                return std::find(first, last, row_idx) != last;
            }), 
            end(m)
        );
    }

    int main()
    {
        auto m = M<int> { { 0, 1, 2, 3 }, { 3, 4, 5, 6 }, { 6, 7, 8, 9 }, { 1, 0, 1, 0 } };
        std::cout << m << "\n";

        auto drop = { 1, 3 };
        erase_rows(m, begin(drop), end(drop));

        std::cout << m << "\n";
    }

Live Example.

Note: because std::vector has move semantics from C++11 onwards, shuffling rows around in your std::vector<std::vector<T>> only involves simple pointer manipulations, regardless of your type T (it would be quite different if you wanted to delete columns, though!).
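
If the rows to drop are given as one flag per row (like the `toBeDeleted` vector in the question) rather than as a list of indices, the same address-difference trick can be combined with a direct flag lookup. The following is a sketch, not part of the answer above: `erase_flagged_rows` is a hypothetical name, and like `erase_rows` it assumes that `std::remove_if` calls the predicate on each element while it is still at its original position, which holds for the common standard library implementations.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical variant of erase_rows that takes one flag per row.
template <class T>
void erase_flagged_rows(std::vector<std::vector<T>>& m, std::vector<bool> const& to_be_deleted)
{
    assert(to_be_deleted.size() == m.size());
    if (m.empty()) return;

    auto const* const first_row = m.data(); // address of row 0
    m.erase(
        std::remove_if(m.begin(), m.end(), [&](std::vector<T> const& row) {
            // Recover the row's original index from its address, then look up its flag.
            auto const row_idx = static_cast<std::size_t>(&row - first_row);
            return to_be_deleted[row_idx];
        }),
        m.end());
}
```

Each flag lookup is constant time, so the whole pass is linear in the number of rows, and the surviving rows are move-assigned into place rather than copied.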

@Walter it's `std::initializer_list`, for which `std::begin()` and `std::end()` work the same as for the various standard containers. — TemplateRex, Apr 11 '14 at 22:17
@galinette yes, it computes the row index as the difference between addresses of the first and the current row. — TemplateRex, Apr 12 '14 at 12:07
@TemplateRex : row and m[0] are iterators, right? So row_idx is a difference between two pointers on iterators?? — galinette, Apr 12 '14 at 15:13
@galinette they are both pointers, `&m[0]` because it's the address of `m`'s first row, `&row` because it's the address of the element pointed to by the iterator inside `std::remove_if`. — TemplateRex, Apr 12 '14 at 16:45
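
The pointer arithmetic discussed in these comments can be isolated in a small helper. This is an illustrative sketch only: `row_index` is a hypothetical name, and it assumes `row` really refers to an element stored in `m`, since `std::vector` keeps its elements contiguously and the difference of two element addresses is then the index.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper: index of `row` within `m`.
// Precondition: `row` is a reference to one of the elements of `m`.
template <class T>
std::ptrdiff_t row_index(std::vector<std::vector<T>> const& m, std::vector<T> const& row)
{
    assert(!m.empty());
    return &row - m.data(); // pointer difference counts elements, not bytes
}
```

For a matrix `m` with three rows, `row_index(m, m[2])` yields 2, which is the same value the lambda in `erase_rows` computes with `&row - &m[0]`.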
