Boost Multi-Index remove elements using list of indexes

Question

My multiindex looks like this:

using pair = std::pair<int, int>;

struct Hasher{
    std::size_t operator()(const int& k) const{
        return std::hash<int>()(k);
    }
};

using memberFirst = member<
            pair,
            pair::first_type,
            &pair::first
            >;
using memberSecond = member<
            pair,
            pair::second_type,
            &pair::second
            >;

using Container = multi_index_container<
    pair,
    indexed_by<
        random_access<>,
        hashed_non_unique<memberFirst, Hasher>,
        hashed_non_unique<memberSecond, Hasher>
        >
    >;

Container container;

I have std::vector<int> indexes{10, 32, 55, 66};. This vector represent indexes, what I want to remove.

When I remove something I have to preserve insert order. For example:

1,2,3,4,5,6,7,8,9 // remove {2,5,7}
1,3,4,6,8,9

To achieve this I can use:

int index = 0;
auto it = indexes.begin();
container.remove_if([&index, &it](auto& e){
    (void)e; //never use
    if(index++ == *it){
        ++it;
        return true;
    }
    return false;
});

But it is not optimal way to do it, because I know exactly where to start looking for element: conteiner.begin()+indexes[0]; and I don't know how elements are removed:

index was remove and other indexes were moved by one to the left
index was remove and other indexes were move by number of items removed before to the left

So my gool is something like this(pseudo c++ code):

iterator remove_if(iterator begin, iterator end, lambda){
    int moveBy = 0;
    while(begin != end){
        if(lambda(begin.value)){
            ++moveBy;
        }else if(moveBy){
            (begin-moveBy).value = begin.value;
        }
        ++begin;
    }
    return end-moveBy;
}

And I could use it like this:

container.erase(remove_if(container.begin()+indexes[0], container.end(), [..](..){...}), container.end());

Or there is more clever way to do this.

Question: How can I implement it or how can I improve it!

Your "gool" looks like a [`std::[stable_]partition`](https://en.cppreference.com/w/cpp/algorithm/stable_partition) — sehe, Nov 04 '21 at 15:27

Joaquín M López Muñoz · Answer 1 · 2021-11-05T08:36:00.390

You can use relocate to efficiently move target elements to the end of the index and then erase them all in one fell swoop:

Live Coliru Demo

#include <algorithm>
#include <initializer_list>
#include <cassert>

template<typename RandomAccessIndex,typename FwdIterator>
void remove_positions(RandomAccessIndex& i,FwdIterator first,FwdIterator last)
{
  assert(std::is_sorted(first,last));
  assert(last==std::adjacent_find(first,last));
  
  if(first==last)return;
  
  std::size_t n=*first++,s=1;
  for(;first!=last;++first,++s){
    std::size_t m=*first;
    i.relocate(i.begin()+m,i.begin()+n,i.begin()+n+s);
    n=m-s;
  }
  i.relocate(i.end(),i.begin()+n,i.begin()+n+s);
  i.erase(i.end()-s,i.end());
}

template<typename RandomAccessIndex,typename Seq>
void remove_positions(RandomAccessIndex& i,const Seq& seq)
{
  remove_positions(i,seq.begin(),seq.end());
}

template<typename RandomAccessIndex>
void remove_positions(RandomAccessIndex& i,std::initializer_list<std::size_t> seq)
{
  remove_positions(i,seq.begin(),seq.end());
}

// testing

#include <boost/multi_index_container.hpp>
#include <boost/multi_index/hashed_index.hpp>
#include <boost/multi_index/random_access_index.hpp>
#include <boost/multi_index/key.hpp>
#include <iostream>

using namespace boost::multi_index;
using pair=std::pair<int,int>;
using container=multi_index_container<
  pair,
  indexed_by<
    random_access<>,
    hashed_non_unique<key<&pair::first>>,
    hashed_non_unique<key<&pair::second>>
  >
>;

int main()
{
  container c={
    {0,0},{1,1},{2,2},{3,3},{4,4},
    {5,5},{6,6},{7,7},{8,8},{9,9}
  };
  
  remove_positions(c,{2,5,7});
  
  for(const pair& p:c)std::cout<<p.first<<" ";
}

Output

0 1 3 4 6 8 9

Relocate is a nice building block. I would probably never do it this way unless profiler told me it was significantly faster than using an algorithm to do the effective remove_if. It might be I guess. — sehe, Nov 04 '21 at 19:02
Definitely profiling is the way to go. From a theoretical point of view, this is linear time, though — Joaquín M López Muñoz, Nov 04 '21 at 19:19
With sufficient optimization looks like this is just as well: https://godbolt.org/z/9frnaj38d — sehe, Nov 04 '21 at 19:29
Hi, thanks for your time. It's what I was looking for. In the end (after a lot of tests) this approach is still slower then using built-in method remove_if. It's a bit sad that this container doesn't have overloaded this function for specific starting point... If I could somehow use std::remove_if I could make the same result like I wanted to do, without iterating by every element in the container. Erase function is incredible slow for this approach. — Apsik, Nov 28 '21 at 10:47

sehe · Answer 2 · 2021-11-04T15:47:31.373

I'd use standard library algorithms here.

It strikes me that you can use the random_access iterator with std::stable_partition to sort all the removables at the end before erasing in bulk:

using Record = std::pair<int, int>;

struct Hasher : std::hash<int> { };
using memberFirst  = bmi::member<Record, Record::first_type, &Record::first>;
using memberSecond = bmi::member<Record, Record::second_type, &Record::second>;

using Container = bmi::multi_index_container<
    Record,                                          //
    bmi::indexed_by<                                 //
        bmi::random_access<>,                        //
        bmi::hashed_non_unique<memberFirst, Hasher>, //
        bmi::hashed_non_unique<memberSecond, Hasher> //
        >>;

I renamed the value type to Record to avoid confusion with std::pair

void remove_indices(Container& from, std::vector<int> const& removables) {
    std::vector<std::reference_wrapper<Record const>> //
        arrangement(from.begin(), from.end());

    auto pivot = std::stable_partition(
        arrangement.begin(), arrangement.end(), [&](Record const& rec) {
            auto index = from.iterator_to(rec) - from.begin();
            return (end(removables) ==
                    std::find(removables.begin(), removables.end(), index));
        });

    from.rearrange(begin(arrangement));
    auto n_remaining = pivot - arrangement.begin();
    from.erase(from.begin() + n_remaining, from.end());
}

Demo Live On Compiler Explorer:

Before [(1, 11), (2, 22), (3, 33), (4, 44), (5, 55), (6, 66), (7, 77), (8, 88), (9, 99)]
After [(1, 11), (2, 22), (4, 44), (5, 55), (7, 77), (9, 99)]

Simpler And More Efficient?

It strikes me that the complexity largely vanishes if you know that:

indexes are a fixed order
not duplicated

So, let's use that idea as well:

void remove_indices(Container& from, std::set<int, std::greater<> > const& descending) {
    for (auto& index : descending)
        from.erase(from.iterator_to(from[index]));
}

This trades efficiencies and complexities. Use taste and profilers to decide which version to use. See it live again:

Live On Compiler Explorer

#include <boost/multi_index/hashed_index.hpp>
#include <boost/multi_index/random_access_index.hpp>
#include <boost/multi_index/member.hpp>
#include <boost/multi_index_container.hpp>
namespace bmi = boost::multi_index;

using Record = std::pair<int, int>;

struct Hasher : std::hash<int> { };
using memberFirst  = bmi::member<Record, Record::first_type, &Record::first>;
using memberSecond = bmi::member<Record, Record::second_type, &Record::second>;

using Container = bmi::multi_index_container<
    Record,                                          //
    bmi::indexed_by<                                 //
        bmi::random_access<>,                        //
        bmi::hashed_non_unique<memberFirst, Hasher>, //
        bmi::hashed_non_unique<memberSecond, Hasher> //
        >>;

void remove_indices(Container& from, std::set<int, std::greater<> > const& descending) {
    for (auto& index : descending)
        from.erase(from.iterator_to(from[index]));
}

#include <fmt/ranges.h>
int main() {
    Container container{
        {1, 11}, {2, 22}, {3, 33},   {4, 44},   {5, 55},   {6, 66},   {7, 77},
        {8, 88}, {9, 99},
    };

    fmt::print("Before {}\n", container);

    remove_indices(container, {2,5,7});
    fmt::print("After {}\n", container);
}

Prints

Before {(1, 11), (2, 22), (3, 33), (4, 44), (5, 55), (6, 66), (7, 77), (8, 88), (9, 99)}
After {(1, 11), (2, 22), (4, 44), (5, 55), (7, 77), (9, 99)}

Side Note

One of the main advantages of MultiIndex containers is that you don't have to settle for non-descript types like std::pair with nondescript names as first or second_type. Brrr. Much nicer:

Live On Compiler Explorer

#include <boost/multi_index/hashed_index.hpp>
#include <boost/multi_index/member.hpp>
#include <boost/multi_index/random_access_index.hpp>
#include <boost/multi_index_container.hpp>

#include <fmt/ostream.h>
#include <fmt/ranges.h>
#include <iomanip>

namespace bmi = boost::multi_index;

struct Record {
    int age;
    std::string name;

    friend std::ostream& operator<<(std::ostream& os, Record const& r) {
        return os << "(" << std::quoted(r.name) << "," << r.age << ")";
    }
};

using Container = bmi::multi_index_container<
    Record,
    bmi::indexed_by<
        bmi::random_access<>,
        bmi::hashed_non_unique<bmi::member<Record, std::string, &Record::name>>,
        bmi::hashed_non_unique<bmi::member<Record, int, &Record::age>>
        >>;

void remove_indices(Container&                           from,
                    std::set<int, std::greater<>> const& descending)
{
    for (auto& index : descending)
        from.erase(from.iterator_to(from[index]));
}

int main() {
    Container container{
        {1, "Alyssa"}, {2, "Baya"},   {3, "Chom"},  {4, "Dirk"},  {5, "Evie"},
        {6, "Ferd"},   {7, "Gerald"}, {8, "Homer"}, {9, "Isala"},
    };

    fmt::print("Before {}\n", container);

    remove_indices(container, {7, 2, 5, 7});
    fmt::print("After {}\n", container);
}

Prints

Before {("Alyssa",1), ("Baya",2), ("Chom",3), ("Dirk",4), ("Evie",5), ("Ferd",6), ("Gerald",7), ("Homer",8), ("Isala",9)}
After {("Alyssa",1), ("Baya",2), ("Dirk",4), ("Evie",5), ("Gerald",7), ("Isala",9)}

(first code)And still the iterator move one by one element in`container`. I don't realy care about elements that's I just remove, becouse I will not use it. Only the remaining matter.`std::vectorindexes` are sorded and unique, so first index should be the staring point in container. Why should I use 3 loops for: rearrange`arrangmend`, rearange`container` and erase elements. If in the`container{1,2,3,4,5,6}` I remove`{1,2}` and after that container lookes like this`{3,4,5,6,5,6}` it's still fine because I don't need last 2 indexes I want to remove them`{3,4,5,6}`. Question is can i do it. — Apsik, Nov 04 '21 at 16:54
I started thinking about using `modify`, by looping from: `container.begin()+indexes[0]` to `container.end() or indexes.end()` — Apsik, Nov 04 '21 at 17:02
Note that "If in thecontainer{1,2,3,4,5,6} I remove{1,2}" should result in {1,3,4,5,6}` since `{1,2}` are indices and not values? — sehe, Nov 04 '21 at 17:10
You didn't respond to the second approach. And to "why should I [...]]" - You don't have to do anything. You asked "Or there is more clever way to do this." and I'm trying to help. — sehe, Nov 04 '21 at 17:11
`{1,2,3,4 ...}` are items position before remove in container(indexes), so after removed some, they order is still the same `{2,3,5,7,9,10 ...}` container[0] == [2](starting index) and if I remove it again it can looks like this `{10,45,70 ...}`. container[0] == [10](starting index) — Apsik, Nov 04 '21 at 17:17
I sorry, but there is characters limit.. If I understand correctly second approach do something like this: `container{1,2,3,4,5,6,7}`,`remove{1,2,3}` 1. find position of index; 2. get iterator to it; 3. erase it; 4. move evrythink to left {2,3,4,5,6,7}; 5. back to point 1 if index !=remove.end() — Apsik, Nov 04 '21 at 17:22
The second approach is exactly in the answer, with a live demo. There's nothing "something like this" about it? — sehe, Nov 04 '21 at 17:25
All multi-index containers are nodebased, so "move evrythink to left" never happens. See https://stackoverflow.com/a/59163757/85371. It doesn't matter in which order elements are erased, in terms of performance. The reason to care about the ordering is *ONLY* because the indexes would change meaning — sehe, Nov 04 '21 at 17:26
Isn't it to slow? we have to `!n` times repeat moving elements, when we can do it `n` times. — Apsik, Nov 04 '21 at 17:27
I suggest you read the links I gave you, or just measure. Again, there is **no** "moving elements" due to node-based storage. — sehe, Nov 04 '21 at 17:28
Here's a take that doesn't use `std::set` instead `assert`-ing that the indices to remove are specified in ascending order and strictly unique: https://godbolt.org/z/o5KvYqTE3 — sehe, Nov 04 '21 at 17:35
Oof I copy-paste errored badly there. It should obviously have been a vector there: https://godbolt.org/z/aKv1rxfqv — sehe, Nov 04 '21 at 19:26

Boost Multi-Index remove elements using list of indexes

2 Answers2

Simpler And More Efficient?

Side Note