There are several posts focused on the fast removal of elements given by indices from a vector. This question represents a slightly modified version of the problem.

There is a vector of elements:

std::vector <double> numbers{ 100, 200, 300, 400, 500, 600 };

and the corresponding binary index:

std::vector<bool> idxs{ 0, 1, 0, 1, 0, 1 };   

What is the fastest way to remove from numbers the elements whose corresponding entry in idxs is zero? The vector may contain millions of elements.

I experimented with std::remove_if(), but this is not correct, since the predicate only sees the values in numbers and never the flags in idxs:

numbers.erase(std::remove_if(numbers.begin(), numbers.end(), [](bool b)->bool
    {
        return b == 1;
    }), numbers.end());
Remy Lebeau
justik
  • `index` seems to be a misnomer here. – jarmod Jun 27 '22 at 16:57
  • Perhaps a vector isn't the correct container type for this use case? What else are you doing with the data? – Some programmer dude Jun 27 '22 at 16:58
  • What exactly is a "binary index"? Your question is not clear. – Mark Ransom Jun 27 '22 at 16:59
  • If you want to remove items from `numbers` based on the values in `idxs`, you'll have to use `idxs` and keep it in sync with `numbers`. – user4581301 Jun 27 '22 at 17:03
  • @Some programmer dude: I received this data as the result of a spatial analysis and would like to use a subset of it for further computations. The real structure is more complicated, but this works well as an illustrative example. – justik Jun 27 '22 at 17:03
  • If you're single-threaded, take two indices `from` and `to`, increase `to` in each step and `from` only if `!idxs[from]` in the loop. Inside the loop, swap `numbers[from]` with `numbers[to]` if `idxs[from]` && `from != to`. – lorro Jun 27 '22 at 17:08

1 Answer

Unfortunately, the standard library offers no ready-made algorithm for this. You simply have to implement a custom erase function yourself:

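// widx marks the next slot to write a kept element into.
auto widx = numbers.begin();
// bidx has a different iterator type than ridx, so it cannot share
// the `auto` declaration in the for-init statement.
auto bidx = idxs.cbegin();
for (auto ridx = numbers.cbegin(); ridx != numbers.cend(); ++ridx, ++bidx) {
  if (*bidx) *widx++ = *ridx;  // keep only elements whose flag is set
}
// Everything from widx onward is leftover; drop it in one call.
numbers.erase(widx, numbers.end());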
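
For reference, the same read/write compaction can be wrapped in a small reusable function. This is only a sketch: the name `keep_flagged` is hypothetical and not part of the standard library, and it assumes that the flag vector has exactly as many entries as the data vector.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical helper (name chosen for illustration only):
// keeps numbers[i] where keep[i] is true and erases everything else.
// Assumption: keep.size() == numbers.size().
template <typename T>
void keep_flagged(std::vector<T>& numbers, const std::vector<bool>& keep) {
    assert(numbers.size() == keep.size());
    std::size_t write = 0;  // next slot for a kept element
    for (std::size_t read = 0; read < numbers.size(); ++read) {
        if (keep[read]) {
            // Skip the self-move when nothing has been dropped yet.
            if (write != read) numbers[write] = std::move(numbers[read]);
            ++write;
        }
    }
    // A single erase at the tail, so the whole operation stays linear.
    numbers.erase(numbers.begin() + static_cast<std::ptrdiff_t>(write),
                  numbers.end());
}
```

Calling `keep_flagged(numbers, idxs)` with the vectors from the question should leave `numbers` holding 200, 400 and 600. Each element is visited once and the tail is erased once, which avoids the quadratic cost of erasing elements one at a time.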
Goswin von Brederlow
  • Thanks, that's what I was concerned about :-) – justik Jun 27 '22 at 17:34
  • Copying a double needlessly is likely cheaper than paying for the branch. So `*widx = *ridx; widx += static_cast(*bidx);`. But the compiler might figure this out on its own. Benchmark if this is critical. Also, if speed matters, `std::vector<bool>` might be a bottleneck. A vector of `ptrdiff_t` is possibly faster. – bitmask Jun 27 '22 at 19:35
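
The branch-free idea from the last comment could look roughly like the sketch below. The function name `keep_flagged_branchless` is hypothetical, and storing the flags as a `std::vector<std::ptrdiff_t>` containing only 0 or 1 is an assumption taken from the comment's suggestion, not something the question uses. Whether this is actually faster depends on the compiler and the data, so it should be benchmarked.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper (name chosen for illustration only).
// Assumptions: step.size() == numbers.size(), and every step[i] is
// either 0 (drop numbers[i]) or 1 (keep numbers[i]).
void keep_flagged_branchless(std::vector<double>& numbers,
                             const std::vector<std::ptrdiff_t>& step) {
    assert(numbers.size() == step.size());
    auto widx = numbers.begin();  // next slot for a kept element
    for (std::size_t i = 0; i < numbers.size(); ++i) {
        *widx = numbers[i];  // always copy, even if the element is dropped
        widx += step[i];     // advance only when the element is kept
    }
    // Slots from widx onward hold leftovers; remove them in one call.
    numbers.erase(widx, numbers.end());
}
```

The write position never overtakes the read position, so the unconditional copy cannot overwrite an element that has not been read yet.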