8

Let's say I have a std::map<int, std::string> myMap containing the data

1. Red
2. Blue
3. Green
5. Fuchsia
6. Mauve
9. Gamboge
10. Vermillion

and also an std::map<int, std::string>::iterator it pointing at the element

5. Fuchsia

I would like to do something like (making this up)

std::map<int, std::string> myHead = eject(myMap, myMap.begin(), it);

which would result in myMap containing

5. Fuchsia
6. Mauve
9. Gamboge
10. Vermillion

and myHead containing

1. Red
2. Blue
3. Green

I could accomplish this by doing something like

std::map<int, std::string> myHead;
myHead.insert(myMap.begin(), it);
myMap.erase(myMap.begin(), it);

but this seems suboptimal in at least some cases, e.g. if I pick a point such that I'm just splitting off a subtree. (I'll admit that I haven't actually thought through the actual details of the algorithmic complexity here, but if we imagine a case where the value type is extraordinarily expensive to copy then it's clear that the above can't be optimal in general.)

Question: is there a way that I can get std::map to perform this operation in an optimal manner, or do I have to write my own binary search tree where I have access to the internals to accomplish this?

Daniel McLaury
  • 4,047
  • 1
  • 15
  • 37
  • @ildjarn: I am aware of those but I don't see a way to use them to accomplish what I'm describing here. `extract` operates on a single node (without bringing along the nodes underneath it), so it seems like any solution based on that would necessarily be at least O(m), where m is the number of elements in the piece being split off. And I don't see a use for `merge` at all here. – Daniel McLaury Mar 03 '21 at 00:26

3 Answers3

4

If we're speaking asymptotic complexity, you can do this in O(log n) time for most self-balancing tree types, using two operations colloquially known as split and join. There's an extensive Wikipedia article on this.

You can not get this complexity using std::map, you'll need to roll your own or a third-party self-balancing tree implementation. If you need to do this operation often, this is well worth it. The best you can get using the standard library is O(n), which can be many orders of magnitude slower.

You can do it in O(n) in C++11 as:

template<class K, class T, class C, class A>
std::map<K, T, C, A> eject(
    std::map<K, T, C, A>& my_map,
    std::map<K, T, C, A>::iterator begin,
    std::map<K, T, C, A>::iterator end,
) {
    std::map<K, T, C, A> result;
    while (begin != end) {
        auto next = std::next(begin);
        // C++11
        result.insert(result.end(), std::move(*begin));
        my_map.erase(begin);
        // C++17 (avoids move and destruct)
        // result.insert(result.end(), my_map.extract(begin));
        begin = next;
    }
    return result;
}        
orlp
  • 112,504
  • 36
  • 218
  • 315
  • Thanks, this is helpful. Are there any well-known third-party BSTs that implement this functionality? – Daniel McLaury Mar 03 '21 at 00:27
  • @DanielMcLaury I don't know, sorry. – orlp Mar 03 '21 at 00:30
  • No, the std library can split a map in O(size of smaller map), not n lg n. – Yakk - Adam Nevraumont Mar 03 '21 at 01:44
  • Just extract and insert in order at end. Both are amortized constant time operations, on size of smaller container nodes. – Yakk - Adam Nevraumont Mar 03 '21 at 02:34
  • @Yakk-AdamNevraumont Fair enough, I'll edit the answer to `O(n)`, although that still can be many orders of magnitude slower. – orlp Mar 03 '21 at 11:06
  • @orlp You used a danging iterator. Also it needlessly copied keys in [tag:C++17] and did allocations. Both fixed. :) – Yakk - Adam Nevraumont Mar 03 '21 at 14:31
  • @Yakk-AdamNevraumont I rolled back your change. 1. I don't know what a dangling iterator is, but I can assure you `begin` is still a valid iterator after `std::move(*begin)`. 2. No, my version was `O(n)`, both `erase` and `insert` with a correct hint are amortized `O(1)`. – orlp Mar 03 '21 at 14:34
  • @Yakk-AdamNevraumont Ah I misunderstood, `erase` invalidates the iterator, you are right, will fix. The complexity was correct either way, although the C++17 method avoids a move/destruct, so it is preferable. – orlp Mar 03 '21 at 14:40
0

There is extract, but it operates on nodes not node ranges.

And you can efficiently combine maps.

But there is no efficient (faster than O(n)) range based extract.

Yakk - Adam Nevraumont
  • 262,606
  • 27
  • 330
  • 524
0

You can use move iterators, as follows

int main(){
    auto my_map = std::map<int, std::string>{
        {1, "Read"s},
        {2, "Blue"s},
        {3, "Green"s},
        {5, "Fuchsia"s},
        {6, "Mauve"s},
        { 9, "Gamboge"s },
        {10, "Vermillion"s}
    };
    auto it = my_map.begin();
    std::advance(it, 3);
    auto head_map = std::map{
        std::make_move_iterator(my_map.begin()),
        std::make_move_iterator(it)
    };
    auto tail_map = std::map{
        std::make_move_iterator(it),
        std::make_move_iterator(my_map.end())
    };
    std::cout << "The Head\n";
    for (auto [key, value]: head_map){
        std::cout << key << ":" << value << " ";
    }
    std::cout << "\n\nThe Tail\n";
    for (auto [key, value]: tail_map){
        std::cout << key << ":" << value << " ";
    }
}

Demo

asmmo
  • 6,922
  • 1
  • 11
  • 25
  • So, if I understand correctly, this addresses the issue of any potential costs associated to moving the value type, but when we call the constructor to build `tail_map` we're just inserting each element of the tail sequentially? – Daniel McLaury Mar 03 '21 at 00:23
  • @DanielMcLaury No. I think the begin and it iterators are assigned to begin and end iterators of the new map only. – asmmo Mar 03 '21 at 00:30
  • @DanielMcLaury Yeah. u r right I think u need to implement yours. – asmmo Mar 03 '21 at 00:46