
I am interested in the performance of multi_index_container for the following use case:

struct idx_1 {};
struct idx_2 {};

typedef multi_index_container<
    Object,
    indexed_by<
      // Keyed by: idx1
      hashed_unique<
        tag<idx_1>,
        unique_key >,
      // Keyed by: (attribute1, attribute2 and attribute3)
      ordered_non_unique<
        tag<idx_2>,
        composite_key<
          Object,
          attribute1,
          attribute2,
          attribute3 > >
    >
  > ObjectMap;
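
For concreteness, a compilable version of the above might look like the following; the Object members and their types are just placeholders for illustration, not my real types:

#include <string>
#include <boost/multi_index_container.hpp>
#include <boost/multi_index/hashed_index.hpp>
#include <boost/multi_index/ordered_index.hpp>
#include <boost/multi_index/composite_key.hpp>
#include <boost/multi_index/member.hpp>

using boost::multi_index_container;
using namespace boost::multi_index;

struct Object {
  std::string unique_key;  // 1 unique key
  int attribute1;          // 3 attributes, each with only a few possible values
  int attribute2;
  int attribute3;
};

typedef multi_index_container<
    Object,
    indexed_by<
      // Keyed by: unique_key
      hashed_unique<
        tag<idx_1>,
        member<Object, std::string, &Object::unique_key> >,
      // Keyed by: (attribute1, attribute2, attribute3)
      ordered_non_unique<
        tag<idx_2>,
        composite_key<
          Object,
          member<Object, int, &Object::attribute1>,
          member<Object, int, &Object::attribute2>,
          member<Object, int, &Object::attribute3> > >
    >
  > ObjectMap;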

I need a map to store the objects, and there will be more than 300,000 of them. Each object has 1 unique key and 3 attributes. The details of the keys:

  1. the unique key is, as the name says, unique
  2. each attribute has only a few possible values, say there are only 16 combinations in total. So with 300,000 objects, each combination corresponds to a list of roughly 300,000/16 objects
  3. attribute1 occasionally needs to be modified from one value to another
  4. lookup is always done via the unique key, while the composite key is used to iterate over objects matching one or several attributes

For such a use case, multi_index_container is a very good fit, as I don't need to maintain several maps independently. For the unique key part, I believe hashed_unique is a better candidate than ordered_unique.

But I am not at all comfortable with the "ordered_non_unique" part. I don't know how it is implemented in Boost. My guess is that Boost maintains a single list of objects for each combination, similar to unordered_map (forgive me if that's too naive!). If that's the case, modifying an attribute of an existing object will be a big pain, as it requires 1) going through a long list of objects for a particular combination, 2) executing the equality comparison, and 3) moving the object to the destination combination.

The steps that I suspect have high latency:

ObjectMap objects_;
auto& by_idx1 = objects_.get<idx_1>();  // hashed index keyed by unique_key
auto it = by_idx1.find(some_unique_key);
Object new_value;
by_idx1.modify(it, [&](Object& object) {  // the modifier takes a non-const reference
  object = new_value;
});

My concern is whether the last "modify" call has some linear behavior, i.e. whether it has to go through a potentially long list of objects under one combination, as described above...

xiaodong
  • "modify the attribute an existing object will be a big pain as it requires to 1) go through a long list of objects for a particular combination 2) execute the equal comparison 3) and move the destination combination." - you should describe this more thoroughly... when you want to make the modification, are you saying you will know the attribute[1-3] values but not the unique key, but as you iterate the matches you'll somehow recognise a single value to be modified by comparing some other field(s)? If you want that more efficient, clearly you need an extra index on *that/those* field(s). – Tony Delroy Oct 23 '14 at 05:48
  • @TonyD, I use the unique_key to find the object and call the "modify" function of multi_index_container to update the attributes. I believe the container will automatically re-arrange the layout based on the updated attributes. I am worried about this "re-arrangement" operation. I have edited the question to make it clearer. – xiaodong Oct 23 '14 at 05:59
  • I think you should just implement it and then measure to see if there's a performance problem... I'd be very surprised. – Tony Delroy Oct 23 '14 at 06:25

2 Answers


As this is a very specific piece of code, I'd suggest you benchmark and profile it using a large amount of real-world data.

Fabio A. Correa

As Fabio comments, your best option is to profile the case and see the outcome. Anyway, an ordered_non_unique index is implemented exactly as a std::multimap, namely via a regular rb-tree with the provision that elements with equivalent keys are allowed to coexist in the container; no lists of equivalent elements or anything. As for modify (for your particular use case replace is a better fit), the following procedure is executed:

  • Check if the element is in place: O(1).
  • If not, rearrange: O(log n), which for 300,000 elements amounts to a maximum of 19 element comparisons (not 300,000/16=18,750 as you suggest): these comparisons are done lexicographically on the triple (attribute1, attribute2, attribute3). Is this fast enough or not? Well, that depends on your performance requirements, so only profiling can really tell.
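
For illustration, updating attribute1 through the hashed index could be written as follows; some_unique_key and new_attribute1_value are placeholders, and the member names are taken from the schematic definition in the question:

auto& by_idx1 = objects_.get<idx_1>();
auto it = by_idx1.find(some_unique_key);  // O(1) average lookup in the hashed index
if (it != by_idx1.end()) {
  Object copy = *it;                      // copy the element
  copy.attribute1 = new_attribute1_value;
  by_idx1.replace(it, copy);              // O(1) if the position in idx_2 does not
}                                         // change, O(log n) rearrangement otherwise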
Joaquín M López Muñoz
  • I don't get the part about an average of 300,000/16 elements having the same key: how is it stored internally in either multimap or the ordered_non_unique index? The container cannot store them in a tree-like structure, because it doesn't know whether an element should go into the left or the right subtree when they all have the same key value. – xiaodong Oct 24 '14 at 08:53
  • The algorithms for rb-tree insertion work the same way with unique and non-unique elements; the only added characteristic in the case of equivalent elements is that they end up inserted in such a way that linear traversal of the structure meets all equivalent elements together. For instance if you insert (1,2,4,2,3,5,2,4) the resulting traversal is 1,2,2,2,3,4,4,5. – Joaquín M López Muñoz Oct 24 '14 at 09:25
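
The same grouping can be observed with a plain std::multiset, which is also a duplicate-allowing rb-tree; this is shown only as an illustration, not Boost-specific code:

#include <iostream>
#include <set>

int main() {
  std::multiset<int> s;           // rb-tree that allows equivalent elements
  for (int x : {1, 2, 4, 2, 3, 5, 2, 4})
    s.insert(x);
  for (int x : s)
    std::cout << x << ' ';        // in-order traversal prints: 1 2 2 2 3 4 4 5
  std::cout << '\n';
}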