
Here I have a simple multi_index container and I wonder if there is any way of forcing the multi_index to allocate the elements contiguously in memory. I thought that this would be possible if the main index is random_access.

However, this simple example shows that, unexpectedly, the elements are not contiguous in memory. Is there a combination of boost::multi_index::indexed_by that would likely result in contiguous memory?

#include <cassert>
#include <boost/multi_index_container.hpp>
#include <boost/multi_index/member.hpp>
#include <boost/multi_index/ordered_index.hpp>
#include <boost/multi_index/random_access_index.hpp>

int main(){
    typedef boost::multi_index_container<
        double,  // simply store doubles
        boost::multi_index::indexed_by<
            boost::multi_index::random_access<>
        >
    > random_access_container;

    random_access_container v; // fill container
    v.reserve(10); // also tried this
    v.push_back(1.);
    v.push_back(2.);
    v.push_back(3.);

    assert( v[0] == 1. ); // ok
    assert( *(&v[0] + 1) == v[1] ); // this fails, memory is not contiguous
}

NOTE 1: I want this for compatibility (so I can take advantage of multi_index container --with other access options--) but also use direct memory access (like it is possible with std::vector).

NOTE 2: I have just found this quote from the documentation, http://www.boost.org/doc/libs/1_61_0/libs/multi_index/doc/reference/rnd_indices.html#rnd_indices , so it is looking difficult.

Except where noted or if the corresponding interface does not exist, random access indices verify the same container requirements as std::vector plus the requirements for std::list specific list operations at [list.ops]. Some of the most important differences with respect to std::vector are:

  1. Random access indices do not provide memory contiguity, and hence do not have data member functions.

    ...

alfC

1 Answer


No, you don't have memory contiguity. The layout of a random-access index is akin to that of boost::container::stable_vector:

[Figure: random-access index memory layout]
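
As a rough illustration of this kind of layout, here is a toy model, not Boost's actual implementation. The names toy_stable_sequence, node and first_address_survives_growth are made up for this sketch. Each value sits in its own heap node and a separate pointer array provides random access, so reserve only touches the pointer array and nothing forces two nodes to be adjacent.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <vector>

// Toy model only (hypothetical names): every value lives in its own heap
// node, and a separate array of pointers gives random access to the nodes.
template <typename T>
class toy_stable_sequence {
    struct node { T value; };
    std::vector<std::unique_ptr<node>> ptrs_;  // the "pointer array"

public:
    // Grows the pointer array only; no node is preallocated here.
    void reserve(std::size_t n) { ptrs_.reserve(n); }

    // Each element gets its own allocation, wherever the allocator puts it.
    void push_back(const T& x) {
        ptrs_.push_back(std::unique_ptr<node>(new node{x}));
    }

    T& operator[](std::size_t i) {
        assert(i < ptrs_.size());
        return ptrs_[i]->value;
    }

    std::size_t size() const { return ptrs_.size(); }
    std::size_t capacity() const { return ptrs_.capacity(); }
};

// Returns true if the address of element 0 is unchanged after the
// sequence has grown (the pointer array may reallocate, nodes do not move).
inline bool first_address_survives_growth() {
    toy_stable_sequence<double> s;
    s.push_back(1.);
    double* p = &s[0];
    for (int i = 0; i < 1000; ++i) s.push_back(i);
    return p == &s[0] && *p == 1.;
}
```

In this model, &s[0] + 1 has no defined relation to &s[1], which is the same reason the assertion in the question cannot be relied on.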

An approximation to contiguous memory can be obtained if you store your elements (of type T, say) in a std::vector<T> and then use a multi_index_container of std::reference_wrapper<T>s (as produced by std::ref). This complicates object lifetime management, of course.

Edit: design rationales

There are a number of reasons why memory contiguity is hard to include in the design of the library:

  • Iterator stability is provided by all indices, not merely random-access ones. It'd be impossible to keep this (with reasonable performance) if elements were stored contiguously in a chunk of memory.
  • Suppose we have somehow managed to get a random-access index induce memory contiguity with respect to element storage. What would happen if we have two random-access indices? Seems like the first index of the container ought to have a special status in terms of dictating the layout of the whole container for this to hold water.
  • Non-random-access indices are node-based by necessity. This means that each value is stored in a bigger struct with room for additional info (rb-tree pointers etc.). If elements were stored contiguously then either a) it'd be the nodes that'd be stored contiguously, not the values themselves, which seems pretty useless (think what data() would return), or b) the nodes would have to be detached from the values, so that rather than embedding the value in the node we'd have nodes with pointers to the contiguously stored values, which is a waste of space and does not look like a reasonable default decision.
Joaquín M López Muñoz
  • Is that because (iterator) stability was a priority in the design? In principle the reserve function can achieve this but I guess it would have complicated the implementation enormously, no? – alfC Jun 02 '16 at 17:41
  • I mean, in other words, why would `stable_vector` (or `multi_index`) not reserve contiguous memory (for nodes) if the number of elements is known in advance? (I am trying to see if there is any circumstance in which the memory can be tricked to be contiguous, I am not looking for a guarantee under all conditions.) – alfC Jun 03 '16 at 00:30
  • I've added some comments to my answer hopefully addressing your questions. – Joaquín M López Muñoz Jun 03 '16 at 06:42
  • Thank you for the comments: 1) ...but perhaps they can be contiguous at least after an initial `reserve` and before the container grows later. 2) actually, I mentally worked on the assumption that the first index was special (and I made it random access for that reason). Finally memory contiguity is a weaker condition than index contiguity, if there are two random access both can be contiguous (but not both index contiguous) after an initial `reserve`. 3) I agree, I keep thinking that just *initially* the memory can be contiguous by default (or at least the nodes can be contiguous)... – alfC Jun 03 '16 at 07:30
  • ...finally, let me say that I deeply appreciate the effort and ingenuity put by you in building such a great library (and before C++11!) – alfC Jun 03 '16 at 07:32
  • OK, I see what you're driving at. If you want *node* contiguity (as opposed to *value* contiguity) probably you can resort to some pool allocator. Note that the random-access index's `reserve` does not have anything to do with that, as it merely applies to the index's internal pointer array (see diagram). Finally, I fail to see what node contiguity is useful for (especially since the node type is unknown to you as a user). – Joaquín M López Muñoz Jun 03 '16 at 07:40
  • To be honest, I didn't think before that storage was implemented as nodes (I thought that, if the supposedly privileged index was random access, the extra structure was somewhere else in memory) so I am making this up as we speak! The container could provide access to the node size, so in case the nodes are contiguous one can know the stride. But now I am seeing all the difficulty that this implies. Finally, the use of uniformly contiguous memory with strides is that (in limited circumstances) one could pass the elements of a sequence to a legacy C function (that takes pointers and strides). – alfC Jun 03 '16 at 08:04
  • @Joaquín M López Muñoz: for us the main use cases for a container are iteration and lookup of a specific element. We mainly use the multi_index to have a container of objects sorted on their id. The multi_index allows heterogeneous lookup which is not yet available in c++ 11. Perhaps the flat_set could be a drop in replacement otherwise. – gast128 Feb 20 '17 at 16:29
  • What if one uses reserve? Would the reserved block (tend to) be contiguous? – alfC Jan 22 '18 at 06:59
  • @alfC I'm afraid not: `reserve` only affects the index pointer array (the upper vector in the diagram shown in my response), no actual node is really preallocated. – Joaquín M López Muñoz Jan 22 '18 at 16:29
  • @JoaquínMLópezMuñoz I see. So the gain of using reserve is limited compared to vector.reserve. The multi index container *could* in principle take advantage of the reserve, but it doesn't. I guess that will complicate the implementation too much. (?) – alfC Jan 22 '18 at 16:35
  • @alfC At the end of the day, `multi_index_container` is a node-based container, and in this respect behaves pretty much like, say, `std::set`, which doesn't have `reserve` either. You might want to use a pool-based allocator to see if this improves efficiency and/or cache locality. – Joaquín M López Muñoz Jan 22 '18 at 17:18
  • Thank you for the clarifications. Do you think Boost.Pool is a good option that plays well with multi index? http://www.boost.org/doc/libs/1_66_0/libs/pool/doc/html/index.html – alfC Jan 22 '18 at 17:32
  • I don't really know. Boost.Pool certainly interoperates well with Boost.MultiIndex (I remember having tried it in the past), but as for resulting performance you'll have to run your own profiling and see. – Joaquín M López Muñoz Jan 22 '18 at 17:57