How do you determine the size of the nodes created by a 'std::map' for use with 'boost::pool_allocator' (in a cross-platform way)?

Question

UPDATE

Per comments, answer, and additional research, I have come to the conclusion that there is typically no difference between a set and a map in terms of node overhead. My question that follows is really:

How do you determine node overhead for convenient use of boost::pool_allocator as a custom allocator?

And, a further update: The node overhead is probably never going to be more than the size of 4 pointers, so just purging the Boost Pool for sizeof(T), sizeof(T)+sizeof(int), sizeof(T) + 2*sizeof(int), sizeof(T) + 3*sizeof(int) and sizeof(T) + 4*sizeof(int) (or int64_t for 64-bit systems) should be fine. That is what I am actually doing, and it works.

I want to use a boost memory pool to manage tens of millions of tiny, identically-sized objects by avoiding calls to the destructors of these objects, and instead freeing the memory in single swaths containing many instances per swath.

I posted another question about this issue, and the answer to that question has led me to understand that the question I really need to answer is the one I am asking here.

Consider the following code:

class Obj { // ... has an operator<() ... };

typedef std::set<Obj, std::less<Obj>, boost::fast_pool_allocator<Obj>> fast_set_obj;

// Deliberately do not use a managed pointer -
// I will *NOT* delete this object, but instead
// I will manage the memory using the memory pool!!!
fast_set_obj * mset = new fast_set_obj;

// ... add some Obj's to 'mset'
mset->insert(Obj());
mset->insert(Obj());

// Do something desireable with the set ...
...

// All done.
// It's time to release the memory, but do NOT call any Obj destructors.
// The following line of code works exactly as intended.

boost::singleton_pool<boost::fast_pool_allocator_tag, sizeof(Obj const)>::purge_memory();

If you step into the purge_memory() function of the last line of code, you will see that the fast_pool_allocator nicely proceeds to free the memory from the system, just as desired. (No Obj destructors are called, because as noted in the linked question above, the job of the custom allocator is just to allocate and free memory, not to call constructors or destructors.)

It works precisely as desired. Great!

However, here is the problem. If you replace the set with a map, and then try to use the boost::pool_allocator, nothing happens in the call to purge_memory()!

typedef std::map<int, int, std::less<int>,
                 boost::fast_pool_allocator<std::pair<int const, int>>>
        fast_map_obj;

// Ditto above: Deliberately do not use managed pointer
mast_map_obj * mmap = new fast_map_obj;

mmap[5] = Obj();
mmap[6] = Obj();

...

// Uh-oh.  The following line of code DOES NOTHING, because I was using a map, not a set!

boost::singleton_pool<boost::fast_pool_allocator_tag,
                     sizeof(std::pair<int const, int>)>::purge_memory();

As noted, the last line of code does nothing. The reason is that the boost::fast_pool_allocator is a singleton that only responds to, and manages memory for, objects of a given size that is fixed at compile-time. This is why the sizeof argument is used in the expression that calls purge_memory() - it tells the Boost Pool code which of the various different singleton memory pools to purge (assuming the requested one exists due to previously having been instantiated).

Unfortunately, because the memory pool selected to be purged is size-dependent, it is critical that the size of the internal objects managed (i.e., created and destroyed in memory allocated via calls to the custom allocator) is known. Sadly, for std::map, the size of the internal objects managed by the map is neither sizeof(Obj) nor sizeof(std::pair<int const, Obj>).

My question is: How do you rigorously, and in a cross-platform way that works according to the C++11 standard, determine the size of the objects internally managed by std::map for use with boost::fast_pool_allocator?

Is this even possible?

It is kind of weird your example works with std::set, but not with std::map, because internally they use the same node type (_Rb_tree_node in libstdc++; quite probably size rounding is at work in the singleton pool). Unfortunately, this type (and the actual tree member) are private, so you can't look at them from outside. — oakad, Apr 09 '14 at 03:09
All examples in the Boost Pool documentation involve lists and sets - none involve maps. Also, I have read elsewhere that `map` nodes do not stor `pair`s. — Dan Nissenbaum, Apr 09 '14 at 03:40
At least in libstdc++ map (Rb_tree) nodes do store map::value_type as is (so it will be pair). And std::list<> is not really that different from std::set or std::map, the only difference is the size of overhead (2 pointers for list, circa 4 pointers for set and map). — oakad, Apr 09 '14 at 03:48
I don't think it's true that the size of objects allocated internally by `map` when it calls the custom allocator is the size of `std::pair`, while I think it *is* true that the size of objects allocated internally by `set` when it calls the custom allocator is the size of `K const`. — Dan Nissenbaum, Apr 09 '14 at 04:16
Why don't you check the source and see for yourself? stl is open source on all platforms I'm aware of. — oakad, Apr 09 '14 at 04:48
I am doing that right now. It is likely to require hours, if not days, for me to understand, to be able to answer the question. The internals of `std::map` are extremely intricate. Perhaps you can understand it more quickly? — Dan Nissenbaum, Apr 09 '14 at 04:54
The pool pertains to the internal storage of the container. The container uses the allocator's `rebind` member to create a new allocator that works for the types used in the implementation. Note that your `set` or `map` itself will _not_ be allocated in the pool: only the internal storage. — Cory Nelson, Apr 09 '14 at 18:59
@oakad - Apparently, there is a specialization for `std::set` for which the node size is the size of an `int` (see Nevin's answer); but for more complex types, there is additional node overhead. I was wrong (and you're almost certainly right) that there's 'typically' no difference in node storage overhead between a `set` and a `map` - I was fooled because the `set` I was testing with stored `ints`. — Dan Nissenbaum, Apr 10 '14 at 03:17
On your platform they probably went with "sorted array" approach for integral type sets (btw, which toolchain are you using?). From what I understand, this was ok as long as older C++ standards are concerned, but should not be done in C++11 (if I'm not mistaken). — oakad, Apr 10 '14 at 03:33
@oakad - If it works for `int`'s, I would think it would work just as well for any other POD to use a sorted array and avoid the node overhead (the C++11 standard notwithstanding)... I wonder if it tests for the trait `is_pod`? — Dan Nissenbaum, Apr 10 '14 at 03:37
Suddenly: boost/container/flat_set.hpp and boost/container/flat_map.hpp — oakad, Apr 10 '14 at 03:41

score 2 · Answer 1 · answered Apr 09 '14 at 18:54

The problem is that you don't know the internal type which set uses for nodes.

While I haven't figured out how to determine this at compile time, you can write a tracing allocator which prints out the sizeof the node type when allocate is called, as in:

template<typename T>
struct SimpleAllocator : private std::allocator<T>
{
    using value_type = T;
    using pointer = typename std::allocator<T>::pointer;
    using size_type = typename std::allocator<T>::size_type;

    pointer allocate(size_type n)
    {   
        std::cout << "Allocator sizeof(T)==" << sizeof(T) << '\n';
        return std::allocator<T>::allocate(n);
    }   

    void deallocate(pointer p, size_type n)
    { return std::allocator<T>::deallocate(p, n); }
};

And a little test program (I'm testing with sets of ints):

std::set<int, std::less<int>, SimpleAllocator<int>> s;
s.insert(2);

On my system, I get output of:

Allocator sizeof(T)==32

score 2 · Answer 2 · edited Apr 11 '14 at 14:22

There isn't a truly cross platform way to deduce what you are after, since every map implementation is unhappy in its own way, but generally it is the map node that will be allocated in the pool.

This looks different for the different implementations of the standard library, so you code will have to #ifdef-ed for the different versions. With a fragility warning in place, here are the main ones for the g++/clang++/msc compilers and their std libs:

// libstdc++
boost::singleton_pool<boost::fast_pool_allocator_tag,
    sizeof(std::_Rb_tree_node<fast_map_obj::value_type>)>::purge_memory()

// libc++
boost::singleton_pool<boost::fast_pool_allocator_tag,
    sizeof(std::__tree_node<fast_map_obj::value_type, void*>)>::purge_memory()

// msvc 2013 (incl. nov ctp)
boost::singleton_pool<boost::fast_pool_allocator_tag,
    sizeof(fast_map_obj::_Node)>::purge_memory()

Here are some useful links for finding the necessary defines:

http://www.boost.org/doc/libs/1_55_0/boost/config/compiler/

http://sourceforge.net/p/predef/wiki/Compilers/

Might I be right that a good proposal for an enhancement of `boost::pool_allocator` would be for the allocator to determine for itself the correct underlying pool to use, seeing as it is associated with one and only one container of known `K, V`? — Dan Nissenbaum, Apr 11 '14 at 05:27

How do you determine the size of the nodes created by a 'std::map' for use with 'boost::pool_allocator' (in a cross-platform way)?

2 Answers2

Linked