I'm fighting a problem with some legacy code trying to talk to modern systems. Specifically, C++11 versions of the STL (bonus points if your proposed solution works with C++03, negative points if solution only works with C++17, but I'm still interested in case I can use that as an argument for upgrading).

My function gets passed an array of void* and three function pointers for comparing, copying, and deallocating the data behind each void pointer (all the void pointers refer to the same data type for any given call). In short, I have all the parts for making these void* look like objects, but they are not actually objects.

In my function, I would like to use some libraries for std::set and std::map with this data (also some other libraries in our own code base, but set and map are good starting points). The void* need to be treated like value objects -- i.e., when I do mySet.insert(x), that should allocate a new pointer (I have a way to query for the size of the pointer) and then call the copy function to copy the contents of x into the set.

TL;DR: Anyone know a way to write a custom allocator that will work with std::set for a type whose copy/dealloc instructions are not embodied in a copy constructor?

---------- end of question (remainder is stuff I've already tried) ---------

Obviously, I could package the four things together into a class:

class BundleStuffTogether {
    BundleStuffTogether(void *data, CompareFunc compare, CopyFunc copy, DeallocFunc dealloc);
    // And create the rest of class accordingly, storing the values, 
    // and with destructor calling dealloc, copy constructor calling
    // copy, etc.
};

I would like to avoid allocating that class if I can. I don't want the memory overhead of every entry in the set needing 4x size (a lot of the pointers are to small amounts of data, so the relative size is large).

I'm looking into just using std::set and filling in those two blanks creatively. I can pass the comparison function pointer as the comparison object for the set and map. That's easy. The harder part is the alloc, dealloc, and copy.
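
The "easy" half described above can be sketched as follows. This is a minimal sketch, not the asker's actual code: the `CompareFunc` signature, the `VoidPtrLess` adapter, and the `compareInts` and `countDistinct` helpers are all assumptions made for illustration. It only covers ordering; the set stores the raw pointers and does not copy or free what they point to.

```cpp
#include <cassert>
#include <cstddef>
#include <set>

// Hypothetical signature; the question does not show the real CompareFunc.
typedef int (*CompareFunc)(const void*, const void*);

// Stateful comparison object: stores the legacy function pointer and adapts
// its three-way result to the "less than" predicate that std::set expects.
struct VoidPtrLess {
    CompareFunc compare;
    explicit VoidPtrLess(CompareFunc c) : compare(c) {}
    bool operator()(const void* a, const void* b) const {
        return compare(a, b) < 0;
    }
};

// Example legacy-style comparison for int payloads (illustration only).
int compareInts(const void* a, const void* b) {
    int x = *static_cast<const int*>(a);
    int y = *static_cast<const int*>(b);
    return (x > y) - (x < y);
}

// Counts the distinct values among `count` pointers. The set holds the raw
// pointers only, so nothing is copied or deallocated here.
std::size_t countDistinct(void* const* items, std::size_t count, CompareFunc cmp) {
    // Extra parentheses avoid the "most vexing parse".
    std::set<void*, VoidPtrLess> seen((VoidPtrLess(cmp)));
    for (std::size_t i = 0; i < count; ++i) {
        seen.insert(items[i]);
    }
    return seen.size();
}
```

Because the comparator is passed to the set's constructor, this works in C++03 as well as C++11.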

I've been trying to write a custom allocator, and I actually got one working when I called it directly in tests, but when I plugged it into the std::set, things fell apart. I did create a class just to hold onto the void* in order to be able to do template specialization of the allocator (details below).

My first trick was to create this class:

class Wrapper {
     public: void *_ptr;
};
// which allows for this, given "void *y":
Wrapper *wrap = reinterpret_cast<Wrapper*>(&y);

That gives me a specific type I can use for template specialization. Using that, I tried to create a successful type specialization of std::allocator for Wrapper. That worked for Wrapper by itself, but fell apart when I tried to give the custom specialization additional fields to store the functions -- allocators have to compare as equal.

So then I cloned std::allocator and created my own MyAllocator template class -- exactly the same code as std::allocator -- and then created a template specialization for Wrapper. Then I gave BOTH the main template and my specialization the functions needed to manipulate the void*, so now they compare as equal.

And that was successful as an allocator! I tested a number of variations using the allocators directly, and it worked. But it fell apart when I plugged it into std::set. The set doesn't allocate my class directly. It allocates nodes that contain my class... and node assumes that my class has a copy constructor. sigh I thought the contract with the std::set was that it would use the "construct" method to actually construct the Wrapper object in its own node, but that apparently isn't the case.

So now I'm stuck. C++17 reports that it has even deprecated the "construct" and "destroy" methods that I was betting on, so going forward, it looks like there isn't a way to plug in a custom constructor at all.

Can anyone propose a solution other than the BundleStuffTogether solution I had at the start? My next best idea is to core out std::set itself and rewrite its internals, and I really don't want to go down that road if I can avoid it.

srm
  • I believe this is a good question, but there is a lot of text/information here. If you can trim it down a lot, it would be easier to understand and answer – Justin Apr 11 '18 at 20:05
  • "I would like to avoid allocating that class if I can". That's **premature optimization**. I don't buy the argument of the class introducing memory overhead. – Cheers and hth. - Alf Apr 11 '18 at 20:08
  • @Justin I spec'd out the problem at the start. The tail is just covering what I've already tried to avoid rehashing those ideas. – srm Apr 11 '18 at 20:08
  • @srm Yes. My suggestion is to trim down on both sections. It's usually a lot easier for readers to understand what you are saying if you make your questions more concise than what you have here. – Justin Apr 11 '18 at 20:10
  • @Cheersandhth.-Alf It's not premature. Additional allocations have been a recurrent problem with this dataset. And I'd still appreciate an answer to the question regardless. – srm Apr 11 '18 at 20:11
  • I agree with @Justin, as explained somewhere on [ask], you should write your question as if you were asking a busy colleague. And trust me, that busy colleague won't let you finish half of the first quarter of the start of your question. – YSC Apr 11 '18 at 20:55
  • _"The void* need to be treated like value objects -- i.e., when I do mySet.insert(x), that should allocate a new pointer"_: unclear! Do you want to allocate a new pointer, or a new pointee? If the latter, how to initialize the new allocated storage from the pointer? With the copy function, right? – YSC Apr 11 '18 at 20:58
  • Especially when you're working with `void*`, be **very careful** about the difference between a **pointer** and **what it points at**. For example, you certainly do **not** want to "allocate a new pointer" and copy **data** into it. You want to allocate **memory** and copy data into it. – Pete Becker Apr 11 '18 at 21:21
  • @YSC Given void *X, allocate a new block of memory then copy the contents pointed to by X into the new block. – srm Apr 11 '18 at 21:45
  • Sounds like an XY problem. "How do I use an allocator" presumes that the allocator is the problem. – MSalters Apr 12 '18 at 07:11

2 Answers

3

No. There is no part of the allocator concept that allows a "copy" function. That's a non-starter with the STL. Your only real hope is to allocate a wrapper class. However, you can shrink the sizes if you're crafty, by making each instance have a pointer to a pseudo-type.

#include <cassert>

// One shared table of operations per data type, instead of three pointers per element.
struct WrapperType {
    using compare_t = int(void*,void*);
    using copy_t    = void*(void*);
    using free_t    = void(void*);

    compare_t* _compare;
    copy_t*    _copy;
    free_t*    _free;
};
struct Wrapper {
    void*        _data;
    WrapperType* _type;

    explicit Wrapper(void* data, WrapperType* type) noexcept : _data(data), _type(type) {}
    // Moving transfers ownership; the source is left empty so it does not free the data again.
    Wrapper(Wrapper&& other) noexcept : _data(other.release()), _type(other._type) {}
    Wrapper& operator=(Wrapper&& other) noexcept
    {reset(); _data=other.release(); _type=other._type; return *this;}
    ~Wrapper() noexcept {reset();}
    // Frees the owned data, if any; safe to call on an empty (moved-from) wrapper.
    void reset() noexcept {if (_data) _type->_free(_data); _data=nullptr;}
    void* release() noexcept {void* data=_data; _data=nullptr; return data;}
    bool operator<(const Wrapper& other) const noexcept {
        assert(_type==other._type);
        return _type->_compare(_data, other._data)<0;
    }
};

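noexcept is weirdly helpful here, on the move constructor and move assignment operator. When they are noexcept, the C++ library will usually use them. Otherwise, the C++ library will usually prefer the copy versions: std::vector, for example, only moves elements during reallocation if the move constructor is noexcept or no copy constructor is available.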
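
To illustrate the two-pointer idea with value semantics, here is a minimal, self-contained sketch. It is not the code above: `TypeOps`, `Value`, the int helpers, `g_live`, and `countUniqueValues` are hypothetical names, and the copy function is assumed to allocate and return a deep copy (the question does not show the real signatures). Each element stores one data pointer plus one pointer to a shared function table.

```cpp
#include <cassert>
#include <cstddef>
#include <set>
#include <utility>

// Hypothetical per-type function table, shared by every element of one set.
struct TypeOps {
    int   (*compare)(const void*, const void*);
    void* (*copy)(const void*);   // assumed to allocate and return a deep copy
    void  (*destroy)(void*);
};

// Owning handle with value semantics: two pointers per element.
class Value {
public:
    Value(const void* src, const TypeOps* ops) : data_(ops->copy(src)), ops_(ops) {}
    Value(const Value& o) : data_(o.data_ ? o.ops_->copy(o.data_) : nullptr), ops_(o.ops_) {}
    Value(Value&& o) noexcept : data_(o.data_), ops_(o.ops_) { o.data_ = nullptr; }
    Value& operator=(Value o) noexcept { swap(o); return *this; }
    ~Value() { if (data_) ops_->destroy(data_); }
    void swap(Value& o) noexcept { std::swap(data_, o.data_); std::swap(ops_, o.ops_); }
    friend bool operator<(const Value& a, const Value& b) {
        assert(a.ops_ == b.ops_);   // both sides must share one function table
        return a.ops_->compare(a.data_, b.data_) < 0;
    }
private:
    void* data_;
    const TypeOps* ops_;
};

// Illustration-only operations for int payloads, with a live-allocation counter.
int g_live = 0;
int intCompare(const void* a, const void* b) {
    int x = *static_cast<const int*>(a), y = *static_cast<const int*>(b);
    return (x > y) - (x < y);
}
void* intCopy(const void* p) { ++g_live; return new int(*static_cast<const int*>(p)); }
void intDestroy(void* p) { --g_live; delete static_cast<int*>(p); }

// Inserts deep copies of four ints and returns how many distinct values remain.
std::size_t countUniqueValues() {
    static const TypeOps intOps = { intCompare, intCopy, intDestroy };
    int raw[] = { 3, 1, 3, 2 };
    std::set<Value> values;
    for (int& r : raw) values.emplace(&r, &intOps);
    return values.size();
}
```

The set owns its copies: every block allocated by the copy function is released by the destroy function when an element (or a rejected duplicate) is destroyed.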

Mooing Duck
1

First - personally I've avoided custom allocators since they don't feel quite right to me. That's not really a good reason, and you may get an allocator-based answer to your question, but not from me. (See Andrei Alexandrescu's talk on this matter in CppCon 2015: std::allocator is to allocation what std::vector is to vexation).

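Second - if you're worried about memory overhead - you should really not be using std::set, which has lots of memory overhead: typical implementations allocate a separate tree node per element, each carrying several pointers in addition to the value. It's also pretty slow.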

But even ignoring that - you can avoid the overhead you mentioned by not keeping a copy of the functions with every set item instance. After all - they're the same for all objects; so you can do one of the following:

  1. Have static variables for the 3 pointers, and make sure to set them before the pointers are used. This should work if you can ensure you never use such objects with different functions at the same time.
  2. Have each instance hold a reference (= a single pointer) to another class, where that other class holds the 3 functions pointers. You'll need to manage the lifetime of that single common instance of the 3-pointer-holding class, but that shouldn't be too bad.
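
Option 1 can be sketched like this. It is only an illustration under stated assumptions: `LegacyOps`, `StaticValue`, the int helpers, `liveBlocks`, and `countUnique` are hypothetical names, and the copy function is assumed to allocate and return a deep copy. The sketch sticks to C++03 features, and each element is the size of a single pointer.

```cpp
#include <cassert>
#include <cstddef>
#include <set>
#include <utility>

// Hypothetical bundle of the three legacy functions.
struct LegacyOps {
    int   (*compare)(const void*, const void*);
    void* (*copy)(const void*);   // assumed to allocate and return a deep copy
    void  (*destroy)(void*);
};

// Value-like handle whose operations live in one static table, so each
// element stores only the data pointer.
class StaticValue {
public:
    static LegacyOps ops;   // must be filled in before any StaticValue exists

    explicit StaticValue(const void* src) : data_(ops.copy(src)) {}
    StaticValue(const StaticValue& o) : data_(ops.copy(o.data_)) {}
    StaticValue& operator=(StaticValue o) { std::swap(data_, o.data_); return *this; }
    ~StaticValue() { ops.destroy(data_); }
    bool operator<(const StaticValue& o) const {
        return ops.compare(data_, o.data_) < 0;
    }
private:
    void* data_;
};
LegacyOps StaticValue::ops = { 0, 0, 0 };

// Illustration-only operations for int payloads, with a live-allocation counter.
int liveBlocks = 0;
int cmpInt(const void* a, const void* b) {
    int x = *static_cast<const int*>(a), y = *static_cast<const int*>(b);
    return (x > y) - (x < y);
}
void* copyInt(const void* p) { ++liveBlocks; return new int(*static_cast<const int*>(p)); }
void destroyInt(void* p) { --liveBlocks; delete static_cast<int*>(p); }

// Sets the static table, inserts deep copies, and returns the distinct count.
std::size_t countUnique() {
    StaticValue::ops.compare = cmpInt;
    StaticValue::ops.copy    = copyInt;
    StaticValue::ops.destroy = destroyInt;
    int raw[] = { 5, 2, 5, 9 };
    std::set<StaticValue> s;
    for (std::size_t i = 0; i < 4; ++i) s.insert(StaticValue(&raw[i]));
    return s.size();
}
```

The trade-off is the one stated in option 1: because the table is static, two sets holding different data types cannot be alive at the same time unless the class is duplicated (for example, by turning it into a template with a tag parameter).
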
einpoklum