4

I need a container class with an API as close as possible to std::vector (except no reallocation), but whose elements' storage (and not its member variables such as size) can be specified to be allocated from an existing buffer, so that I can have all vectors' held elements in a contiguous buffer. That is, .end() of one vector points to the same location in the buffer as .begin() of the next.

I don't know whether I can simply use a custom allocator with std::vector, because I can't find information on whether that allocates storage for the whole class including the size and pointer data members (in which case I can't use this approach), or just the data elements it holds (in which case I can use it).

I only need an instance's storage to be allocated once, so there's no issue with reallocation. I'm posting here to see if there's already such a container published, rather than reimplementing most of the std vector interface with iterators etc. from scratch.


Update: I unchecked the answer that was posted because it doesn't work in debug mode in Visual C++ 2012. Example with T = float:

template<class T>
inline typename ContigAlloc<T>::pointer ContigAlloc<T>::allocate(std::size_t n)
{
    std::cout << "Alloc " << n << "; type match: " << std::boolalpha << std::is_same<T, float>::value << std::endl;
    // Request n objects of type T, i.e. n * sizeof(T) bytes.
    return reinterpret_cast<T *>(_buff.alloc(n * sizeof(T)));
}

template<class T>
inline void ContigAlloc<T>::deallocate(T *p, std::size_t n) // TODO: noexcept when VC++2013
{
    std::cout << "Deall " << n << "; type match: " << std::boolalpha << std::is_same<T, float>::value << std::endl;
    _buff.dealloc(p, n * sizeof(T));
}

Test:

std::vector<float, ContigAlloc<float>> vec;
vec.push_back(1.1f);
vec.push_back(1.9f);

Result in Release build is fine:

Alloc 1; type match: true
Alloc 2; type match: true
Deall 1; type match: true
Deall 2; type match: true

Result in Debug build is not fine:

Alloc 1; type match: false
Alloc 1; type match: true
Alloc 2; type match: true
Deall 1; type match: true
Deall 2; type match: true
Deall 1; type match: false

In the first call to allocate(), T = _Container_proxy

Display Name
  • 1
    So you want some vector-like interface that gives you a *view* into an existing buffer? – Praetorian May 30 '14 at 19:58
  • 1
    Pretty much -- something not unlike Java's ByteBuffer. However, I need it to have most of the type traits of std::vector, since this container will be used by classes that currently take std::vector and use its iterators etc. – Display Name May 30 '14 at 20:08
  • @DisplayName: [Does this link help](http://stackoverflow.com/questions/1466756/c-equivalent-of-java-bytebuffer)? since you mentioned ByteBuffer. – Guvante May 30 '14 at 20:14
  • That's exactly a reason why `vector` have an `Allocator` template parameter. Go read on allocators: http://en.wikipedia.org/wiki/Allocator_(C++) – polkovnikov.ph May 30 '14 at 20:17
  • C++14 will have [`std::dynarray`](http://en.cppreference.com/w/cpp/container/dynarray), which is for the most part the same as `std::vector`, but must be given a definitive size at construction time, and cannot be resized afterwards. (I am aware that this does not solve your problem *right now*...) – isekaijin May 30 '14 at 20:23
  • 1
    @EduardoLeón, C++14 will not have `std::dynarray`. A future TS on Array Extensions will probably have `std::experimental::dynarray`, but not C++14 (the page you linked to even says so) – Jonathan Wakely May 30 '14 at 21:38
  • @JonathanWakely: I stand corrected. – isekaijin May 30 '14 at 21:39
  • how do you want your vector-like class to react, if size would become bigger than capacity? – MikeMB May 30 '14 at 22:34
  • I don't think allocator is correct for this, since the buffer already has data. @MikeMB: `throw std::bad_alloc();` obviously. – Mooing Duck Jun 04 '14 at 00:06
  • I don't follow. The allocator allocates from a new area of the underlying buffer that doesn't have any data. – Display Name Jun 04 '14 at 01:58
  • What i was aiming at is that there are many functions of std::vector that don't make sense when you don't want to allow dynamic growth of the memory region (e.g. push_back, insert etc.). – MikeMB Jun 06 '14 at 08:09

3 Answers

7

An allocator is used only to allocate storage for the elements. You can use a custom allocator for this purpose.

I stand corrected by Jon in the comments below.

I think one could implement a conforming vector such that it stored everything on the heap except a pointer. The things on the heap would be either 3 pointers, plus the allocator (if the allocator is not optimized away), or 1 pointer, the size, and the capacity (and the possibly optimized away allocator).

In practice, every single implementation of std::vector that has ever shipped in any kind of volume, including:

  • HP
  • SGI
  • libstdc++ (gcc)
  • libc++ (llvm)
  • Dinkumware
  • Microsoft
  • Rogue Wave
  • CodeWarrior
  • STLPort
  • I'm sure I'm forgetting some others...

has placed all of the supporting members within the vector class itself, and used the allocator only for allocating the data. And there seems to be little motivation to do otherwise.

So this is a de facto standard, not an official one. With the history above, it is a pretty safe one.
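As a rough illustration of that de facto behavior, here is a minimal sketch. The `Arena` and `ArenaAlloc` names are hypothetical and do not come from the question; the arena simply hands out consecutive slices of one buffer. It assumes that the implementation uses the allocator only for elements, that `reserve(n)` requests exactly `n` elements, and that every allocation has the same element type so alignment is preserved. As noted in the comments, Visual C++ debug builds also allocate a `_Container_proxy` through a rebound allocator, which would interleave other data with the elements.

```cpp
#include <cassert>
#include <cstddef>
#include <new>
#include <vector>

// Hypothetical bump arena: hands out consecutive slices of one buffer.
class Arena {
public:
    Arena(unsigned char* buf, std::size_t size) : buf_(buf), size_(size), used_(0) {}
    void* alloc(std::size_t bytes) {
        if (bytes > size_ - used_) throw std::bad_alloc();  // arena exhausted
        void* p = buf_ + used_;
        used_ += bytes;
        return p;
    }
    void dealloc(void*, std::size_t) {}  // nothing is reclaimed in this sketch
private:
    unsigned char* buf_;
    std::size_t size_, used_;
};

// Hypothetical minimal allocator that forwards to the arena.
template <class T>
struct ArenaAlloc {
    typedef T value_type;
    explicit ArenaAlloc(Arena& a) : arena(&a) {}
    template <class U>
    ArenaAlloc(const ArenaAlloc<U>& other) : arena(other.arena) {}
    T* allocate(std::size_t n) { return static_cast<T*>(arena->alloc(n * sizeof(T))); }
    void deallocate(T* p, std::size_t n) { arena->dealloc(p, n * sizeof(T)); }
    Arena* arena;
};
template <class T, class U>
bool operator==(const ArenaAlloc<T>& a, const ArenaAlloc<U>& b) { return a.arena == b.arena; }
template <class T, class U>
bool operator!=(const ArenaAlloc<T>& a, const ArenaAlloc<U>& b) { return !(a == b); }

// Returns true when the second vector's elements start where the first's end.
bool storage_is_contiguous() {
    alignas(std::max_align_t) unsigned char storage[64];
    Arena arena(storage, sizeof storage);
    ArenaAlloc<float> alloc(arena);
    std::vector<float, ArenaAlloc<float>> a(alloc), b(alloc);
    a.reserve(3);  // one allocation per vector, as in the question
    b.reserve(2);
    a.assign(3, 1.0f);
    b.assign(2, 2.0f);
    return a.data() + a.size() == b.data();
}
```

Calling `storage_is_contiguous()` from `main` is enough to try it; whether it returns true depends on the assumptions listed above holding for the library in use.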

Note that one could not make the same claim for string, which conceptually has an identical layout. C++11 implementations of string will typically use a "short string" optimization where the allocator is not used at all for "short" strings, but rather the value is embedded within the string class. This optimization is effectively forbidden for vector by 23.2.1 General container requirements [container.requirements.general]/10:

(Unless otherwise specified) no swap() function invalidates any references, pointers, or iterators referring to the elements of the containers being swapped.

Howard Hinnant
  • 1
    I was looking at the standard to find a reference to that, but I couldn't. The best I could do was indirect, e.g. certain constructors of `shared_ptr` specifically state that its allocator will be used to allocate internal state while nothing on `vector` says the same. But is that enough of a guarantee? Can you shed some light please? – Jon May 30 '14 at 20:25
  • Can someone confirm before I accept this as an answer? Like @Jon, I couldn't find that information in the standard either. – Display Name May 30 '14 at 20:50
  • 1
    *"I think one could implement a conforming vector such that it stored everything on the heap except a pointer."* If the value type is a UDT, this requires rebinding the allocator for allocating and constructing the size etc on the free store (23.2.1/7 "A copy of this allocator is used for any memory allocation performed, by these constructors and by all member functions, during the lifetime of each container object or until the allocator is replaced.") The non-rebound allocator will only be used for the elements as far as I can tell. – dyp May 30 '14 at 21:26
  • 1
    @dyp: The container isn't required (but is allowed) to store the non-rebound allocator in general. For example `list` only needs to allocate list nodes, and never T, and so it only needs to keep around the allocator rebound to list nodes. The rebound allocator must be rebound from a copy of the user-supplied allocator though. – Howard Hinnant May 30 '14 at 21:38
  • Hmmm not sure how that relates to my observation. I was thinking about `vector`, which must allocate elements of the `value_type` to fulfil the requirements on contiguity of the storage. Hence, we have somewhere a contiguous block of elements. This block must be allocated via the allocator, and cannot contain the "direct data members" like the size of the vector itself, since if those are on the free store, this would require allocating them through a rebound allocator and hence through a second call to the allocation function. – dyp May 30 '14 at 21:49
  • So, the allocator bound to the `value_type` should be able to guarantee contiguity of the storage of the elements for multiple (distinct) `vector` instances - if it's guaranteed that its allocation function is only used to allocate memory for the `vector`'s elements, that's the guarantee I'm looking for indirectly in my previous comments. – dyp May 30 '14 at 21:51
  • 1
    I agree with Howard, you can use a custom allocator to do this, for all known implementations. Just in case you're worried that the code could be used on a previously unknown implementation that uses the allocator for the vector members, make it throw `bad_alloc` if memory is requested for any type except the desired `value_type` (or, not strictly conforming, but will work with all known implementations, delete the `allocate` and `deallocate` members for specializations of the allocator with a `value_type` that isn't the type you want to store) – Jonathan Wakely May 30 '14 at 22:07
  • 1
    @JonathanWakely and Howard Hinnant I just went through VS2013's debug/checked implementation, and they do use a rebound allocator to store the size and data pointer *additionally* on the heap (`_Alloc_proxy` stores a `_Container_proxy` in a buffer allocated via the rebound allocator). – dyp May 30 '14 at 22:29
  • @dyp: Interesting. Can you tell if it is the same buffer as used for the data? Or is it a separately allocated buffer? – Howard Hinnant May 30 '14 at 23:15
  • @HowardHinnant As I said, it uses a rebound allocator. The allocator can only allocate chunks of the size and of the alignment of the value type, so for most types it *has* to rebind the allocator to get a fitting `allocate` function. (The elements are later allocated via the non-rebound allocator for `vector v(10);`.) However, the VS2013 implementation does only default-construct that rebound allocator, so any state passed to the ctor is lost. OTOH I can't find a requirement that a rebound allocator can be copy/move-constructed from the original allocator. – dyp May 31 '14 at 12:03
  • 1
    @dyp, 17.6.3.5 [allocator.requirements], the `X a(b)` and `X a(move(b))` rows require that. On the other hand, there is no requirement that allocators be default constructible, so that's a bug in the VC++ library. – Jonathan Wakely Jun 01 '14 at 12:05
  • 1
    @JonathanWakely Ah, thank you. And via 23.2.1/7 "A copy of this allocator is used for any memory allocation performed", this should then be a bug in the VS2013 implementation. But fixing that should be simple and still allows what the implementation is trying to do (putting copies of the vector data members on the heap). – dyp Jun 01 '14 at 12:11
  • @jonathan-wakely what is the best way to prevent other types from being allocated? Do I modify the rebind declaration? I don't have just one T/value_type to which to specialize the allocator, if that's what you meant. – Display Name Jun 02 '14 at 23:40
  • I tried something like this: `struct rebind { static_assert(std::is_same<T, U>::value, "Rebinding to other types invalid for this allocator"); typedef ContigAlloc<U> other; };` but this only works in gcc and not MSVC2012. In the latter, it tries to rebind with U = * T. What should I do? – Display Name Jun 03 '14 at 00:22
  • Indeed, with gcc it works even without a rebind declaration. So what do I do for MSVC? Am I safe if I only allow rebinding to * T? Or should I set the rebind to the default allocator if the type is not T, using std::conditional? But even then, the problem still remains as to what to do if the T happens to be the same as one of the class members' types. – Display Name Jun 03 '14 at 00:30
  • For example, this compiles in MSVC2012 if I use the allocator in an std::vector: `struct rebind { typedef typename std::conditional<std::is_same<T, U>::value, ContigAlloc<U>, std::allocator<U>>::type other; };`. However, I'm not sure if it's a good solution, and how to deal with the case of T being the same type as a member of the container class that's not a data element. As mentioned before, I can't just remove the rebind from the allocator as MSVC2012 won't let me compile. – Display Name Jun 03 '14 at 00:53
  • 1
    @DisplayName, I already suggested throwing `bad_alloc`, or not defining the members except for the specializations you want. I didn't say don't define `rebind` (which is required for all allocators) or make it ill-formed via `static_assert` – Jonathan Wakely Jun 03 '14 at 10:45
  • But to throw `bad_alloc` in allocate() I need to know whether allocate() was called with the original type I passed the allocator to the container with, or a rebound version. How can I propagate that information? The set of types I need to use this data structure with is open ended, I can't just have a few is_same() tests with a predetermined list of types I might use this with. I guess there could be a second, boolean template parameter, that is true by default but set to false in the rebind typedef. But this adds a lot of cruft all over. – Display Name Jun 03 '14 at 18:48
  • Also, what about the conditional typedef I wrote in the other post that returns the default allocator in rebind if the type is different? Is there a problem with going that way? – Display Name Jun 03 '14 at 18:53
  • 1
    Also, what is the fix for the VC++ bug mentioned above by dyp? I specify which underlying storage buffer to use as an allocator parameter, and have made the default ctor private, which doesn't compile in VC2012... What's the workaround? – Display Name Jun 03 '14 at 21:48
  • Throwing bad_alloc doesn't work in debug mode, because VC++2012 _always_ calls allocate with T = _Container_proxy. I'm going to have to un-accept this answer because the problem persists, even if it's due to MSVC bug, and there's not been a workaround posted. – Display Name Jun 03 '14 at 23:32
  • By always I mean the first time a new vector is created. – Display Name Jun 03 '14 at 23:42
  • Doesn't the GCC `std::string` store the capacity and size on the heap as well? I'm pretty sure with my GCC that `sizeof(std::string)==sizeof(char*)`. – Mooing Duck Jun 04 '14 at 00:08
  • @Mooing Duck: I thought it was the other way round: small strings are completely allocated on the stack (small string optimization). – MikeMB Jun 04 '14 at 05:57
  • @MooingDuck: gcc is still using a COW string, which is technically not conforming to C++11. – Howard Hinnant Jun 04 '14 at 14:06
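
The suggestion in the comments above (throw `bad_alloc` when memory is requested for any type other than the intended `value_type`) raises the question of how a rebound copy can know the original type. One possible sketch, a variation on the extra template parameter Display Name mentions: a hypothetical `GuardedAlloc` carries an `Owner` type through every rebind. The names are assumptions for illustration, and `std::allocator` stands in for the real buffer. Per the thread, a guard like this throws in Visual C++ debug builds, because the library allocates a `_Container_proxy` through the rebound allocator when a vector is created.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <new>
#include <type_traits>
#include <vector>

// Hypothetical allocator: Owner records the value_type the user originally
// asked for and is carried unchanged through every rebind.
template <class T, class Owner = T>
struct GuardedAlloc {
    typedef T value_type;
    template <class U>
    struct rebind { typedef GuardedAlloc<U, Owner> other; };

    GuardedAlloc() {}
    template <class U>
    GuardedAlloc(const GuardedAlloc<U, Owner>&) {}

    T* allocate(std::size_t n) {
        // Refuse requests made through a rebound copy (T differs from Owner).
        if (!std::is_same<T, Owner>::value) throw std::bad_alloc();
        return std::allocator<T>().allocate(n);  // stand-in for the real buffer
    }
    void deallocate(T* p, std::size_t n) { std::allocator<T>().deallocate(p, n); }
};
template <class T, class U, class O>
bool operator==(const GuardedAlloc<T, O>&, const GuardedAlloc<U, O>&) { return true; }
template <class T, class U, class O>
bool operator!=(const GuardedAlloc<T, O>&, const GuardedAlloc<U, O>&) { return false; }

// A rebound copy (double instead of int) must refuse to allocate.
bool rebound_allocation_throws() {
    GuardedAlloc<int>::rebind<double>::other rebound;
    try {
        rebound.allocate(1);
    } catch (const std::bad_alloc&) {
        return true;
    }
    return false;
}

// The vector's own element allocations go through unchanged.
bool element_allocation_works() {
    std::vector<int, GuardedAlloc<int>> v;
    v.push_back(42);
    return v.size() == 1 && v[0] == 42;
}
```

This does not address the case raised earlier where an internal member happens to have the same type as the elements; the guard cannot tell those requests apart.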
2

If I understand your question correctly, you are using vectors of fixed size. If these sizes and the number of the vectors are compile time constants, I would suggest using std::array.

EDIT: Just to clarify what I mean, here is an example:

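#include <array>
#include <cstddef>
#include <iostream>

struct Memory {
    std::array<int, 2> a1;
    std::array<int, 2> a2;
} memory;

int main() {
    std::array<int, 2>& a1 = memory.a1;
    std::array<int, 2>& a2 = memory.a2;

    a1[0] = 10;
    a1[1] = 11;
    a2[0] = 20;
    a2[1] = 21;

    // Walk all four ints through one pointer. This assumes a2 directly
    // follows a1 with no padding, which the standard does not guarantee.
    int *it = &(a1[0]);

    for (std::size_t i = 0; i < 4; ++i) {
        std::cout << *(it++) << ",";
    }
}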

Output: 10,11,20,21,

Depending on your requirements, you can also implement Memory as a singleton. Of course, whether this matches your current usage pattern is just a guess on my part.
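
If the data is too large for static or stack storage, the same struct could be placed on the heap instead. The following is only a sketch of that idea: `fill_and_sum` is a hypothetical helper name, and the sizes still have to be compile-time constants.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <memory>

struct Memory {
    std::array<int, 2> a1;
    std::array<int, 2> a2;
};

// Fills both arrays inside a heap-allocated Memory and returns their sum.
int fill_and_sum() {
    std::unique_ptr<Memory> memory(new Memory());  // the whole struct lives on the heap
    memory->a1 = {{10, 11}};
    memory->a2 = {{20, 21}};
    int sum = 0;
    for (int x : memory->a1) sum += x;
    for (int x : memory->a2) sum += x;
    return sum;
}
```

Since std::array holds nothing but its elements, the allocation contains only the element data of both arrays (plus any padding the compiler adds).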

MikeMB
  • My data in general is too large to fit on the stack, and I don't know of any way to have std::array allocated on the heap. – Display Name Jun 01 '14 at 03:36
  • 1
    @Display Name: std::array has no state except the data it's holding, so you can simply use `new` to allocate the whole array on the heap. I hope the edit above clarifies what I mean. – MikeMB Jun 01 '14 at 09:57
  • The remaining issue I'd have with `std::array` is that, while I allocate memory once, I don't know the size to allocate until I process a file and determine the size. Thus, I cannot have the arrays as members of the class I need to use this data structure in, as their size is only known once part of the class' constructor has run. My usage pattern with `std::vector` is like this: the vectors are members of class A, and during A's construction, a file is processed, sizes are determined, and `reserve()` is called on the vector members. – Display Name Jun 04 '14 at 01:05
  • @Display Name: Sorry, I got that wrong then. In this case you can't use arrays, as their size has to be a compile-time constant. If Howard's solution really doesn't work with VS, you should probably just write your own class. If you don't have to worry about reallocation, booleans and reverse iterators, this should not be too difficult. – MikeMB Jun 04 '14 at 05:53
0

Well, I got it to work in gcc and Visual C++ 2012, so I'm posting in case anyone else hits this issue. I had to add the following in my allocator class:

template<class U>
struct rebind
{
    typedef typename std::conditional<std::is_same<T, U>::value, ContigAlloc<U>, std::allocator<U>>::type other;
};

template<class U>
inline operator std::allocator<U>(void) const
{
    return std::allocator<U>();
}

For Visual C++ 2012, in Debug builds both the conditional typedef and the conversion operator seem to be needed.

This only works if the default std::allocator is stateless, which I don't think is specified in the standard.
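
To make the effect of the conditional typedef concrete, here is a small self-contained sketch. This `ContigAlloc` is a simplified stand-in, not my real class: it takes memory from `std::allocator` rather than the contiguous buffer, and `push_two` is a hypothetical helper. The `static_assert`s show which allocator each rebind resolves to.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <type_traits>
#include <vector>

// Simplified stand-in for the allocator above; storage comes from
// std::allocator here instead of the contiguous buffer.
template <class T>
struct ContigAlloc {
    typedef T value_type;
    template <class U>
    struct rebind {
        typedef typename std::conditional<std::is_same<T, U>::value,
                                          ContigAlloc<U>, std::allocator<U>>::type other;
    };
    template <class U>
    operator std::allocator<U>() const { return std::allocator<U>(); }
    T* allocate(std::size_t n) { return std::allocator<T>().allocate(n); }
    void deallocate(T* p, std::size_t n) { std::allocator<T>().deallocate(p, n); }
};
template <class T>
bool operator==(const ContigAlloc<T>&, const ContigAlloc<T>&) { return true; }
template <class T>
bool operator!=(const ContigAlloc<T>&, const ContigAlloc<T>&) { return false; }

// Rebinding to the element type keeps the custom allocator...
static_assert(std::is_same<ContigAlloc<float>::rebind<float>::other,
                           ContigAlloc<float>>::value, "same type stays custom");
// ...while any other type falls back to the default allocator.
static_assert(std::is_same<ContigAlloc<float>::rebind<int>::other,
                           std::allocator<int>>::value, "other types use std::allocator");

// Same test as in the question; returns the resulting element count.
std::size_t push_two() {
    std::vector<float, ContigAlloc<float>> vec;
    vec.push_back(1.1f);
    vec.push_back(1.9f);
    return vec.size();
}
```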

Display Name