After rethinking the design, and with some input from paddy, I came up with something like this. It seems fine when I run it, but I wonder about its correctness. The idea is that preallocated objects inherit from the following:
struct Node
{
    void* pool;
};
That way we inject into every allocated object a pointer to its pool, so it can be released later. Then we have:
template<class T, int thesize>
struct MemPool
{
    T* getNext();
    void free(T* ptr);
    struct ThreadLocalMemPool
    {
        T* getNextTL();
        void freeTL(T* ptr);
        int size;
        vector<T*> buffer;
        vector<int> freeList;
        int freeListIdx;
        int bufferIdx;
        ThreadLocalMemPool* nextTLPool; // within a thread's context, a linked list
    };
    int size;
    static thread_local ThreadLocalMemPool* tlPool; // one of these per thread
};
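To illustrate how the injected pool pointer might be used when releasing, here is a minimal sketch. TinyPool, hand_out, take_back, release and Cat's lives member are hypothetical names made up for the example; the real MemPool would do its own bookkeeping instead of pushing into a vector.

```cpp
#include <cassert>
#include <vector>

// Base injected into every pooled object, as in the post.
struct Node
{
    void* pool = nullptr; // owning pool, filled in at allocation time
};

// Hypothetical stand-in for a pool; it only records what was returned to it.
struct TinyPool
{
    std::vector<Node*> returned;

    // Stamp the object with its owner before handing it out.
    template <class T>
    T* hand_out(T* obj)
    {
        obj->pool = this;
        return obj;
    }

    void take_back(Node* n) { returned.push_back(n); }
};

// Hypothetical pooled type.
struct Cat : Node
{
    int lives = 9;
};

// Route an object back to whichever pool stamped it.
inline void release(Node* n)
{
    static_cast<TinyPool*>(n->pool)->take_back(n);
}
```

The caller of release never needs to know which pool (or which thread's pool) the object came from; the stamp carries that information.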
So basically I say MemPool<Cat, 100> and it gives me a mempool which, for every thread that calls getNext on it, will instantiate a thread-local mempool. Sizes are rounded internally to the nearest power of two for easy modulo (which I'll omit for simplicity). Because getNext() is local to each thread, it does not require locking, and I try to use atomics for the freeing part, as follows:
T* ThreadLocalMemPool::getNextTL()
{
    int iHead = ++bufferIdx % size;
    int iTail = freeListIdx % size;
    if (iHead != iTail) // if head reaches tail, the free list is empty
    {
        int& idx = freeList[iHead];
        while (idx == DIRTY) {}
        return buffer[idx];
    }
    else
    {
        bufferIdx--; // we will recheck next time
        if (nextTLPool)
            return nextTLPool->getNextTL();
        else
        {
            // set nextTLPool to a new ThreadLocalMemPool and return getNextTL() from it..
        }
    }
}
void ThreadLocalMemPool::freeTL(T* ptr)
{
    // The outer struct handles calling this on the right ThreadLocalMemPool.
    // We compute the index in the pool from which this pointer came by subtracting
    // the address of the first pointer in this pool's buffer from its address.
    int idx = computeAsInComment(ptr);
    int oldListIdx = atomic_increment_returns_old_value(freeListIdx);
    freeList[oldListIdx % size] = idx;
}
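In standard C++, the atomic_increment_returns_old_value placeholder corresponds to std::atomic's fetch_add, which returns the value held before the increment. Below is a minimal sketch of the claim-then-publish step under some assumptions that are not in the code above: FreeList, push and take are hypothetical names, DIRTY is assumed to be -1, the slots are std::atomic<int> rather than plain int, and the consumer re-marks a slot as DIRTY when it takes the value.

```cpp
#include <atomic>
#include <cassert>
#include <vector>

constexpr int DIRTY = -1; // assumed sentinel: slot claimed but not yet written

struct FreeList
{
    std::vector<std::atomic<int>> slots;
    std::atomic<unsigned> freeListIdx{0};

    explicit FreeList(unsigned size) : slots(size)
    {
        for (auto& s : slots) s.store(DIRTY, std::memory_order_relaxed);
    }

    // Freeing side (any thread): claim a unique position, then publish the index.
    void push(int bufferIndex)
    {
        unsigned old = freeListIdx.fetch_add(1, std::memory_order_relaxed);
        slots[old % slots.size()].store(bufferIndex, std::memory_order_release);
    }

    // Allocating side (owner thread): spin until the slot is published,
    // taking the value and resetting the slot to DIRTY in one atomic step.
    int take(unsigned pos)
    {
        std::atomic<int>& slot = slots[pos % slots.size()];
        int idx;
        while ((idx = slot.exchange(DIRTY, std::memory_order_acquire)) == DIRTY) {}
        return idx;
    }
};
```

The release store paired with the acquire exchange is what lets the allocating thread see the index written by the freeing thread; with a plain int the spin loop on DIRTY has no such guarantee.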
Now, the idea is that freeListIdx will always trail behind bufferIdx in a pool, because you can't (assuming correct usage) free more than you have allocated. Calls to free synchronize the order in which they return buffer indices to the free list, and getNext will pick up on this as it cycles back. I have been thinking about it for a while and don't see anything semantically wrong with the logic. Does it seem sound, or is there something subtle that could break it?
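For reference, the power-of-two rounding omitted above could look roughly like this. It is only a sketch: round_up_pow2 and wrap are hypothetical names, the size is rounded up (rather than to the nearest power) so capacity never drops below the request, and unsigned counters are assumed so the wrap stays well defined when a counter overflows.

```cpp
#include <cassert>
#include <cstddef>

// Round n up to the next power of two (n itself if it already is one; 0 maps to 1).
// Assumes n is no larger than the highest power of two that fits in std::size_t.
inline std::size_t round_up_pow2(std::size_t n)
{
    std::size_t p = 1;
    while (p < n) p <<= 1;
    return p;
}

// With a power-of-two size, counter % size is the same as counter & (size - 1).
inline std::size_t wrap(std::size_t counter, std::size_t pow2Size)
{
    return counter & (pow2Size - 1);
}
```

With this, MemPool<Cat, 100> would end up with 128 slots per thread-local pool, and each index wrap becomes a single bitwise AND.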