
I have a contiguous heap of heterogeneous objects. A call to allocate need not preserve the validity of pointers to previously allocated objects, i.e.:

int* p = heap.allocate(1);
int* q = heap.allocate(1); // p need not point to the int it was initially pointing to

Say we have a heap with allocated (a) and unallocated (u) memory, where each a or u slot is sizeof(T) bytes for a heap of objects of type T, like so: uuu a u aa uuuuu

If, using the best-fit allocation strategy, we tried to allocate 4 objects contiguously, we would allocate to the unallocated region of size 5 resulting in the heap: uuu a u aa aaaa u
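
To make the best-fit step concrete, here is a minimal sketch that works on a slot map rather than real memory. The function name best_fit and the string representation ('u' for a free slot, 'a' for an allocated one, no spaces) are assumptions for illustration only:

```cpp
#include <cstddef>
#include <optional>
#include <string>

// Hypothetical helper: `slots` is a map of the heap, 'u' = free, 'a' = allocated.
// Returns the start index of the smallest free run that can hold n slots,
// or std::nullopt if no free run is large enough.
std::optional<std::size_t> best_fit(const std::string& slots, std::size_t n) {
    std::optional<std::size_t> best;
    std::size_t best_len = 0;
    std::size_t i = 0;
    while (i < slots.size()) {
        if (slots[i] != 'u') { ++i; continue; }
        // Measure the free run starting at i.
        std::size_t start = i;
        while (i < slots.size() && slots[i] == 'u') ++i;
        std::size_t len = i - start;
        // Keep the run only if it fits and is tighter than the current best.
        if (len >= n && (!best || len < best_len)) {
            best = start;
            best_len = len;
        }
    }
    return best;
}
```

For the heap above written as "uuuauaauuuuu", best_fit(slots, 4) gives index 7, the start of the five-slot region, which matches the result shown.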

However, in terms of minimising memory fragmentation, this is not ideal. Since the heap need not preserve the validity of the pointers returned by the allocate method, we can move allocated objects within the heap. Taking advantage of this fact, for the heap: uuu a u aa uuuuu, we could move the first allocated object to the beginning of the heap (a uuuu aa uuuuu) and then use the best-fit allocation strategy, resulting in the heap: a aaaa aa uuuuu.

This is much better in terms of minimising memory fragmentation, but for large heaps with many unallocated regions separated by large allocated regions, this form of allocation would be very slow, because every move copies objects. (Perhaps not, if some sort of cost function were defined that forces a fallback to plain best-fit whenever moving objects would take too long?)

What is the best allocation strategy (that need not preserve pointer validity to previously allocated objects), so as to minimise allocation time and memory fragmentation?

edit: I shouldn't be using a heap. I will use slab allocation, because I am not allocating random numbers of objects. I will usually be allocating one object at a time, and when I am not, I will probably be allocating the same number N of objects multiple times.
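
A minimal sketch of the slab idea for the one-object-at-a-time case, assuming fixed-size slots and a free list of slot indices. The names Slab, allocate and deallocate are illustrative and not taken from any library:

```cpp
#include <cstddef>
#include <new>
#include <utility>
#include <vector>

// Illustrative fixed-size slab: every slot holds exactly one T, so there is
// no external fragmentation and allocation is a pop from the free list.
template <typename T>
class Slab {
public:
    explicit Slab(std::size_t capacity) : storage_(capacity), live_(capacity, false) {
        free_.reserve(capacity);
        // Push indices in reverse so that slot 0 is handed out first.
        for (std::size_t i = capacity; i-- > 0;) free_.push_back(i);
    }
    Slab(const Slab&) = delete;
    Slab& operator=(const Slab&) = delete;
    ~Slab() {
        // Destroy whatever is still allocated.
        for (std::size_t i = 0; i < live_.size(); ++i)
            if (live_[i]) storage_[i].value.~T();
    }

    // Constructs a T in a free slot; returns nullptr when the slab is full.
    template <typename... Args>
    T* allocate(Args&&... args) {
        if (free_.empty()) return nullptr;
        std::size_t i = free_.back();
        T* p = new (&storage_[i].value) T(std::forward<Args>(args)...);
        free_.pop_back();
        live_[i] = true;
        return p;
    }

    // p must have come from allocate() on this slab and not been freed yet.
    void deallocate(T* p) {
        auto i = static_cast<std::size_t>(reinterpret_cast<Slot*>(p) - storage_.data());
        p->~T();
        live_[i] = false;
        free_.push_back(i);
    }

private:
    // Raw storage for one T; construction and destruction are done manually.
    union Slot {
        Slot() {}
        ~Slot() {}
        T value;
    };
    std::vector<Slot> storage_;
    std::vector<bool> live_;
    std::vector<std::size_t> free_;
};
```

Pointers stay valid here because nothing is ever moved. Allocating N objects at once would need a second slab whose slot type is an array of N objects, which this sketch does not cover.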

fortytoo
  • Just how heterogeneous are these objects? If there are a lot of allocations of the same size, you could take advantage of that fact by creating a separate slab-allocator dedicated to managing all allocations of exactly that size, which would avoid any possibility of fragmentation for those objects. – Jeremy Friesner Apr 15 '22 at 04:36
  • I'm trying to figure out how this is even *possible*. If `allocate` returned some kind of handle object which itself could be converted into a temporary pointer, I could get it. But since it returns a raw pointer, how do you prevent a user from using `p` after it's moved? How do people who already have access to the pointer get its updated value? This all seems impossible from an implementation standpoint. – Nicol Bolas Apr 15 '22 at 04:41
  • Broadly speaking, if you're doing temporary allocations, you would use a simple arena allocator. You basically allocate a slab of memory, start at one end of it, and move the pointer along for each allocation by the size of that allocation. Deallocation is a no-op; at some point, you just drop all of your allocations on the floor and start the allocator over again. This is typically for applications that have loops like videogames, where you have to do temporary work that only lasts within a loop. – Nicol Bolas Apr 15 '22 at 04:45
  • @NicolBolas I am asking for the best allocation strategy (i.e finding where to allocate something). Dealing with the interface for accessing the allocated objects would be handled by the broader heap class, using weird methods and such. There'd be a lot more data structures required to deal with this all, but that has nothing to do with finding the memory location of where to allocate stuff. Moving allocated objects would call some function that would update the interface for accessing that object, somehow. I'll deal with it when I'm implementing it. – fortytoo Apr 15 '22 at 04:48
  • @spitconsumer: "*I am asking for the best allocation strategy*" There is no "best allocation strategy", *especially* without any idea as to what your allocation and deallocation patterns are. You cannot design this stuff in a vacuum; it must be fit for a specific purpose, with a full understanding of what kinds of things you'll be allocating and when, as well as having a specific performance and/or memory usage goal in mind. There is no "best"; if there was, they'd just make malloc/free do it. – Nicol Bolas Apr 15 '22 at 04:53
  • For example, if you do a lot of allocating of a bunch of smaller objects that are not locked for some period of time ("lock" here means "cannot be moved because someone has a pointer to the real memory"), then it would make sense to make allocation cheap, but also to have a process that periodically rearranges the allocated memory. If you allocate and deallocate a bunch of the same object, a pool allocator might be reasonable. And so forth. You cannot know how to make this stuff faster if you don't know how you're allocating and deallocating stuff. – Nicol Bolas Apr 15 '22 at 04:56
  • @NicolBolas I'm not very good with wording... I'm trying to implement a heap-like data structure for an entity component system, storing component instances for some entities. The component instances will be stored for an indeterminate amount of time, and should be relatively local in relation to each other, hence the contiguousness (that's not a word..) of the heap. Allocation/deallocation can happen at any time, the heap is dynamic, and there exists a compacting garbage collection function that should be called periodically. – fortytoo Apr 15 '22 at 05:00
  • None of that changes my point. There can be no "best" answer because what is "best" depends on the *specific* details of what you're doing. How important is "relatively local"? How "relatively" should they be? How big are these objects? How often are you allocating/deallocating them; when exactly is "any time"? Etc. – Nicol Bolas Apr 15 '22 at 05:11
  • Always allocate objects at the tail of your arena. If at some point the fill ratio of your arena drops down to 50%, compact everything by moving everything to the left. Trying to compute the optimal placement that minimizes the volume of moved data (like you demonstrated in your example) will most likely cost more than global compaction. You can probably do better if you know the sequence of object sizes in advance or at least their distribution. – gudok Apr 15 '22 at 05:18
  • The objects should be small, "relatively local" as in I want to take advantage of cache locality when iterating over allocated objects, and what do you mean by "when exactly is "any time"?"? – fortytoo Apr 15 '22 at 05:20
  • Also I'm voting to close because.. even I don't know what's going on anymore. I'm way too confused. – fortytoo Apr 15 '22 at 05:23
  • Edited because my idea was wrong. Thank you for your advice (AND SORRY FOR BEING DUMB), @NicolBolas – fortytoo Apr 15 '22 at 05:29
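
The tail-allocation approach gudok describes above, combined with the handle idea Nicol Bolas raises, can be sketched roughly as follows. This is only an illustration under stated assumptions: all objects have the same type T, T is cheap to move, handles are never reused, and the names CompactingArena, Handle, get and slots_used are made up for this example:

```cpp
#include <cstddef>
#include <iterator>
#include <utility>
#include <vector>

// Illustrative arena: objects are appended at the tail, and the whole arena
// is compacted to the left once at most half of its slots are live.
// Handles survive compaction; raw pointers and references do not.
template <typename T>
class CompactingArena {
public:
    using Handle = std::size_t;

    Handle allocate(T value) {
        Handle h = slot_of_.size();          // handles are never reused (simplification)
        slot_of_.push_back(items_.size());   // the new object lives at the tail
        items_.push_back(Item{std::move(value), h, true});
        ++live_;
        return h;
    }

    // h must be a live handle; double frees are not detected in this sketch.
    void deallocate(Handle h) {
        items_[slot_of_[h]].live = false;
        --live_;
        if (live_ * 2 <= items_.size()) compact();  // fill ratio at or below 50%
    }

    // Only valid for live handles; the reference is invalidated by compaction.
    T& get(Handle h) { return items_[slot_of_[h]].value; }

    std::size_t slots_used() const { return items_.size(); }

private:
    struct Item {
        T value;
        Handle owner;  // which handle refers to this slot
        bool live;
    };

    // Slide every live object to the left and update its handle's slot index.
    void compact() {
        std::size_t out = 0;
        for (std::size_t in = 0; in < items_.size(); ++in) {
            if (!items_[in].live) continue;
            if (in != out) items_[out] = std::move(items_[in]);
            slot_of_[items_[out].owner] = out;
            ++out;
        }
        items_.erase(std::next(items_.begin(), static_cast<std::ptrdiff_t>(out)),
                     items_.end());
    }

    std::vector<Item> items_;
    std::vector<std::size_t> slot_of_;
    std::size_t live_ = 0;
};
```

Allocation is a push at the tail, and each compaction is a single linear pass, so iteration over live objects stays cache-friendly without computing an optimal placement for every allocation.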

0 Answers