I have a very large number of arrays (the count is fixed at runtime, around 10 - 30 million). Each array has between 0 and 128 elements, and each element is 6 bytes.
I need to store all the arrays in mmap'ed memory (so I can't use malloc), and the arrays need to be able to grow dynamically up to 128 elements; arrays never shrink.
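For context, this is a minimal sketch of the setup I mean (names like `pool` and `pool_init` are just for illustration, not my actual code):

```c
/* Sketch only: one big anonymous mmap'ed region holds every 6-byte
 * element back to back; individual arrays live somewhere inside it. */
#include <stdint.h>
#include <stddef.h>
#include <sys/mman.h>

typedef struct { uint8_t b[6]; } elem_t;   /* one 6-byte element */

static elem_t *pool;                        /* backing store for all arrays */

static int pool_init(size_t max_blocks)
{
    pool = mmap(NULL, max_blocks * sizeof(elem_t),
                PROT_READ | PROT_WRITE,
                MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    return pool == MAP_FAILED ? -1 : 0;
}
```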
I implemented a naive approach: an int array with one entry per 6-byte block of the mmap'ed memory. A value of 0xffffffff at an offset means the corresponding block is free; any other value is the id of the array occupying that block (the id is needed for defragmentation in my current implementation, since a block can't be moved without knowing its array's id so that other data structures can be updated). On allocation, and when an array outgrows its current allocation, it simply iterates until it finds enough consecutive free blocks and inserts the array at the corresponding offset.
This is roughly what the allocation array and the mmap'ed memory look like:
| 0xffffffff | 0xffffffff |     1234     |     1234     | 0xffffffff | ...
--------------------------------------------------------------------------
|    free    |    free    | array1234[0] | array1234[1] |    free    | ...
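Simplified, the allocation/growth scan boils down to something like this (names are only illustrative):

```c
/* state[i] == FREE_MARK means block i of the pool is free; otherwise it
 * holds the id of the array occupying that block. */
#include <stdint.h>
#include <stddef.h>

#define FREE_MARK 0xffffffffu

extern uint32_t *state;      /* one entry per 6-byte block in the pool */
extern size_t    nblocks;    /* number of blocks covered by state[]    */

/* Find `count` consecutive free blocks and claim them for `array_id`.
 * Returns the first block index, or (size_t)-1 if no run was found.   */
static size_t alloc_blocks(uint32_t array_id, size_t count)
{
    size_t run = 0;
    for (size_t i = 0; i < nblocks; i++) {
        run = (state[i] == FREE_MARK) ? run + 1 : 0;
        if (run == count) {
            size_t first = i - count + 1;
            for (size_t j = first; j <= i; j++)
                state[j] = array_id;   /* record the owner for later defrag */
            return first;
        }
    }
    return (size_t)-1;                 /* pool exhausted or too fragmented  */
}
```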
This approach, though, has a memory overhead of 4 bytes (one int) for every 6-byte block up to the furthest used block in the mmap'ed memory. In the worst case, with all ~30 million arrays at their 128-element cap, that is roughly 15 GB of bookkeeping for about 23 GB of data.
What better approaches are there for this specific case?
My ideal requirements for this are:
- Memory overhead (any allocation tables + unused space) <= 1.5 bits per element + 4*6 bytes per array
- O(1) allocation and growing of arrays
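To put the first requirement in absolute numbers, here is a rough budget calculation; the 64 average elements per array is an assumed figure for illustration only:

```c
/* Rough overhead budget implied by the requirement above (assumed parameters). */
#include <stdio.h>

int main(void)
{
    double arrays    = 30e6;  /* upper end of the array count       */
    double avg_elems = 64.0;  /* assumed average elements per array */

    /* 1.5 bits per element + 4*6 bytes per array */
    double per_array = avg_elems * 1.5 / 8.0 + 4 * 6;

    printf("overhead budget: %.1f bytes per array, ~%.0f MiB total\n",
           per_array, arrays * per_array / (1024.0 * 1024.0));
    return 0;
}
```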