How to handle memory management via mmap properly?

Question

I'm trying to write my own malloc and free implementation for the sake of learning, using just mmap and munmap (since brk and sbrk are obsolete). I've read a fair amount of documentation on the subject, but every example I see either uses sbrk or doesn't explain very well how to handle large zones of mapped memory.

The idea of what I'm trying to write is this: I first map a big zone (e.g. 512 pages); this zone will contain all allocations between 1 and 992 bytes, in 16-byte increments. I'll do the same later with a 4096-page zone for bigger allocations (or mmap directly if the requested size is bigger than a page). So I need a way to store information about every chunk that I allocate or free.

My question is: how do I handle this information properly?

My concerns are: If I create a linked list, how do I allocate more space for each node? Or do I need to copy it to the mapped zone? If so, how can I juggle between data space and reserved space? Or is it better to use a statically sized array? The problem with that is that my zone's size depends on the page size.

I wrote [a simple one using `mmap` (no zones) a short while ago](https://github.com/boazsegev/facil.io/blob/87dd5e918ee9edec82f8ffc13bf72940583253e2/src/bscrypt/bscrypt/unused/mempool.c). For memory zone chunks that allocate a set size (say, all 16 byte sized allocations, or all 32 byte size allocations), a bitmap might perform better than a linked list when you consider memory overhead (a linked list might take 16 bytes for every allocation, just for maintaining the 16 byte memory alignment needed for some SSE operations).... I used 8 byte alignment for 64 bit machines (ignoring SSE). — Myst, Dec 12 '16 at 15:48
This all depends greatly on the particular problem your custom allocator is designed to solve for your application and is by-and-large a data-structure issue orthogonal to the specific use of `mmap` as the base allocator, unless you start playing virtual memory tricks. Can you elaborate a little on what your allocation patterns look like, and is physical memory, address space or time scarce? Do you need hard time/space/fragmentation guarantees? Is compaction viable? Anyway, it is customary to store metadata internally as block headers. — doynax, Dec 14 '16 at 17:09

Adalcar · Accepted Answer · 2017-08-27T16:17:20.903

There are several possible implementations for an mmap-based malloc:

Sequential (first-fit, best-fit).

Idea: Use a linked list with the last chunk sized to the remaining size of your page.

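struct chunk
{
   size_t size;        /* size of the chunk */
   struct chunk *next; /* next chunk in the list, or NULL */
   int is_free;        /* 1 if the chunk is free, 0 if it is in use */
};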

To allocate:
1. Iterate over your list looking for a suitable free chunk (this can be optimized).
2. If nothing is found, resize the last chunk to the required size and create a free chunk from the remaining space.
3. If you reach the end of the page (the last chunk is too small and next is NULL), simply mmap a new page (this can be optimized: generate a custom page if the size is abnormal ...).

To free, it is even simpler: set is_free to 1. Optionally, you can check whether the next chunk is also free and merge both into a bigger free chunk (watch out for page borders).
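
The steps above can be sketched roughly as follows. This is a minimal, single-threaded illustration and not a complete malloc: the names `ff_malloc`, `ff_free`, `new_zone`, `ZONE_SIZE`, `ALIGN` and `HDR` are made up for the example, the page size is assumed to be 4096 bytes (real code would query `sysconf(_SC_PAGESIZE)`), and requests larger than one zone are simply refused.

```c
#define _DEFAULT_SOURCE            /* exposes MAP_ANONYMOUS on glibc */
#include <assert.h>
#include <stddef.h>
#include <sys/mman.h>

#define ZONE_SIZE (512 * 4096UL)   /* assumption: 512 pages of 4096 bytes */
#define ALIGN     16UL

struct chunk {
    size_t size;                   /* usable bytes that follow the header */
    struct chunk *next;
    int is_free;
};

/* Header size rounded up so that user pointers stay 16-byte aligned. */
#define HDR ((sizeof(struct chunk) + ALIGN - 1) & ~(ALIGN - 1))

static struct chunk *head;         /* first chunk of the first zone */

/* Map a fresh zone and describe it as one big free chunk. */
static struct chunk *new_zone(void)
{
    void *p = mmap(NULL, ZONE_SIZE, PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (p == MAP_FAILED)
        return NULL;
    struct chunk *c = p;
    c->size = ZONE_SIZE - HDR;
    c->next = NULL;
    c->is_free = 1;
    return c;
}

void *ff_malloc(size_t size)
{
    if (size == 0 || size > ZONE_SIZE - HDR)
        return NULL;               /* sketch only: no large allocations */
    size = (size + ALIGN - 1) & ~(ALIGN - 1);

    /* Step 1: first fit, walking the list from the start. */
    struct chunk *c = head, *last = NULL;
    while (c && !(c->is_free && c->size >= size)) {
        last = c;
        c = c->next;
    }
    /* Step 3: nothing fits, so map another zone and append it. */
    if (!c) {
        c = new_zone();
        if (!c)
            return NULL;
        if (last)
            last->next = c;
        else
            head = c;
    }
    /* Step 2: split when the remainder can hold a header plus ALIGN bytes. */
    if (c->size >= size + HDR + ALIGN) {
        struct chunk *rest = (struct chunk *)((char *)c + HDR + size);
        rest->size = c->size - size - HDR;
        rest->next = c->next;
        rest->is_free = 1;
        c->size = size;
        c->next = rest;
    }
    c->is_free = 0;
    return (char *)c + HDR;
}

void ff_free(void *p)
{
    if (!p)
        return;
    struct chunk *c = (struct chunk *)((char *)p - HDR);
    c->is_free = 1;
    /* Merge with the next chunk only if it is physically adjacent,
       i.e. in the same zone (the "page border" caveat). */
    struct chunk *n = c->next;
    if (n && n->is_free && (char *)c + HDR + c->size == (char *)n) {
        c->size += HDR + n->size;
        c->next = n->next;
    }
}
```

Because the header sits immediately before the user pointer, `ff_free` can recover it with plain pointer arithmetic, so no separate storage is needed for the list nodes.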

Pros: Easy to implement, trivial to understand, simple to tweak.

Cons: Not very efficient (iterate over your whole list to find a block?), needs lots of optimisation, hectic memory organization.

Binary buddies (I love binary arithmetic and recursion)

Idea: Use powers-of-2 as size units:

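struct chunk
{
   size_t size; /* size of the block, always a power of two */
   int is_free; /* 1 if the block is free, 0 if it is in use */
};

The structure here does not need a next pointer, as you'll see.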

The principle is the following:

You have a 4096-byte page, that is, 4080 usable bytes (16 bytes go to metadata).
To allocate a small block, split the page into two 2048-byte chunks, then split the first half into two 1024-byte chunks, and so on until you get a suitably sized block (the minimum is 32 bytes, 16 of them usable).
Every block, if it isn't a full page, has a buddy.
You end up with a tree-like structure.
To access your buddy, XOR your block's address with its block size. This works because each block of size 2^k starts at a multiple of 2^k within the page, and the page itself is page-aligned.
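
As a small illustration of the XOR trick, here is one way the buddy address could be computed. The helper name `buddy_of` and its `zone_base` parameter are hypothetical; working on the offset from the zone base rather than on the raw pointer is a choice made here so the example does not depend on how the zone is aligned.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Return the buddy of `block`.
   Assumes block_size is a power of two and that `block` lies at an offset
   from zone_base that is a multiple of block_size. */
void *buddy_of(void *zone_base, void *block, size_t block_size)
{
    assert(block_size != 0 && (block_size & (block_size - 1)) == 0);
    uintptr_t offset = (uintptr_t)block - (uintptr_t)zone_base;
    assert(offset % block_size == 0);
    /* Flipping the bit that equals block_size toggles between the
       lower and the upper half of the parent block. */
    return (char *)zone_base + (offset ^ block_size);
}
```

For instance, with 32-byte blocks, the block at offset 64 has its buddy at offset 96, and the block at offset 96 has its buddy back at offset 64.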

Implementation:

Allocating a block of size size:
1. Compute the required block_size: the smallest 2^k >= size + sizeof(struct chunk).
2. Find the smallest free block in the tree that fits block_size.
3. If it can be made smaller, split it, recursively.

Freeing a block:
1. Set is_free to 1.
2. Check whether your buddy is free (XOR with the size; don't forget to verify that it is the same size as you).
3. If it is, merge the two into one block of double the size. Recurse.
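
A compact sketch of both procedures, restricted to a single page, might look like this. The names `buddy_malloc`, `buddy_free`, `PAGE` and `MIN_BLOCK` are invented for the example, the page size is assumed to be 4096 bytes, and the search for a free block is a plain linear walk rather than anything optimized. It uses loops where the steps above say "recurse".

```c
#define _DEFAULT_SOURCE            /* exposes MAP_ANONYMOUS on glibc */
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/mman.h>

#define PAGE      4096UL           /* assumption: one 4096-byte page */
#define MIN_BLOCK 32UL             /* header plus 16 usable bytes on LP64 */

struct chunk {
    size_t size;                   /* whole block, header included; power of two */
    int is_free;
};

static char *page;                 /* this sketch manages exactly one page */

static struct chunk *buddy_of(struct chunk *c)
{
    uintptr_t off = (uintptr_t)((char *)c - page);
    return (struct chunk *)(page + (off ^ c->size));
}

void *buddy_malloc(size_t size)
{
    if (size > PAGE - sizeof(struct chunk))
        return NULL;
    if (!page) {
        void *p = mmap(NULL, PAGE, PROT_READ | PROT_WRITE,
                       MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
        if (p == MAP_FAILED)
            return NULL;
        page = p;
        ((struct chunk *)page)->size = PAGE;
        ((struct chunk *)page)->is_free = 1;
    }
    /* Step 1: smallest power of two that holds the request plus the header. */
    size_t need = MIN_BLOCK;
    while (need < size + sizeof(struct chunk))
        need <<= 1;
    /* Step 2: smallest free block that fits, hopping from header to header. */
    struct chunk *best = NULL;
    for (char *p = page; p < page + PAGE; p += ((struct chunk *)p)->size) {
        struct chunk *c = (struct chunk *)p;
        if (c->is_free && c->size >= need && (!best || c->size < best->size))
            best = c;
    }
    if (!best)
        return NULL;
    /* Step 3: keep halving while the half is still large enough. */
    while (best->size / 2 >= need) {
        best->size /= 2;
        struct chunk *b = buddy_of(best);   /* upper half becomes a free buddy */
        b->size = best->size;
        b->is_free = 1;
    }
    best->is_free = 0;
    return best + 1;               /* user data starts right after the header */
}

void buddy_free(void *ptr)
{
    if (!ptr)
        return;
    struct chunk *c = (struct chunk *)ptr - 1;
    c->is_free = 1;
    while (c->size < PAGE) {
        struct chunk *b = buddy_of(c);
        if (!b->is_free || b->size != c->size)
            break;                 /* buddy is in use or is split further */
        if (b < c)
            c = b;                 /* merged block starts at the lower address */
        c->size *= 2;
    }
}
```

The header stored at the front of each block is what lets `buddy_free` go from the raw pointer back to the block: it steps back by one `struct chunk`.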

Pros: Extremely fast and memory-efficient, clean.

Cons: Complicated, with a few tricky cases (page borders and buddy sizes). You need to keep a list of your pages.

Buckets (I have a lot of time to lose)

This is the only method of the three I have not attempted to implement myself, so I can only speak of the theory:

struct bucket
{
  size_t buck_num;  // number of data segments
  size_t buck_size; // size of a data segment
  void *page;       // start of the page holding the data segments
  void *freeinfo;   // bitset marking which segments are free
};

You have from the start a few small pages, each split into blocks of constant size (one page of 8-byte blocks, one of 16-byte blocks, one of 32-byte blocks and so on).
The "freedom information" of those data buckets is stored in bitsets (structures representing a large set of ints), either at the start of each page or in a separate memory zone.

For example, for 512-byte buckets in a 4096-byte page, the bitset representing the page would be an 8-bit bitset. Supposing *freeinfo = 01001000, this would mean the second and fifth buckets are free.
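
To make the bitset idea concrete, here is a minimal sketch for one page of 512-byte blocks. The type `bucket_page` and the functions `bucket_alloc` and `bucket_free` are hypothetical, and the bit order is an assumption: bit 0 (the least significant bit) stands for the first block, so "second and fifth blocks free" is the value 0x12.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define BLOCK_SIZE 512u
#define NUM_BLOCKS 8u              /* 4096 / 512 */

struct bucket_page {
    uint8_t freeinfo;              /* bit i set means block i is free */
    unsigned char *data;           /* NUM_BLOCKS * BLOCK_SIZE bytes */
};

/* Hand out the first free block, or NULL when the page is full. */
void *bucket_alloc(struct bucket_page *b)
{
    for (unsigned i = 0; i < NUM_BLOCKS; i++) {
        if (b->freeinfo & (1u << i)) {
            b->freeinfo &= (uint8_t)~(1u << i);   /* mark block i as used */
            return b->data + (size_t)i * BLOCK_SIZE;
        }
    }
    return NULL;
}

/* Give a block back: its index follows from its offset in the page. */
void bucket_free(struct bucket_page *b, void *p)
{
    size_t i = (size_t)((unsigned char *)p - b->data) / BLOCK_SIZE;
    assert(i < NUM_BLOCKS);
    b->freeinfo |= (uint8_t)(1u << i);            /* mark block i as free */
}
```

No per-block header is needed here: the block size is fixed for the whole page, so one bit per block is all the bookkeeping there is.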

Pros: By far the fastest and cleanest over the long run, most efficient on many small allocations.

Cons: Very cumbersome to implement, quite heavy for a small program, needs a separate memory space for bitsets.

There are probably other algorithms and implementations, but those three are the most used, so I hope you can get a lead on what you want to do from this.

About the buddy allocator: how will you retrieve a block from the raw pointer in `free(void *p)`? Do you have some preambles or postambles, or just some metadata stored directly in the free space itself? — dshil, Mar 07 '18 at 05:09
@dshil I am not exactly sure of what you're asking, as far as metadata, it is at the start of each block, making their effective capacity decrease by 16 bytes (depending on size_t's size). — Adalcar, Mar 07 '18 at 09:11
