I'm trying to write my own malloc and free implementation for the sake of learning, using only mmap and munmap (since brk and sbrk are obsolete). I've read a fair amount of documentation on the subject, but every example I see either uses sbrk or doesn't explain very well how to handle large zones of mapped memory.

Here is what I'm trying to write: I first map a big zone (e.g. 512 pages); this zone will contain all allocations between 1 and 992 bytes, in 16-byte increments. I'll do the same later with a 4096-page zone for bigger allocations (or mmap directly if the requested size is bigger than a page). So I need a way to store information about every chunk that I allocate or free.

My question is: how do I handle this information properly?

My concerns are: if I create a linked list, how do I allocate more space for each node? Or do I need to store the nodes inside the mapped zone? If so, how can I juggle between data space and reserved space? Or is it better to use a statically sized array? The problem with that is that my zone's size depends on the page size.

Willy
  • I wrote [a simple one using `mmap` (no zones) a short while ago](https://github.com/boazsegev/facil.io/blob/87dd5e918ee9edec82f8ffc13bf72940583253e2/src/bscrypt/bscrypt/unused/mempool.c). For memory zone chunks that allocate a set size (say, all 16 byte sized allocations, or all 32 byte size allocations), a bitmap might perform better than a linked list when you consider memory overhead (a linked list might take 16 bytes for every allocation, just for maintaining the 16 byte memory alignment needed for some SSE operations).... I used 8 byte alignment for 64 bit machines (ignoring SSE). – Myst Dec 12 '16 at 15:48
  • This all depends greatly on the particular problem your custom allocator is designed to solve for your application, and is by and large a data-structure issue orthogonal to the specific use of `mmap` as the base allocator unless you start playing virtual memory tricks. Can you elaborate a little on what your allocation patterns look like, and is physical memory, address space or time scarce? Do you need hard time/space/fragmentation guarantees? Is compaction viable? Anyway, it is customary to store metadata internally as block headers. – doynax Dec 14 '16 at 17:09

1 Answer

There are several possible implementations for a mmap-based malloc:

Sequential (first-fit, best-fit).

Idea: Use a linked list with the last chunk sized to the remaining size of your page.

struct chunk
{
   size_t size;         /* size of this chunk */
   struct chunk *next;  /* next chunk in the list, or NULL */
   int is_free;         /* 1 if the chunk is available */
};
  1. To allocate
    1. Walk the list looking for a suitable free chunk (this can be optimised).
    2. If nothing is found, resize the last chunk to the required size and create a new free chunk from the remaining space.
    3. If you reach the end of the page (the remaining size is too small and next is NULL), simply mmap a new page (this can be optimised: map a custom-sized region if the request is unusually large).
  2. To free, even simpler: set is_free to 1. Optionally, check whether the next chunk is also free and merge the two into one bigger free chunk (watch out for page borders).
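
The steps above can be sketched roughly as follows. This is only an illustration under a few assumptions: a single zone of 512 pages of 4 KiB, 16-byte alignment, and the hypothetical names `ff_malloc`, `ff_free`, `ZONE_SIZE` and `HDR`, none of which come from the answer. The header lives inside the mapped zone, directly in front of the data it describes, so no separate allocation is needed for list nodes.

```c
#define _DEFAULT_SOURCE /* exposes MAP_ANONYMOUS on glibc in strict modes */
#include <assert.h>
#include <stddef.h>
#include <sys/mman.h>

#define ZONE_SIZE (512 * 4096UL)           /* assumed: 512 pages of 4 KiB */
#define ALIGN16(n) (((n) + 15UL) & ~15UL)  /* round up to a multiple of 16 */

struct chunk {
    size_t size;         /* usable bytes that follow this header */
    struct chunk *next;  /* next chunk in the zone, or NULL */
    int is_free;
};

#define HDR ALIGN16(sizeof(struct chunk))  /* header size, padded to 16 */

static struct chunk *zone_head;            /* one zone only, for brevity */

/* Map a fresh zone and describe it as one big free chunk. */
static struct chunk *zone_new(void)
{
    void *p = mmap(NULL, ZONE_SIZE, PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (p == MAP_FAILED)
        return NULL;
    struct chunk *c = p;
    c->size = ZONE_SIZE - HDR;
    c->next = NULL;
    c->is_free = 1;
    return c;
}

void *ff_malloc(size_t size)
{
    if (size == 0 || size > ZONE_SIZE - HDR)
        return NULL;
    size = ALIGN16(size);
    if (!zone_head && !(zone_head = zone_new()))
        return NULL;
    for (struct chunk *c = zone_head; c; c = c->next) {
        if (!c->is_free || c->size < size)
            continue;
        /* Split only if the remainder can hold a header plus 16 bytes. */
        if (c->size >= size + HDR + 16) {
            struct chunk *rest = (struct chunk *)((char *)c + HDR + size);
            rest->size = c->size - size - HDR;
            rest->next = c->next;
            rest->is_free = 1;
            c->size = size;
            c->next = rest;
        }
        c->is_free = 0;
        return (char *)c + HDR;
    }
    return NULL; /* zone exhausted; a fuller version would map another zone */
}

void ff_free(void *ptr)
{
    if (!ptr)
        return;
    struct chunk *c = (struct chunk *)((char *)ptr - HDR);
    assert(!c->is_free); /* catches an obvious double free in debug builds */
    c->is_free = 1;
    /* Merge with the free chunks that directly follow in the same zone. */
    while (c->next && c->next->is_free) {
        c->size += HDR + c->next->size;
        c->next = c->next->next;
    }
}
```

Because there is only one contiguous zone here, merging never crosses a mapping boundary; with several zones the merge loop would have to stop at the end of each one.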

Pros: Easy to implement, trivial to understand, simple to tweak.

Cons: Not very efficient (you may have to walk the whole list to find a block), needs a lot of optimisation, messy memory layout.

Binary buddies (I love binary arithmetics and recursion)

Idea: Use powers-of-2 as size units:

struct chunk
{
   size_t size;   /* size of this block, a power of two */
   int is_free;   /* 1 if the block is available */
};

The structure here does not need a next pointer, as you'll see.

The principle is the following:

  • You have a 4096-byte page, which leaves 4080 usable bytes after 16 bytes of metadata.
  • To allocate a small block, split the page into two 2048-byte chunks, then split the first half into two 1024-byte chunks, and so on until you reach the smallest block that fits (the minimum is 32 bytes, 16 of them usable).
  • Every block, if it isn't a full page, has a buddy.
  • You end up with a tree-like structure.
  • To find your buddy, XOR the block's offset within the page with the block size (the raw address works too, since mmap returns page-aligned memory).
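
As a small illustration of the XOR trick, here is a sketch that computes a buddy's address from an offset relative to the start of the page. The function name `buddy_of` and its parameters are hypothetical, chosen only for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Return the buddy of `block`. `base` is the start of the page that
 * contains the block, and `block_size` must be a power of two. */
void *buddy_of(void *base, void *block, size_t block_size)
{
    uintptr_t off = (uintptr_t)block - (uintptr_t)base;
    assert((block_size & (block_size - 1)) == 0); /* power of two */
    assert(off % block_size == 0);                /* block is size-aligned */
    /* Flipping the bit equal to the block size toggles between the two
     * halves of the parent block. */
    return (char *)base + (off ^ block_size);
}
```

For a 32-byte block at offset 64, the buddy sits at offset 96, and the reverse lookup gives 64 again.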

Implementation:

  1. Allocating a block of size Size
    1. Compute the required Block_size: the smallest 2^k >= Size + sizeof(chunk)
    2. Find the smallest free block in the tree that fits Block_size
    3. If it can get smaller, split it, recursively.
  2. Freeing a block
    1. Set is_free to 1
    2. Check whether your buddy is free (XOR with the size; don't forget to verify that it is the same size as you)
    3. If it is, merge the two: the lower-addressed block of the pair doubles its size. Recurse.
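
A minimal sketch of these two procedures over a single 4096-byte page might look like this. It uses loops instead of recursion, and the names `buddy_init`, `buddy_alloc`, `buddy_free`, `buddy_block_size`, `BUDDY_PAGE` and `MIN_BLOCK` are assumptions made for the example rather than anything from the answer. The free-block search simply walks the headers from left to right, which is the simplest option, not the fastest.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define BUDDY_PAGE 4096u  /* assumed page size */
#define MIN_BLOCK  32u    /* smallest block, header included */

struct chunk { size_t size; int is_free; };

/* Smallest power-of-two block that holds the request plus its header,
 * or 0 if it cannot fit in one page. */
size_t buddy_block_size(size_t request)
{
    if (request > BUDDY_PAGE - sizeof(struct chunk))
        return 0;
    size_t block = MIN_BLOCK;
    while (block < request + sizeof(struct chunk))
        block <<= 1;
    return block;
}

void buddy_init(void *page)
{
    struct chunk *c = page;
    c->size = BUDDY_PAGE;
    c->is_free = 1;
}

void *buddy_alloc(void *page, size_t request)
{
    size_t want = buddy_block_size(request);
    struct chunk *best = NULL;
    if (want == 0)
        return NULL;
    /* Each header's size gives the offset of the next block. */
    for (size_t off = 0; off < BUDDY_PAGE; ) {
        struct chunk *c = (struct chunk *)((char *)page + off);
        if (c->is_free && c->size >= want && (!best || c->size < best->size))
            best = c;
        off += c->size;
    }
    if (!best)
        return NULL;
    while (best->size > want) {  /* split: upper half becomes a free buddy */
        best->size >>= 1;
        struct chunk *half = (struct chunk *)((char *)best + best->size);
        half->size = best->size;
        half->is_free = 1;
    }
    best->is_free = 0;
    return best + 1;             /* user data starts after the header */
}

void buddy_free(void *page, void *ptr)
{
    struct chunk *c = (struct chunk *)ptr - 1;
    c->is_free = 1;
    while (c->size < BUDDY_PAGE) {
        uintptr_t off = (uintptr_t)c - (uintptr_t)page;
        struct chunk *b = (struct chunk *)((char *)page + (off ^ c->size));
        if (!b->is_free || b->size != c->size)
            break;               /* buddy is in use or split further */
        if (b < c)
            c = b;               /* the lower-addressed block survives */
        c->size <<= 1;
    }
}
```

Given a pointer passed to `buddy_free`, the header is recovered by stepping back `sizeof(struct chunk)` bytes, which is why each block loses that much usable space.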

Pros: Extremely fast and memory-efficient, clean.

Cons: Complicated, with a few tricky cases (page borders and buddy sizes). You also need to keep a list of your pages.

Buckets (I have a lot of time to lose)

This is the only method of the three that I have not tried to implement myself, so I can only speak about the theory:

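struct bucket
{
  size_t buck_num;  /* number of data segments */
  size_t buck_size; /* size of a data segment */
  void *page;       /* start of the page holding the segments */
  void *freeinfo;   /* bitset recording which segments are free */
};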
  • From the start you have a few small pages, each split into blocks of a constant size (one page of 8-byte blocks, one of 16-byte blocks, one of 32-byte blocks and so on).
  • The free/used information for those data buckets is stored in bitsets (one bit per block), either at the start of each page or in a separate memory zone.

For example, a 4096-byte page split into 512-byte blocks holds 8 blocks, so its bitset is 8 bits wide. Supposing *freeinfo = 01001000, this would mean the second and fifth blocks are free.
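
To make the bitset idea concrete, here is a small sketch. It assumes that a set bit means "free" and that bits are read from the most significant end, which matches the 01001000 example; the helper names `bucket_alloc` and `bucket_free` are hypothetical, and `freeinfo` is typed as `uint8_t *` here for convenience instead of `void *`.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

struct bucket {
    size_t buck_num;    /* number of blocks in the page */
    size_t buck_size;   /* size of one block */
    void *page;         /* start of the page holding the blocks */
    uint8_t *freeinfo;  /* one bit per block, 1 = free (assumption) */
};

/* Hand out the first free block, or NULL if the bucket is full. */
void *bucket_alloc(struct bucket *b)
{
    for (size_t i = 0; i < b->buck_num; i++) {
        uint8_t mask = (uint8_t)(0x80u >> (i % 8)); /* MSB is block 0 */
        if (b->freeinfo[i / 8] & mask) {
            b->freeinfo[i / 8] &= (uint8_t)~mask;   /* mark as used */
            return (char *)b->page + i * b->buck_size;
        }
    }
    return NULL;
}

/* Mark the block that `ptr` points to as free again. */
void bucket_free(struct bucket *b, void *ptr)
{
    size_t off = (size_t)((char *)ptr - (char *)b->page);
    assert(off % b->buck_size == 0);          /* must be a block start */
    assert(off / b->buck_size < b->buck_num); /* must lie in this page */
    size_t i = off / b->buck_size;
    b->freeinfo[i / 8] |= (uint8_t)(0x80u >> (i % 8));
}
```

No per-block header is needed: the block index, and therefore its bit, follows directly from the pointer's offset inside the page.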

Pros: By far the fastest and cleanest over the long run, and the most efficient for many small allocations.

Cons: Very cumbersome to implement, quite heavy for a small program, and the bitsets need separate memory space.

There are probably other algorithms and implementations, but those three are the most used, so I hope this gives you a lead on what you want to do.

Adalcar
  • About the buddy allocator: how do you get back to a block from the raw pointer passed to `free(void *p)`? Do you have some preambles or postambles, or just some metadata stored directly in the free space itself? – dshil Mar 07 '18 at 05:09
  • @dshil I am not exactly sure what you're asking. As for metadata, it is at the start of each block, which reduces their effective capacity by 16 bytes (depending on the size of size_t). – Adalcar Mar 07 '18 at 09:11
  • it's exactly what I asked. – dshil Mar 09 '18 at 08:51