
If I make two allocations using an example layout with size 256 and alignment 1024, I would expect the second allocation to land at the first multiple of 1024 after the first (i.e. ptr2 - ptr1 == 1024). Instead I am finding a gap of twice that size, 2048 bytes, between the two allocations.

use std::alloc::alloc;
use std::alloc::Layout;

fn main() {
    unsafe {
        let size = 256;
        let alignment = 1024;
        let large_layout = Layout::from_size_align_unchecked(size, alignment);
        let ptr1 = alloc(large_layout) as usize;
        let ptr2 = alloc(large_layout) as usize;

        // I would expect this to print 1024, but it prints 2048...
        println!("Difference1: {}", ptr2 - ptr1);
    }
}

As I understand it, alignment makes it so that allocations only occur at multiples of the alignment, which does seem to be true, but something else also seems to be going on. I know that an allocation may also need a word of space to store its size, which could explain in some cases why the gap is larger than expected. However, with size = 256 and alignment = 1024, there should be plenty of space between allocations for them to be placed back to back. Here are some results from my experimentation with the gap between the pointers for different sizes and alignments. I'm confused by the cases where, instead of rounding up to the nearest alignment, the gap is double what I expect.

| size | alignment | gap  |
| ---- | --------- | ---- |
| 4    | 32        | 32   |
| 8    | 32        | 32   |
| 16   | 32        | 64   | ???
| 32   | 32        | 64   | 
| ---- | --------- | ---- |
| 4    | 64        | 64   |
| 8    | 64        | 64   |
| 16   | 64        | 64   |
| 32   | 64        | 128  | ???
| 64   | 64        | 128  | 
| ---- | --------- | ---- |
| 256  | 1024      | 2048 | ???
| 512  | 1024      | 2048 | ???
| 1024 | 1024      | 2048 | 
| ---- | --------- | ---- |
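For reference, measurements like the table above can be reproduced with a small loop. Note that the specific gaps are allocator- and platform-dependent; the only thing actually guaranteed is that each pointer obeys the requested alignment. A sketch:

```rust
use std::alloc::{alloc, dealloc, Layout};

fn main() {
    // The (size, alignment) pairs from the table above.
    let cases = [(4, 32), (8, 32), (16, 32), (32, 32), (256, 1024), (1024, 1024)];

    for (size, alignment) in cases {
        // from_size_align validates the layout instead of invoking UB.
        let layout = Layout::from_size_align(size, alignment).unwrap();
        unsafe {
            let ptr1 = alloc(layout);
            let ptr2 = alloc(layout);
            assert!(!ptr1.is_null() && !ptr2.is_null());

            // Only the alignment is guaranteed; the gap is allocator-specific.
            assert_eq!(ptr1 as usize % alignment, 0);
            assert_eq!(ptr2 as usize % alignment, 0);

            let gap = (ptr2 as usize).abs_diff(ptr1 as usize);
            println!("size {size:>4}, align {alignment:>4}: gap {gap}");

            dealloc(ptr2, layout);
            dealloc(ptr1, layout);
        }
    }
}
```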
  • That's an extraordinarily large alignment value, isn't it? Why are your size values always <= alignment? In practice it should be the other way around. – tadman Jul 17 '23 at 20:18
  • yeah, you are probably right haha, I am still learning a lot about memory management, and this quite a niche problem. But, I am trying to make a memory manager for an interpreter that allocates blocks of 4kb at a time. While the interpreter is parsing, it runs the std::alloc quite a bit to create the AST, and to help generate bytecode. It then fully switches over to the custom allocator to run the program. Without the large 4kb alignment for blocks, I'm worried that the small allocs made at the start of the program could cause fragmentation, with the worst case possible being a hole of 3.99kb. – Niland Schumacher Jul 17 '23 at 20:34
  • I think your unusual alignment requirements might cause fragmentation. If you're allocating a lot of the same kind of thing, consider allocating them in larger chunks, then using a pool for sub-allocations, or even just cramming this all into a `Vec`. You may not even need to hit up the allocator directly. – tadman Jul 17 '23 at 23:10

1 Answer


Interface vs Implementation

First of all, std::alloc::alloc is an interface, not an implementation. Depending on your platform and the flags passed to the standard library, you may end up with the system allocator (which differs between Windows and Unix), the musl allocator, etc.

std::alloc::alloc is just a thin abstraction layer over whichever memory allocator the standard library uses. It provides a uniform interface, but not necessarily uniform behavior: the only behavioral guarantees provided are (simplifying) that if you do get a pointer, it will obey the requested layout, and the slice of memory returned will not overlap with any other allocation for its entire lifetime... and that's about it.
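One way to see that std::alloc::alloc is only a façade: the implementation behind it can be swapped with the #[global_allocator] attribute, and the exact same calls then go through the allocator of your choice. A minimal sketch pinning it to the system allocator (already the default for Rust binaries, but any custom allocator could be substituted here):

```rust
use std::alloc::{alloc, dealloc, Layout, System};

// Explicitly route all allocations through the platform's system
// allocator. A custom allocator implementing GlobalAlloc could be
// registered here instead, and alloc() below would use it.
#[global_allocator]
static GLOBAL: System = System;

fn main() {
    let layout = Layout::from_size_align(256, 1024).unwrap();
    unsafe {
        // Same interface as before; the behavior (gaps included) is
        // whatever the registered allocator decides.
        let ptr = alloc(layout);
        assert!(!ptr.is_null());
        assert_eq!(ptr as usize % 1024, 0);
        dealloc(ptr, layout);
    }
}
```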

Size vs Alignment

For efficiency's sake, memory allocators tend to operate with slabs, especially for small sizes. In short, they pick one area of memory and slice it into blocks of equal size. That is:

  • Slab A: blocks of up to 8 bytes.
  • Slab B: blocks of up to 12 bytes.
  • Slab C: blocks of up to 16 bytes.
  • ...

This means that when they receive a request requiring an alignment of 2^n bytes, they'll pick a slab with a block size of at least 2^n bytes, as that's the easiest way to fulfill the request.
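Layout itself exposes this rounding: pad_to_align rounds the size up to the nearest multiple of the alignment, which is exactly the minimum "effective size" a slab-style allocator could end up reserving. For instance:

```rust
use std::alloc::Layout;

fn main() {
    let layout = Layout::from_size_align(256, 1024).unwrap();
    // The requested size is 256, but an allocator is free to treat it
    // as if it were the padded size.
    assert_eq!(layout.size(), 256);
    assert_eq!(layout.pad_to_align().size(), 1024);

    // A size that is already a multiple of the alignment is unchanged.
    let exact = Layout::from_size_align(2048, 1024).unwrap();
    assert_eq!(exact.pad_to_align().size(), 2048);
}
```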

This is actually hinted at in Layout::from_size_align:

Constructs a Layout from a given size and align, or returns LayoutError if any of the following conditions are not met:

  • align must not be zero,
  • align must be a power of two,
  • size, when rounded up to the nearest multiple of align, must not overflow isize (i.e., the rounded value must be less than or equal to isize::MAX).

Note the last line, talking about size being rounded up to the nearest multiple of align.

It's not guaranteed that rounding will occur, but the implementation may wish to round up, and therefore the guarantee in Layout ensures that it can do so soundly.
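Those three conditions are easy to check against Layout::from_size_align directly; each violation yields a LayoutError rather than an invalid layout:

```rust
use std::alloc::Layout;

fn main() {
    // align must not be zero.
    assert!(Layout::from_size_align(8, 0).is_err());

    // align must be a power of two.
    assert!(Layout::from_size_align(8, 3).is_err());

    // size, rounded up to a multiple of align, must not overflow isize.
    assert!(Layout::from_size_align(isize::MAX as usize, 4096).is_err());

    // A layout satisfying all three conditions succeeds.
    assert!(Layout::from_size_align(256, 1024).is_ok());
}
```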

Small Gaps

To be sure, you'd need to identify which underlying implementation you are using -- likely your system allocator, which depends on the platform you're on -- and dive into its source code.

It's possible that a header is prepended to the allocation by whichever memory allocator you are using, where it stores its own metadata, which would require "padding" the allocated size, and would sometimes result in "bumping" an allocation to the next allocation class.

Or you may see a safety feature at play, where canaries are prepended/appended to catch stray memory writes a posteriori.

Or...

Big Gaps

Slabs are not typically used for large sizes. Instead, at some point, the allocator will typically switch to using round numbers of OS pages. On x86, the typical OS page is 4KB, so when getting closer to 4KB you'll see a switch of behavior: goodbye fine-grained slabs, hello coarse-grained pages.

It's quite possible that the particular allocator you are using starts switching at 1KB already; in particular, it's quite possible that for 1KB it doesn't attempt to fit the header within the allocation itself (even when you ask for a lower size).

But Gaps!

Yes, gaps.

As with any piece of software, memory allocators have trade-offs. In general, on modern systems, they'll tend to favor quick allocation/deallocation over tight memory usage. And that means they won't be optimized to fit as many allocations in as tight a space as possible; not at the detriment of speed anyway.

Because, let's face it, users care most about speed.

Matthieu M.