How to implement deterministic malloc

Question

Say I have two instances of an application, with the same inputs and same execution sequence. Therefore, one instance is a redundant one and is used for comparing data in memory with the other instance, as a kind of error detection mechanism.

Now, I want all memory allocations and deallocations to happen in exactly the same manner in the two processes. What is the easiest way to achieve that? Write my own malloc and free? And what about memory allocated with other functions such as mmap?

What do you mean, in exactly the same manner? I thought you said they have "the same execution sequence"? — buddhabrot, Dec 07 '11 at 13:09
You may want a 3rd instance to figure out which of the different results is/are correct. — Alexey Frunze, Dec 07 '11 at 13:13
The execution sequence is the same but malloc addresses can differ because malloc is not deterministic. — MetallicPriest, Dec 07 '11 at 13:13
`malloc` is deterministic, as is every computer algorithm. Computers are really bad at generating non-deterministic (random number) values by design. What may not be deterministic is memory returned by anonymous `mmap`. — Sylvain Defresne, Dec 07 '11 at 13:16
@Sylvain: Where did you read that malloc is deterministic? Can you give a reference? — MetallicPriest, Dec 07 '11 at 13:35
Duplicate of http://stackoverflow.com/questions/8171006/is-malloc-deterministic — Basile Starynkevitch, Dec 07 '11 at 13:45
I can't even see how the 'same inputs and same execution sequence' could be started and maintained, never mind any problems with malloc(). The OS will schedule the thread/s of the process pairs at different times and so execution will diverge. One process will generate a page fault and lock a page for writing so that the other process has to wait for a disk revolution. Things like that. Am I the only one who thinks that problems with 'malloc()' are not top of the list for this requirement? — Martin James, Dec 07 '11 at 14:27
@Martin: I'm working with virtual address space, so I'm not bothered about the underlying page faults mechanisms and disk revolutions etc..., so that is completely a non-issue! — MetallicPriest, Dec 07 '11 at 14:54
@MetallicPriest well, good luck. I'm glad I'm not trying this! I'm guessing that the two instances are, in fact, running on separate hardware. — Martin James, Dec 07 '11 at 15:33
@MetallicPriest From the source of glibc implementation of `malloc` (http://sourceware.org/git/?p=glibc.git;a=blob;f=malloc/malloc.c;h=8608083adbe65c530a0d8ac3bbf547d85586b678;hb=HEAD). The algorithm looks deterministic (not random, or reading uninitialized memory). However, it does call `mmap` for large block allocation that is not deterministic. — Sylvain Defresne, Dec 07 '11 at 16:03

score 4 · Accepted Answer · edited Jun 29 '19 at 22:26

I'm wondering what you are trying to achieve. If your process is deterministic, then the pattern of allocation / deallocation should be the same.

The only possible difference is the addresses returned by malloc. But you should probably not depend on them (the easiest way is not to use pointers as keys in a map or other data structure). And even then, the addresses should differ only if the allocation is not done through sbrk (glibc uses anonymous mmap for large allocations), or if you call mmap yourself (since by default the address is selected by the kernel).

If you really want exactly the same addresses, one option is to have a large static buffer and to write a custom allocator that hands out memory from this buffer. This has the disadvantage of forcing you to know beforehand the maximum amount of memory you'll ever need. In a non-PIE executable (gcc -fno-pie -no-pie), a static buffer will have the same address every time. For a PIE executable you can disable the kernel's address space layout randomization for loading programs. If the buffer lives in a shared library, disabling ASLR and running the same program twice should lead to the same choices by the dynamic linker for where to map libraries.
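
As a rough illustration of the static-buffer idea, here is a minimal bump-allocator sketch. The names `det_pool`, `det_alloc` and `det_reset` and the 1 MiB size are assumptions made up for this example; it never frees individual blocks, so it is a starting point rather than a general-purpose malloc replacement.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical names and size: det_pool, det_alloc, det_reset, 1 MiB. */
#define DET_POOL_SIZE (1u << 20)

/* The pool is a static object, so in a non-PIE executable its address
 * is the same on every run. */
static _Alignas(max_align_t) unsigned char det_pool[DET_POOL_SIZE];
static size_t det_used = 0;   /* bytes handed out so far */

/* Round n up to a multiple of the strictest fundamental alignment. */
static size_t det_round_up(size_t n) {
    const size_t a = _Alignof(max_align_t);
    return (n + a - 1) & ~(a - 1);
}

/* Return the next free block, or NULL when the request cannot be met. */
void *det_alloc(size_t size) {
    if (size == 0 || size > DET_POOL_SIZE) return NULL;
    size_t need = det_round_up(size);
    if (need > DET_POOL_SIZE - det_used) return NULL;
    void *p = det_pool + det_used;
    det_used += need;
    return p;
}

/* Release everything at once; per-block free is not supported here. */
void det_reset(void) { det_used = 0; }
```

Because the next address depends only on the sequence of requested sizes, two processes issuing the same requests in the same order receive the same offsets into the pool.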

If you don't know beforehand the maximum amount of memory you will need, or if you don't want to recompile each time this size increases, you can also use mmap to map a large anonymous buffer at a fixed address. Simply pass the size of the buffer and the address to use as parameters to your process, and use the returned memory to implement your own malloc on top of it.

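#include <stdint.h>
#include <stdlib.h>
#include <string.h>
#include <sys/mman.h>

static void* malloc_buffer = NULL;
static size_t malloc_buffer_len = 0;

void* malloc(size_t size) {
    // Use malloc_buffer & malloc_buffer_len to implement your
    // own allocator. If you don't read uninitialized memory,
    // it can be deterministic.
    void* memory = NULL;  // placeholder: carve `size` bytes out of malloc_buffer
    (void)size;
    return memory;
}

int main(int argc, char** argv) {
    size_t buf_size = 0;
    uintptr_t buf_addr = 0;
    // Each option takes a value, so stop while argv[i + 1] still exists.
    for (int i = 1; i + 1 < argc; ++i) {
        if (strcmp(argv[i], "--malloc-size") == 0) {
            buf_size = strtoull(argv[++i], NULL, 0);
        } else if (strcmp(argv[i], "--malloc-addr") == 0) {
            // atoi would truncate a 64-bit address; base 0 also accepts 0x... hex.
            buf_addr = strtoull(argv[++i], NULL, 0);
        }
    }

    // Anonymous mappings need MAP_ANONYMOUS and, portably, fd == -1.
    malloc_buffer = mmap((void*)buf_addr, buf_size, PROT_WRITE|PROT_READ,
                         MAP_FIXED|MAP_PRIVATE|MAP_ANONYMOUS, -1, 0);
    // editor's note: omit MAP_FIXED since you're checking the result anyway
    if (malloc_buffer == MAP_FAILED || malloc_buffer != (void*)buf_addr) {
        // Could not get requested memory block, fail.
        exit(1);
    }

    malloc_buffer_len = buf_size;
}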

By using MAP_FIXED, we are telling the kernel to replace any existing mappings that overlap with this new one at buf_addr.

(Editor's note: MAP_FIXED is probably not what you want. Specifying buf_addr as a hint instead of NULL already requests that address if possible. With MAP_FIXED, mmap will either return an error or the address you gave it. The malloc_buffer != (void*)buf_addr check makes sense for the non-FIXED case, which won't replace an existing mapping of your code or a shared library or anything else. Linux 4.17 introduced MAP_FIXED_NOREPLACE, which you can use to make mmap return an error instead of memory at a wrong address you don't want to use. But still leave the check in so your code works on older kernels.)
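
The editor's note can be sketched roughly as follows. `map_exactly_at` is a hypothetical helper name, not an existing API. It requests MAP_FIXED_NOREPLACE when the headers define it and still compares the returned address, since a kernel that does not know the flag treats the address as a mere hint.

```c
#define _GNU_SOURCE
#include <stddef.h>
#include <sys/mman.h>

/* Hypothetical helper: map `len` anonymous bytes at exactly `addr`
 * without clobbering an existing mapping. Returns NULL on failure. */
void *map_exactly_at(void *addr, size_t len) {
    int flags = MAP_PRIVATE | MAP_ANONYMOUS;
#ifdef MAP_FIXED_NOREPLACE
    flags |= MAP_FIXED_NOREPLACE;   /* Linux 4.17+: fail instead of replacing */
#endif
    void *p = mmap(addr, len, PROT_READ | PROT_WRITE, flags, -1, 0);
    if (p == MAP_FAILED) return NULL;
    if (p != addr) {
        /* The address was only used as a hint and was not honored. */
        munmap(p, len);
        return NULL;
    }
    return p;
}
```

The address passed in should be page-aligned and in a range the process is not otherwise using; which range is safe depends on the platform and is left to the caller here.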

If you use this block to implement your own malloc and don't use other non-deterministic operations in your code, you can have complete control of the pointer values.

This supposes that your usage pattern of malloc / free is deterministic, and that you don't use libraries that are non-deterministic.

However, I think a simpler solution is to keep your algorithms deterministic and not to depend on addresses being deterministic. This is possible. I've worked on a large-scale project where multiple computers had to update state deterministically (so that each program had the same state, while only inputs were transmitted). If you use pointers for nothing other than referencing objects (the most important thing is never to use a pointer value as data: not as a hash, not as a key in a map, ...), then your state will stay deterministic.
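
One way to follow that rule, sketched under the assumption that objects can carry a small integer identifier: hand out IDs in creation order and use the ID, never the address, wherever an ordering or a key is needed. The `struct object` layout and the function names are invented for this example.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical object type: the id is assigned in creation order,
 * so it is identical across runs even when addresses differ. */
struct object {
    uint32_t id;
    int payload;
};

static uint32_t next_id = 0;

void object_init(struct object *o, int payload) {
    o->id = next_id++;
    o->payload = payload;
}

/* Compare two `struct object *` elements by id, not by address. */
static int by_id(const void *a, const void *b) {
    const struct object *const *x = a, *const *y = b;
    return ((*x)->id > (*y)->id) - ((*x)->id < (*y)->id);
}

/* Sort an array of object pointers into an address-independent order. */
void sort_objects(struct object **v, size_t n) {
    qsort(v, n, sizeof *v, by_id);
}
```

The same idea applies to hash tables: hashing `o->id` instead of `(uintptr_t)o` keeps bucket layout and iteration order the same in both processes.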

Unless what you want is to snapshot the whole process memory and do a binary diff to spot divergence. I think that's a bad idea, because how will you know that both processes have reached the same point in their computation? It is much easier to compare the output, or to have each process compute a hash of its state and use that to check that they are in sync, because you can control when this is done (and thus it becomes deterministic too; otherwise your measurement itself is non-deterministic).
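
A state hash along those lines might look like the sketch below. `struct sim_state` and its fields are placeholders, and 64-bit FNV-1a is just one convenient choice of hash, not something the answer prescribes. The hash is computed field by field so that padding bytes and pointer values never enter the digest.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical application state; the fields are placeholders. */
struct sim_state {
    uint32_t tick;
    int32_t  position[2];
    uint8_t  flags;   /* padding follows, so never hash the raw struct */
};

/* 64-bit FNV-1a over a byte range, continuing from the running hash h. */
static uint64_t fnv1a(uint64_t h, const void *data, size_t len) {
    const unsigned char *p = data;
    for (size_t i = 0; i < len; ++i) {
        h ^= p[i];
        h *= 0x100000001b3ULL;             /* FNV-1a 64-bit prime */
    }
    return h;
}

/* Digest of the logical state only: no padding, no addresses. */
uint64_t state_hash(const struct sim_state *s) {
    uint64_t h = 0xcbf29ce484222325ULL;    /* FNV-1a 64-bit offset basis */
    h = fnv1a(h, &s->tick, sizeof s->tick);
    h = fnv1a(h, s->position, sizeof s->position);
    h = fnv1a(h, &s->flags, sizeof s->flags);
    return h;
}
```

Each process would call `state_hash` at agreed points (for example, after every N inputs) and exchange only the 64-bit result.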

"If you really want to have exactly the same address, one option is to have a large static buffer and to write a custom allocator that does use memory from this buffer.". This is exactly what I was doing up till now, but now I want to move to a more general purpose approach. — MetallicPriest, Dec 07 '11 at 13:24
@Sylvain. Apparently MetallicPriest wants to make some checkpointing mechanism. But he is very secretive. — Basile Starynkevitch, Dec 07 '11 at 14:09
It's not just the kernel that decides to ASLR or not: the executable has to support it. If you want static addresses to be the same every time, simply build a non-PIE executable with **`gcc -fno-pie -no-pie`**. (Modern distros have their GCC default to making PIEs.) See [32-bit absolute addresses no longer allowed in x86-64 Linux?](//stackoverflow.com/q/43367427). But if your code is in a shared library, then yes you'd need to disable ASLR with a kernel sysctl or something. I think dynamic linking just uses `mmap`, the kernel itself doesn't "know" it's a library being mapped. — Peter Cordes, Jun 29 '19 at 22:10

score 4 · Answer 2 · edited May 23 '17 at 10:34

It is not only malloc that is non-deterministic but also mmap, because of address space layout randomization on Linux. mmap is the basic system call for obtaining more memory space; since it is a system call rather than a library function, it is elementary (atomic) from the application's point of view, so you cannot rewrite it within the application.

You could disable it with

 echo 0 > /proc/sys/kernel/randomize_va_space

as root, or through sysctl.

If you don't disable address space layout randomization you are stuck.

And you did ask a similar question previously, where I explained that your malloc-s won't always be deterministic.

I still think that for some practical applications, malloc cannot be deterministic. Imagine for instance a program having a hash table keyed by the pids of the child processes it launches. Collisions in that table won't be the same in all your processes, etc.

So I believe you won't succeed in making malloc deterministic in your sense, whatever you'll try (unless you restrict yourself to a very narrow class of applications to checkpoint, so narrow that your software won't be very useful).

edited May 23 '17 at 10:34 by Community · answered Dec 07 '11 at 13:51 by Basile Starynkevitch

So you say it is only non-deterministic due to ASLR and otherwise it is deterministic? Are you sure? Does this apply for both sbrk and mmap? How can you disable ASLR if you don't have administrative rights? – MetallicPriest Dec 07 '11 at 14:14
I don't know about `sbrk` but I believe that yes. AFAIK, *ASLR* requires `root` privilege to be disabled (otherwise that is a huge security hole). – Basile Starynkevitch Dec 07 '11 at 14:43
Ya but somewhere I read setarch x86_64 -R can be used for a session. – MetallicPriest Dec 07 '11 at 14:56
I don't understand how having a hash table indexed by `pid` will make `malloc` non-deterministic. It will make the *process* non-deterministic in its usage of `malloc`, but the function itself is still deterministic. I've worked on a project that required determinism, and if you take care not to introduce non-determinism from outside (as you said, `pid`s from other processes, or using pointers as hashes), and don't use the x87 FPU (but instead use SSE), then your program can be deterministic. – Sylvain Defresne Dec 07 '11 at 16:08

score 4 · Answer 3 · answered Dec 07 '11 at 15:43

Simply put, as others have stated: if the execution of your program's instructions is deterministic, then memory returned by malloc() will be deterministic. That assumes your system's implementation doesn't have some call to random() or something to that effect. If you are unsure, read the code or documentation for your system's malloc.

This is with the possible exception of ASLR, as others have also stated. If you don't have root privileges, you can disable it per-process via the personality(2) syscall and the ADDR_NO_RANDOMIZE parameter. See here for more information on the personalities.
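
A rough sketch of the personality(2) approach, assuming Linux with glibc. The helper names are made up for this example. The new personality only affects address layout chosen at exec time, so the program re-executes itself once via `/proc/self/exe`. Some sandboxes block changing the personality, in which case the helper reports failure.

```c
#include <sys/personality.h>
#include <unistd.h>

/* Returns 1 if ASLR is disabled for this process, 0 if it is enabled,
 * -1 if the personality could not be queried. */
int aslr_disabled(void) {
    int persona = personality(0xffffffff);   /* query without changing */
    if (persona == -1) return -1;
    return (persona & ADDR_NO_RANDOMIZE) != 0;
}

/* Hypothetical helper, meant to be called first thing in main().
 * Returns 0 if ASLR is already off, -1 on failure; on success it does
 * not return because the process image is replaced. */
int reexec_without_aslr(char **argv) {
    int persona = personality(0xffffffff);
    if (persona == -1) return -1;
    if (persona & ADDR_NO_RANDOMIZE) return 0;
    if (personality((unsigned long)persona | ADDR_NO_RANDOMIZE) == -1) return -1;
    execv("/proc/self/exe", argv);            /* Linux-specific path */
    return -1;                                /* execv returns only on error */
}
```

This is essentially what `setarch -R` does from the command line, and like `setarch` it needs no root privileges on a stock kernel.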

Edit: I should also say, if you are unaware: what you're doing is called bisimulation and is a well-studied technique. If you didn't know the terminology, it might help to have that keyword for searching.

score 2 · Answer 4 · answered Dec 07 '11 at 13:09

When writing high-reliability code, the usual practise is to avoid malloc and other dynamic memory allocation. A compromise sometimes used is to do all such allocation only during system initialisation.
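
The compromise mentioned above could be enforced with something like the following sketch. `init_alloc` and `end_of_init` are hypothetical names, and wrapping the system malloc is only one possible design: allocation is allowed during start-up and refused once initialisation has been declared complete.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Set once start-up is finished; allocation is refused afterwards. */
static bool init_done = false;

/* Allocate during initialisation only. Returns NULL after end_of_init(). */
void *init_alloc(size_t size) {
    if (init_done) return NULL;
    return malloc(size);
}

/* Mark the end of the initialisation phase. */
void end_of_init(void) {
    init_done = true;
}
```

Steady-state code then works only with the buffers obtained at start-up, so the heap layout cannot change while the system is running.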

answered Dec 07 '11 at 13:09 by Martin Thompson

score -1 · Answer 5 · answered Dec 07 '11 at 13:59

You can use shared memory to store your data. It will be accessible from both processes and you can fill it in a deterministic way.

answered Dec 07 '11 at 13:59 by puikos

-1. Oh come on, please think before you write. The whole purpose of doing redundant execution is to be able to compare the memory of the two processes, each executing in its own address space. – MetallicPriest Dec 07 '11 at 14:12
