I wanted to follow what is done in this article, https://surma.dev/things/c-to-webassembly/, but with Rust, and write a custom allocator.

For that I need to access the __heap_base variable that LLVM adds as a pointer to where the heap starts in linear memory. Is there a way to achieve this in Rust?

I tried variations of

extern "C" {
    static __heap_base: i32;
}

#[no_mangle]
pub unsafe extern "C" fn main() -> i32 {
    __heap_base
}

but they return 0 instead of the actual value assigned in the binary.

  • The article you linked takes the address of `__heap_base`; the extern symbol itself is not a pointer. Have you tried that? I don't think the value of `__heap_base` is relevant: if it simply marks the start of the heap, you will overwrite that value with your first dynamically allocated object anyway. – justinas Nov 16 '21 at 19:17
  • @justinas that seems to work, but I don't understand why, as __heap_base is literally an i32 value in the wasm: `(global $__heap_base (export "__heap_base") i32 (i32.const 1048576))`. I don't understand the second part of your comment; __heap_base is where the heap starts afaik. So I just want to know that value in order to know which addresses are valid (from that index onwards, as it is linear memory) for allocating on the heap. – Federico Rodríguez Nov 17 '21 at 12:22
  • Sorry, I don't have much experience with WASM, so I might not be able to help further. I do not know what the *value* of `__heap_base` is supposed to indicate, but the article does not seem to care about it (it even declares it as an `unsigned char` rather than an `i32`), and simply takes a pointer to it and uses that for incrementing. – justinas Nov 17 '21 at 12:53

1 Answer

After working with this a bit, my tentative answer is that there is a difference between the values in your program and the values the compiler/linker then defines in the wasm file. In principle there is no one-to-one relationship between them.

When you name a variable in C or Rust, you get the variable's value and not the address of the variable itself. For example, if you define a pointer, its name gives you the address of the data it points to, not the address where the pointer itself is stored.
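To make the value/address distinction concrete, here is a small sketch in plain Rust. The function name `value_and_addresses` and the constant 42 are made up for illustration; nothing here is specific to WebAssembly.

```rust
/// Returns (the value behind the pointer,
///          the address the pointer holds,
///          the address where the pointer itself is stored).
pub fn value_and_addresses() -> (i32, usize, usize) {
    let data: i32 = 42;
    let ptr: *const i32 = &data;

    // Dereferencing the pointer reads the data it points to.
    let value = unsafe { *ptr };
    // Naming `ptr` yields the address of `data`...
    let pointee_addr = ptr as usize;
    // ...while `&ptr` yields the address of the pointer variable itself.
    let storage_addr = &ptr as *const *const i32 as usize;

    (value, pointee_addr, storage_addr)
}
```

The second and third numbers differ because `data` and `ptr` are two separate objects in memory.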

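So by declaring static __heap_base: i32 you are telling the compiler that __heap_base is an i32 value stored at that symbol, not that __heap_base is the heap base pointer. Reading it loads whatever sits in memory at that location, which is presumably why you get 0. What LLVM writes into the wasm file as an i32 global (whatever type you declare for __heap_base) is the symbol's address. The address of that value, &__heap_base, is therefore the actual pointer to the start of the heap.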
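Putting that together, a minimal sketch of how this could look in Rust, following the suggestion from the comments to take the symbol's address. Everything except `__heap_base` is a hypothetical name of mine: `heap_base`, the `Bump` struct, and the non-wasm stand-in static, which only exists so the sketch also builds off-target. The bump logic only does address arithmetic; it never grows memory or checks against the memory size, so treat it as an illustration rather than a working allocator.

```rust
#[cfg(target_arch = "wasm32")]
extern "C" {
    // Defined by the linker. Only its address is meaningful, so the
    // declared type does not matter much; u8 mirrors the article's
    // `unsigned char`.
    static __heap_base: u8;
}

// Hypothetical stand-in for non-wasm targets (NOT the real linker symbol),
// present only so this sketch compiles and runs natively.
#[cfg(not(target_arch = "wasm32"))]
#[allow(non_upper_case_globals)]
static __heap_base: u8 = 0;

/// Address of the `__heap_base` symbol, i.e. where the heap starts.
#[allow(unused_unsafe)]
pub fn heap_base() -> usize {
    // Take the address of the symbol instead of reading its value.
    unsafe { &__heap_base as *const u8 as usize }
}

/// Hypothetical bump pointer that hands out addresses from `start` upwards.
pub struct Bump {
    next: usize,
}

impl Bump {
    pub fn new(start: usize) -> Self {
        Bump { next: start }
    }

    /// Returns the next address aligned to `align` (assumed to be a power
    /// of two) and advances past `size` bytes. No bounds checking is done.
    pub fn alloc(&mut self, size: usize, align: usize) -> usize {
        let start = (self.next + align - 1) & !(align - 1);
        self.next = start + size;
        start
    }
}
```

On a wasm32 target you would start it with `Bump::new(heap_base())`; with the global from the question, that should begin handing out addresses at 1048576.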

Why an imported __heap_base is treated as the value stored at the heap base, rather than as the pointer itself, is still not entirely clear to me. Maybe symbols always denote values, so that the pointer is simply the thing which, when dereferenced, gives you __heap_base (the value), and that is how it is handled internally.