Context: I'm designing my own processor and instruction set as a learning exercise.
I'm trying to understand how a low-level assembly program knows which memory addresses it can access. Even if we assume we're running without virtual memory or OS protection and all of RAM is freely writable, we must make sure we don't accidentally overwrite our own instructions, since the program and the data it manipulates are stored in the same address space.
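To make the hazard concrete, here is a minimal sketch in Python of a toy machine with one flat RAM. The opcodes, the three-byte `STORE addr value` encoding, and the function names are all made up for illustration; they are not taken from any real ISA.

```python
# Toy model: one flat RAM holds both instructions and data.
# HALT/STORE and their encoding are hypothetical, for illustration only.
RAM_SIZE = 16
HALT, STORE = 0x00, 0x01   # STORE is followed by two bytes: addr, value


def load_program(program, base=0):
    """Copy the program bytes into a fresh, zeroed RAM starting at `base`."""
    ram = bytearray(RAM_SIZE)
    ram[base:base + len(program)] = bytes(program)
    return ram


def run(ram, pc=0, max_steps=100):
    """Execute until HALT and return how many STOREs were executed."""
    stores = 0
    for _ in range(max_steps):
        op = ram[pc]
        if op == HALT:
            return stores
        if op == STORE:
            addr, value = ram[pc + 1], ram[pc + 2]
            ram[addr] = value      # nothing stops addr from pointing at code
            stores += 1
            pc += 3
        else:
            raise ValueError(f"unknown opcode {op:#x} at address {pc}")
    raise RuntimeError("step limit reached")


# Code occupies addresses 0..6; both stores land in unused RAM above it.
safe = load_program([STORE, 12, 0xAA, STORE, 13, 0xBB, HALT])
safe_stores = run(safe)

# Same shape, but the first store targets address 3, which holds the
# opcode of the second instruction, and replaces it with HALT.
buggy = load_program([STORE, 3, HALT, STORE, 13, 0xBB, HALT])
buggy_stores = run(buggy)
```

In the second program the machine halts early and the write to address 13 never happens, because the first store silently replaced an instruction.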
As I see it, we have three types of memory at this level.
- Static memory: This would be declared symbolically in the DATA section of the assembly program. Somehow this gets magically transformed into usable addresses.
- Stack memory: Once the stack registers are set, we can happily push and pop. But how do we know where to initially put the base of the stack?
- Heap memory: User-mode programs will request this via a syscall, and the OS will handle all the difficult stuff. We just get an address in the return value and use it. However, if we're writing the kernel rather than a user-mode program, no such concept exists.
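One way to picture the three regions above is as a single memory map. The Python sketch below assumes one conventional arrangement: static data placed right after the code, a heap starting after that, and a stack growing down from the top of RAM. The sizes, the symbol names, and the `layout` function are hypothetical, and real toolchains and loaders may arrange things differently.

```python
# Hypothetical flat memory map for a toy machine; all numbers are made up.
RAM_SIZE = 0x1000    # total RAM (assumed)
CODE_BASE = 0x0000   # address the program is loaded at (assumed)


def layout(code_size, data_decls, align=4):
    """Assign addresses to static symbols and pick heap/stack boundaries.

    data_decls is a list of (symbol, size_in_bytes) pairs, standing in for
    the declarations in a DATA section.
    """
    def align_up(n):
        return (n + align - 1) // align * align

    addr = align_up(CODE_BASE + code_size)   # static data follows the code
    symbols = {}
    for name, size in data_decls:
        symbols[name] = addr
        addr = align_up(addr + size)

    return {
        "symbols": symbols,
        "heap_start": addr,       # first free address after static data
        "stack_base": RAM_SIZE,   # stack grows down from the top of RAM
    }


m = layout(code_size=0x46, data_decls=[("counter", 4), ("buffer", 10)])
for name, address in m["symbols"].items():
    print(f"{name}: {address:#06x}")
print(f"heap_start: {m['heap_start']:#06x}")
print(f"stack_base: {m['stack_base']:#06x}")
```

Under these assumptions, everything between `heap_start` and the current stack pointer is free memory, and the heap and stack grow toward each other.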
My questions boil down to the following:
- How and when do static data declarations get translated into actual addresses?
- For a user-mode program that will be loaded by an OS, how does it know where to put the stack?
- For a kernel-mode program that is the OS, how does it know which memory it can use for kernel data structures, for allocating to processes, and so on? Must everything be declared static up-front?
Feel free to give examples from any architecture or OS implementation; I'm just trying to get my head round how this is possible at all.