
I am learning x86 assembly out of curiosity. I'm currently using a Linux-based OS with the NASM assembler. I am having a difficult time understanding why

SECTION .text

global _start

_start:

    nop
    mov ebx, 25
    mov [0xFFF], ebx

    ; Exit the program
    mov eax, 1
    mov ebx, 0
    int 0x80

would lead to a segmentation fault (when moving the contents of the ebx register to memory location 0xFFF). I thought that writing a program in pure asm would give me unrestricted access to my process's virtual address space. Is this not the case?

How would you implement something like a heap in assembly?

Peter Cordes
crawfordr4
  • Even if the address is available, you are breaking the machine alignment by writing a 32-bit value to an odd address. You could write `ebx` to `0xFFC` or to `0x1000` but not to any address in between. – Weather Vane Nov 24 '15 at 09:57
  • Is 0xFFF properly allocated, that is, is it a valid address? – zx485 Nov 24 '15 at 10:29
  • @WeatherVane I doubt alignment is enforced (enforcing is optional on x86 for regular memory accesses, the CPU can deal with unaligned reads/writes). The problem is most likely there's no physical memory mapped into the address space at the location(s) of access. Page 0 (offsets 0 through 0xFFF) is often left unmapped to catch NULL pointers / zero addresses. And then, by default, the program image is loaded well above address zero, something like 0x08048000. So, why map memory where there's no program? – Alexey Frunze Nov 24 '15 at 11:40
  • @AlexeyFrunze yes, my comment was weak: alignment is enforced on some *other* processors, but on the 8086 a misaligned access simply causes another read or write cycle. – Weather Vane Nov 24 '15 at 12:33
  • How can you access the full range of virtual memory that your program can use? How would you implement something like a heap in assembly? – crawfordr4 Nov 24 '15 at 14:28
  • Not all virtual memory is accessible to your program. If you want to allocate space for heap management from assembler, I suggest you look at the `brk` system call (int 0x80/eax=45). – Michael Petch Nov 24 '15 at 18:52
  • Ultimately is your real question about creating a heap in your assembler program? Is that the problem you are trying to solve? – Michael Petch Nov 24 '15 at 19:10

2 Answers


On 32-bit x86 Linux, your process has a 4 GB virtual address range, but not all of it is accessible. The upper 1 GB is where the kernel resides, and there are areas of low memory that can't be used. By default nothing is mapped at virtual address 0xFFF, so it can't be read or written, and your program crashes with a segfault.
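For comparison, here is a minimal sketch of the question's program with the store redirected to memory the process actually owns. It assumes a 32-bit build; the label `value` is a hypothetical name that does not appear in the question. The loader maps the `.bss` section read/write, so the `mov` does not fault.

```nasm
SECTION .bss
value:  resd 1              ; hypothetical label: 4 bytes in a read/write mapping

SECTION .text
global _start
_start:
    mov ebx, 25
    mov [value], ebx        ; store lands in mapped, writable memory

    ; Exit the program
    mov eax, 1              ; sys_exit
    xor ebx, ebx            ; exit status 0
    int 0x80
```

It can be assembled and linked as a 32-bit executable with `nasm -f elf32` followed by `ld -m elf_i386`.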

In a follow-up comment you suggested that you intend to create a heap in assembler. That can be done, and one method is to use the sys_brk system call. It is accessed via int 0x80 with EAX=45. It takes a pointer in EBX representing the new top of the heap. Generally the bottom of the heap area is initialized to the area just beyond your program's data segment (above your program in memory). To get the address of the initial heap location you can call sys_brk with EBX set to 0. After the system call, EAX holds the current program break, which serves as the base of the heap. Save that value so you can use it later to access your heap memory or to allocate more heap space.

This code provides an example for purposes of clarity (not performance), but might be a starting point to understanding how you can manipulate the heap area:

SECTION .data
heap_base: dd 0          ; Memory address for base of our heap

SECTION .text
global _start
_start:
    ; Use the `brk` syscall to get the current program break,
    ; which is the bottom of our heap. This is done by calling
    ; brk with an address (EBX) of 0.
    mov eax, 45          ; brk system call
    xor ebx, ebx         ; don't request additional space; we just want the
                         ; address of the base of our process's heap area
    int 0x80
    mov [heap_base], eax ; Save the heap base

    ;Now allocate some space (8192 bytes)
    mov eax, 45          ; brk system call
    mov ebx, [heap_base] ; ebx = address for base of heap
    add ebx, 0x2000      ; increase heap by 8192 bytes
    int 0x80

    ; Example usage
    mov eax, [heap_base]      ; Get pointer to the heap's base
    mov dword [eax+0xFFF], 25 ; mov value 25 to DWORD at heapbase+0xFFF

    ;Exit the program
    mov eax, 1
    xor ebx, ebx
    int 0x80
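Building on the example above, here is a rough sketch of how a very simple "bump" allocator could sit on top of `brk`. It is an illustration under assumptions, not a full heap: `bump_alloc` and `heap_top` are hypothetical names, nothing is ever freed, and address overflow is not checked. It relies on the raw Linux `brk` system call returning the resulting program break in EAX, so a value below the requested address means the request was refused.

```nasm
SECTION .bss
heap_top: resd 1            ; hypothetical: current end of our heap

SECTION .text
global _start

; bump_alloc (hypothetical helper)
;   in:  ecx = number of bytes wanted
;   out: eax = pointer to the new block, or 0 on failure
;   clobbers ebx
bump_alloc:
    mov ebx, [heap_top]     ; ebx = old top = start of the new block
    add ebx, ecx            ; ebx = requested new program break
    mov eax, 45             ; brk system call
    int 0x80                ; eax = program break after the call
    cmp eax, ebx
    jb .fail                ; break is below the request: kernel refused
    mov eax, [heap_top]     ; return the old top as the block pointer
    mov [heap_top], ebx     ; remember the new top
    ret
.fail:
    xor eax, eax            ; 0 signals failure
    ret

_start:
    mov eax, 45             ; brk(0): query the current program break
    xor ebx, ebx
    int 0x80
    mov [heap_top], eax

    mov ecx, 64             ; ask for 64 bytes
    call bump_alloc
    test eax, eax
    jz .exit_fail
    mov dword [eax], 25     ; use the freshly allocated memory

    mov eax, 1              ; sys_exit, status 0
    xor ebx, ebx
    int 0x80
.exit_fail:
    mov eax, 1              ; sys_exit, status 1
    mov ebx, 1
    int 0x80
```

A real allocator would also track freed blocks and reuse them; this sketch only shows the growth side.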
Michael Petch
  • Thank you, this is incredibly helpful. Out of curiosity, and from a simplistic point of view, does the Linux kernel simply create enough pages for my program, then adds more as I change the size of the heap via sys_brk? When I try to access an address that has not been allocated to me yet, does a page fault occur, through which Linux's record keeping indicates that I've gone past my allocated limit for the segment that my heap is a member of? – crawfordr4 Nov 25 '15 at 20:02
  • Correct, the kernel will allocate the pages with correct permissions (I think in general read/write execute). If you were to access those memory locations without the sys_brk you would get a segfault. – Michael Petch Nov 25 '15 at 20:24

You don't have unrestricted RAM. Furthermore, you don't have unrestricted access to the part of your address space which is backed by RAM. Code pages are mapped read-only. And as a ring-3 program, you can't change the page protections yourself; you have to ask the kernel to do it, for example with the `mprotect` system call.

MSalters