
I am trying to replicate a stack buffer overflow. This is my code:

#include <stdio.h>

int main(int argc, char *argv[]) {
  char x[1];
  gets(x);
  printf("%s\n", x);
}

I am compiling this on a 32-bit machine, which means each memory address is 4 bytes long. Since each character is 1 byte (verified using sizeof), I am expecting a stack buffer overflow when I enter "AAAAA" as input (1 byte more than what x can hold). However, nothing happens until I enter 13 As, at which point I get an "Illegal Instruction" error. 14 As result in a "Segmentation fault".

Questions

  1. Why am I not getting a segmentation fault at 5 As?
  2. What is the difference between Illegal Instruction and Segmentation Fault?
  3. What is a good tool (other than gdb) to visualize the stack?

I've looked at "Trouble replicating a stack buffer overflow exploit", but I had trouble understanding the answer.

Here's my assembly dump:

(gdb) disassemble main
Dump of assembler code for function main:
   0x0804844d <+0>: push   %ebp
   0x0804844e <+1>: mov    %esp,%ebp
   0x08048450 <+3>: and    $0xfffffff0,%esp
   0x08048453 <+6>: sub    $0x20,%esp
   0x08048456 <+9>: lea    0x1f(%esp),%eax
   0x0804845a <+13>:    mov    %eax,(%esp)
   0x0804845d <+16>:    call   0x8048310 <gets@plt>
=> 0x08048462 <+21>:    lea    0x1f(%esp),%eax
   0x08048466 <+25>:    mov    %eax,(%esp)
   0x08048469 <+28>:    call   0x8048320 <puts@plt>
   0x0804846e <+33>:    leave  
   0x0804846f <+34>:    ret    
End of assembler dump.
max_max_mir
  • Use a debugger. Provide the disassembly of the provided function. Undefined behaviour is **undefined**, so there is not much to discuss in terms of C code. – Antti Haapala -- Слава Україні Jan 17 '18 at 04:40
  • I just added the disassembly. I am using gdb, but not sure how to visualize the stack and see where the stack pointer is compared to the variable x. – max_max_mir Jan 17 '18 at 04:51
  • You just posted this for the meta-ness of using the `stack-overflow` tag, didn't you. :-P – Charles Srstka Jan 17 '18 at 04:51
  • Should it not be tagged stack-overflow? Happy to remove it if it's not helpful – max_max_mir Jan 17 '18 at 04:52
  • (joking, hence the :-P) – Charles Srstka Jan 17 '18 at 04:52
  • Ah - sorry :). I have been struggling with this problem for a while now, so .... – max_max_mir Jan 17 '18 at 04:53
  • So, what does the `and` + `sub` do? Try it with different values of `esp`. For visualization I suggest using *a writing implement or art medium constructed of a narrow, solid pigment core inside a protective casing* to *draw* some kinds of markings on *a thin material produced by pressing together moist fibres of cellulose pulp derived from wood, rags or grasses, and drying them into flexible sheets.* – Antti Haapala -- Слава Україні Jan 17 '18 at 04:58
  • Notice that the overflow does not happen in C. In C there is only undefined behaviour. Everything happens in machine code. – Antti Haapala -- Слава Україні Jan 17 '18 at 05:01
  • Thanks. I don't know how to read assembly code, so I'm not sure what `and` + `sub` does, or how to try a different value of `esp`. – max_max_mir Jan 17 '18 at 05:03
  • I am also not sure why this question was downvoted. – max_max_mir Jan 17 '18 at 05:26
  • If you don't know how to read/understand/write assembly, I don't think you can do anything useful with a stack exploit like this. I think you are starting from the wrong end. You should try to understand how machine languages work, how compilers generate code and how the operating system behaves when something "bad" happens. – Ajay Brahmakshatriya Jan 17 '18 at 06:13
  • What do you mean by 5 As, 13 As and so on? If it is an assembly line number, it does not match your assembly code. Also, you should turn off all optimizations; it makes understanding a C program by reading its assembly much easier. – Joël Hecht Jan 17 '18 at 06:24
  • @JoëlHecht: that's OP's "when I enter "AAAAA" *as input*" with `gets` in the way too small char array `x`. – Jongware Jan 17 '18 at 12:01

1 Answer

  1. The stack is aligned to 16 bytes right at the start of main (the `and $0xfffffff0,%esp` at <+3>). How far that moves esp depends on the value esp had on entry, so you cannot work out the exact offset of the saved return address from the C source alone; you can just try input lengths from 5 bytes to 21 bytes.
  2. An illegal instruction is a byte sequence that does not match any defined instruction. Every instruction is encoded in machine code (e.g. push ebp is 0x55), but 0xff 0xff, for example, does not decode to any instruction on x86. A segmentation fault, by contrast, occurs when a memory access is invalid.
j31d0
  • Thank you! Would you mind explaining #1 in a bit more detail? How do I find out how the stack is set up? Does 16-byte aligned mean that every stack cell is 16 bytes long? So even if I declare a 1-byte char variable, the OS uses 16 bytes to store it? Also, why is the location where the return address is saved not deterministic? – max_max_mir Jan 17 '18 at 19:04
  • First, you can guess the stack address from the ABI (when a call instruction executes, the stack must be 16-byte aligned), but the guess isn't always right because of compiler optimization and the like. Stack alignment means the stack pointer (esp here) must be a multiple of some number (in this case, 16). The variable you declared has nothing to do with stack alignment. The alignment is (probably) done because you call a function in main; by the ABI, the stack pointer must be a multiple of 16 at the call instruction. – j31d0 Jan 18 '18 at 01:16
  • Also, there is no reason for the location of the saved return address to be deterministic. The compiler generates code that satisfies the ABI, and the non-deterministic location of the saved return address is just a side effect. – j31d0 Jan 18 '18 at 01:19