
I want to execute some code from memory; my long-term goal is to create a self-decrypting app. To understand the topic, I started with the basics and wrote the following code:

#define UNENCRYPTED true

#define sizeof_function(x) ( (unsigned long) (&(endof_##x)) - (unsigned long) (&x))
#define endof_function(x) void volatile endof_##x() {}
#define DECLARE_END_OF_FUNCTION(x) void endof_##x();

#include <unistd.h>
#include <signal.h>
#include <stdio.h>
#include <stdlib.h>
#include <errno.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <stdint.h> /* uintptr_t */

unsigned char *bin;

#ifdef UNENCRYPTED
void hexdump(char *description, unsigned char *toDump, unsigned long length) {
    printf("Hex-dump of \"%s\":\n", description);
    for (int i = 0; i < length; i++) {
        printf("%02x", toDump[i]);
    }
    printf("\n");
}


void hello_world() {
    printf("Hello World!\n");
}
endof_function(hello_world);
#endif

int main (void) {
    errno = 0;
    unsigned long hello_worldSize = sizeof_function(hello_world);
    bin = malloc(hello_worldSize);

    //Compute the start of the page
    size_t pagesize = sysconf(_SC_PAGESIZE);
    uintptr_t start = (uintptr_t) bin;
    uintptr_t end = start + (hello_worldSize);
    uintptr_t pagestart = start & -pagesize;
    bin = (void *)pagestart;

    //Set mprotect for bin to write-only
    if(mprotect(bin, end - pagestart, PROT_WRITE) == -1) {
        printf("\"mprotect\" failed; error: %s\n", strerror(errno));
        return(1);
    }

    //Get size and adresses
    unsigned long hello_worldAdress = (uintptr_t)&hello_world;
    unsigned long binAdress = (uintptr_t)bin;

    printf("Address of hello_world %lu\nSize of hello_world %lu\nAdress of bin:%lu\n", hello_worldAdress, hello_worldSize, binAdress);

    //Check if hello_worldAdress really points to hello_world()
    void (*checkAdress)(void) = (void *)hello_worldAdress;
    checkAdress();

    //Print memory contents of hello_world()
    hexdump("hello_world", (void *)&hello_world, hello_worldSize);

    //Copy hello_world() to bin
    memcpy(bin, (void *)hello_worldAdress, hello_worldSize);

    //Set mprotect for bin to read-execute
    if(mprotect(bin, end - pagestart, PROT_READ|PROT_EXEC) == -1) {
        printf("\"mprotect\" failed; error: %s\n", strerror(errno));
        return(1);
    }

    //Check if the contents at binAdress are the same as of hello_world
    hexdump("bin", (void *)binAdress, hello_worldSize);

    //Execute binAdress
    void (*executeBin)(void) = (void *)binAdress;
    executeBin();

    return(0);
}

However, I get a segfault. The program's output is the following:

(On OS X; x86-64):

Adress of hello_world 4294970639
Size of hello_world 17
Adress of bin:4296028160
Hello World!
Hex-dump of "hello_world":
554889e5488d3d670200005de95a010000
Hex-dump of "bin":
554889e5488d3d670200005de95a010000
Program ended with exit code: 9

And on my Raspberry Pi (Linux on 32-bit ARM):

Adress of hello_world 67688
Size of hello_world 36
Hello World!
Hello World!
Hex-dump of "hello_world":
00482de90db0a0e108d04de20c009fe512ffffeb04008de50bd0a0e10088bde8d20b0100
Hex-dump of "bin":
00482de90db0a0e108d04de20c009fe512ffffeb04008de50bd0a0e10088bde8d20b0100
Speicherzugriffsfehler (German for "memory access error", i.e. a segmentation fault)

Where is my mistake?


The problem was that the printf call in hello_world is compiled to a jump with a relative address, which of course no longer points to the right place in the copied function. For testing purposes I changed hello_world to:

int hello_world() {
    //_printf("Hello World!\n");
    return 14;
}

and the code under "//Execute binAdress" to:

int (*executeBin)(void) = (void *)binAdress;
int test = executeBin();
printf("Value: %i\n", test);

which prints out 14 :D

Infinite Recursion
K. Biermann
  • Your x64-compiled snippet ends with `e95a010000`, which is a jump to another address. The destination of this jump is computed relative to the address of the jump instruction itself, so when you copied the machine code to another location in memory, the destination went haywire. – DCoder Sep 10 '14 at 17:26
  • OK, if I'm getting this right, I'd need to replace printf with an absolute reference to printf? – K. Biermann Sep 10 '14 at 17:29
  • Yes, that approach should help solve this problem. – DCoder Sep 10 '14 at 17:44
  • Yeah, this was the problem :D I changed the function type to int and replaced the printf call with a simple "return 14". Now the code gets executed and 14 is returned. @DCoder If you create an answer I'll accept it :) – K. Biermann Sep 10 '14 at 17:58
  • Accept nneonneo's answer instead :) – DCoder Sep 10 '14 at 18:08
  • @hobbs: if you're in control of the processor, you're allowed to modify the protection bits. Otherwise, JIT compilers (like most JavaScript engines) would never work. W^X offers a measure of protection against executing shellcode, which typically has no chance to modify the protection bits. – nneonneo Sep 10 '14 at 18:13
  • @nneonneo never mind, I didn't bother to scroll the code sample. – hobbs Sep 10 '14 at 18:15

1 Answer

On ARM, you have to flush the instruction cache after writing the code, using a function like cacheflush; otherwise your code may not run properly. This is required for self-modifying code and JIT compilers, but is generally not needed on x86.
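As a rough illustration of the write, protect, flush, call sequence, here is a minimal sketch. It is not the asker's program: it assumes GCC or Clang (for `__builtin___clear_cache`, used here in place of calling cacheflush directly), a POSIX system with anonymous `mmap`, and it copies a few hand-assembled bytes for "return 14" rather than a compiled C function. The names `return14` and `run_copied_code` are made up for this example. It maps fresh pages with `mmap` instead of changing the protection of `malloc`'d memory, since `mprotect` is only specified for mapped pages.

```c
#define _DEFAULT_SOURCE /* for MAP_ANONYMOUS under strict -std= modes on glibc */
#include <stddef.h>
#include <stdint.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

/* Hand-assembled machine code for "return 14" (little-endian targets assumed). */
#if defined(__x86_64__) || defined(__i386__)
static const unsigned char return14[] = {
    0xb8, 0x0e, 0x00, 0x00, 0x00,   /* mov eax, 14 */
    0xc3                            /* ret         */
};
#define HAVE_RETURN14 1
#elif defined(__aarch64__)
static const unsigned char return14[] = {
    0xc0, 0x01, 0x80, 0x52,         /* mov w0, #14 */
    0xc0, 0x03, 0x5f, 0xd6          /* ret         */
};
#define HAVE_RETURN14 1
#elif defined(__arm__)
static const unsigned char return14[] = {
    0x0e, 0x00, 0xa0, 0xe3,         /* mov r0, #14 (ARM state) */
    0x1e, 0xff, 0x2f, 0xe1          /* bx  lr                  */
};
#define HAVE_RETURN14 1
#endif

/* Copies the bytes into a fresh mapping, makes it executable, and calls it.
   Returns the callee's result, or -1 on failure or an unsupported CPU. */
int run_copied_code(void)
{
#ifndef HAVE_RETURN14
    return -1;
#else
    long pagesize = sysconf(_SC_PAGESIZE);
    if (pagesize <= 0)
        return -1;
    size_t len = (size_t)pagesize;

    /* Page-aligned, writable memory straight from the kernel. */
    unsigned char *buf = mmap(NULL, len, PROT_READ | PROT_WRITE,
                              MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (buf == MAP_FAILED)
        return -1;

    memcpy(buf, return14, sizeof return14);

    /* Switch from writable to executable (never both at once). */
    if (mprotect(buf, len, PROT_READ | PROT_EXEC) == -1) {
        munmap(buf, len);
        return -1;
    }

    /* Make the instruction cache see the new bytes; matters on ARM. */
    __builtin___clear_cache((char *)buf, (char *)buf + sizeof return14);

    int (*fn)(void) = (int (*)(void))(uintptr_t)buf;
    int result = fn();

    munmap(buf, len);
    return result;
#endif
}
```

Calling `run_copied_code()` from `main` and printing the result should reproduce the "return 14" experiment from the question. Some hardened systems refuse to make formerly writable pages executable, in which case the sketch returns -1.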

Additionally, if you move a chunk of code from one location to another, you have to fix up any relative jumps. Calls to library functions are typically implemented as jumps into a stub or relocation section, and those jumps are often relative.
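To make the fix-up concrete, the sketch below decodes the x86-64 hex dump from the question. The last five bytes, `e9 5a 01 00 00`, are a `jmp rel32`: opcode `0xE9` followed by a little-endian 32-bit displacement, which the CPU adds to the address of the next instruction. The helper names (`read_rel32`, `jmp_target`, `fixed_rel32`) are made up for this example, and the offset 12 of the jump was found by reading the dump by hand.

```c
#include <stddef.h>
#include <stdint.h>

/* The 17 bytes of hello_world() from the OS X hex dump in the question. */
const unsigned char hello_world_x64[17] = {
    0x55,                                     /* push rbp            */
    0x48, 0x89, 0xe5,                         /* mov  rbp, rsp       */
    0x48, 0x8d, 0x3d, 0x67, 0x02, 0x00, 0x00, /* lea  rdi, [rip+615] */
    0x5d,                                     /* pop  rbp            */
    0xe9, 0x5a, 0x01, 0x00, 0x00              /* jmp  rel32 (+346)   */
};

/* Reads the little-endian rel32 operand of the jmp at jmp_offset. */
int32_t read_rel32(const unsigned char *code, size_t jmp_offset)
{
    uint32_t raw = (uint32_t)code[jmp_offset + 1]
                 | ((uint32_t)code[jmp_offset + 2] << 8)
                 | ((uint32_t)code[jmp_offset + 3] << 16)
                 | ((uint32_t)code[jmp_offset + 4] << 24);
    return (int32_t)raw;
}

/* Destination = address of the next instruction (jmp is 5 bytes) + rel32. */
uint64_t jmp_target(uint64_t base, size_t jmp_offset, int32_t rel32)
{
    return base + jmp_offset + 5 + (uint64_t)(int64_t)rel32;
}

/* Displacement the copy at new_base needs to reach the original target. */
int64_t fixed_rel32(uint64_t old_base, uint64_t new_base,
                    size_t jmp_offset, int32_t rel32)
{
    uint64_t target = jmp_target(old_base, jmp_offset, rel32);
    return (int64_t)(target - (new_base + jmp_offset + 5));
}
```

With the addresses printed in the question, the original jump at 4294970639 + 12 lands on 4294971002, while the unmodified copy at 4296028160 + 12 lands on 4296028523, which is 346 bytes past the end of the copied function. Reaching the original target from the copy would need a displacement of -1057175 instead of 346. The `lea rdi, [rip+615]` that loads the string address is RIP-relative as well, so it would need the same kind of adjustment.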

To avoid having to fix up jumps, you can use some linker tricks to compile the code so that it starts at a different offset. Then, when decrypting, you simply load the decrypted code at that offset. A two-stage compilation process is usually used: compile your real code, append the resulting machine code to your decryption stub, and compile the whole program.

nneonneo