Memcpy with function pointers leads to a segfault

Question

I know I can just copy the function by reference, but I want to understand what's going on in the following code that produces a segfault.

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

int return0()
{
    return 0;
}

int main()
{
    int (*r0c)(void) = malloc(100);
    memcpy(r0c, return0, 100);
    printf("Address of r0c is: %x\n", r0c);
    printf("copied is: %d\n", (*r0c)());
    return 0;
}

Here's my mental model of what I thought should work.

The process owns the memory allocated to r0c. We are copying the data from the data segment corresponding to return0, and the copy is successful.

I thought that dereferencing a function pointer is the same as calling the data segment that the function pointer points to. If that's the case, then the instruction pointer should move to the data segment corresponding to r0c, which will contain the instructions for function return0. The binary code corresponding to return0 doesn't contain any jumps or function calls that would depend on the address of return0, so it should just return 0 and restore ip... 100 bytes is certainly enough for the function pointer, and 0xc3 is well within the bounds of r0c (it is at byte 11).

So why the segmentation fault? Is this a misunderstanding of the semantics of C's function pointers or is there some security feature that prevents self-modifying code that I'm unaware of?

`printf("Address of r0c is: %x\n", r0c);` is not well defined. — chux - Reinstate Monica, Jul 18 '16 at 20:58
Something tells me that the entire thing is not well defined. — HolyBlackCat, Jul 18 '16 at 20:59
First, functions (code) reside in a memory segment that is marked as executable. Allocated data is not so marked, especially if your system is using DEP (data execution prevention). If you wish to execute code that is in the data segment, you need to figure out how to mark that data as executable. Second, `memcpy(r0c, return0, 100);` is probably copying from beyond the end of memory. Third, it is likely that the memory locations containing code are protected from access. — GreatAndPowerfulOz, Jul 18 '16 at 21:02
Compiling this with gcc with the arguments -Wall -Wpedantic -std=c11 gives several warnings. Listen to your compiler. — Random Davis, Jul 18 '16 at 21:02
`memcpy(r0c, return0, 100);` is a problem as `return0` does not convert well to a `void*`. — chux - Reinstate Monica, Jul 18 '16 at 21:06

Some programmer dude · Answer 1 · 2016-07-18T21:14:29.720

The memory pages used by malloc to allocate memory are not marked as executable. You can't copy code to the heap and expect it to run.

If you want to do something like that you have to go deeper into the operating system, and allocate pages yourself. Then you need to mark those as executable. You would most likely need administrator rights to be able to set the executable flag on memory pages.

And it's really dangerous. If you do this in a program you distribute and have some kind of bug that lets an attacker use your program to write to those allocated memory pages, then the attacker can gain administrator rights and take control of the computer.

There are also other problems with your code. For example, pointers to functions might not translate well into general pointers on all platforms, and it's very hard (not to mention non-standard) to predict or otherwise get the size of a function. You also print pointers incorrectly in your code example: use the "%p" format, which expects a void *, so the pointer must be cast to void *.
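A minimal sketch of the pointer-printing advice, with invented helper names (`format_object_pointer`, `format_function_pointer`). ISO C defines `%p` only for `void *`, so the second helper does not convert the function pointer at all; as one portable alternative it dumps the bytes of the pointer's own representation, in memory order:

```c
#include <stddef.h>
#include <stdio.h>

/* Same shape as the function in the question. */
int return0(void)
{
    return 0;
}

/* Object pointers: convert to void * and print with %p.
 * Returns what snprintf returns (characters needed, excluding the NUL). */
int format_object_pointer(char *buf, size_t size, void *ptr)
{
    return snprintf(buf, size, "%p", ptr);
}

/* Function pointers: there is no conversion specifier for them, so write
 * the bytes of the pointer object itself as hex, two digits per byte.
 * Returns the number of hex digits written, or -1 if buf is too small. */
int format_function_pointer(char *buf, size_t size, int (*fn)(void))
{
    const unsigned char *bytes = (const unsigned char *)&fn;

    if (size < 2 * sizeof fn + 1)
        return -1;
    for (size_t i = 0; i < sizeof fn; i++)
        snprintf(buf + 2 * i, 3, "%02x", (unsigned)bytes[i]);
    return (int)(2 * sizeof fn);
}
```

Calling `format_object_pointer(buf, sizeof buf, &some_int)` produces an implementation-defined string such as a hexadecimal address. Many compilers also accept `(void *)return0` with `%p`, but that cast is an extension rather than something ISO C guarantees.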

Also when you declare a function like int fun() that's not the same as declaring a function that takes no arguments. If you want to declare a function that takes no arguments you should explicitly use void as in int fun(void).
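To illustrate the difference, here is a small sketch with made-up function names. It assumes a pre-C23 compiler mode; as far as I know, C23 changes the rule so that `int fun()` means the same as `int fun(void)`.

```c
/* Before C23, empty parentheses leave the parameters unspecified, so the
 * compiler does not check the arguments at a call site. */
int unspecified();

/* A real prototype: the function takes no arguments, and a call such as
 * no_args(1) is rejected at compile time. */
int no_args(void);

int unspecified()
{
    return 1;
}

int no_args(void)
{
    return 2;
}

/* Calls both functions the correct way, with no arguments. */
int call_both(void)
{
    return unspecified() + no_args();
}
```

With the first declaration, a mistaken call like `unspecified(1, 2)` typically compiles in older language modes and has undefined behavior at run time, which is why the `void` form is the safer habit.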

I was just trying to provide a minimal example...this was just about understanding the semantics of the program. — A.S, Jul 18 '16 at 21:11
Setting pages to executable does not require administrator rights on most operating systems. — Daniel, Jul 18 '16 at 21:22

John Bollinger · Answer 2 · 2016-07-18T21:44:03.743

The standard says:

The memcpy function copies n characters from the *object* pointed to by s2 into the *object* pointed to by s1.

[C2011, 7.24.2.1/2; emphasis added]

In the standard's terminology, functions are not "objects". The standard does not define behavior for the case where the source pointer points to a function, therefore such a memcpy() call produces undefined behavior.

Additionally, the pointer returned by malloc() is an object pointer. C does not provide for direct conversion of object pointers to function pointers, and it does not provide for objects to be called as functions. It is possible to convert between object pointer and function pointer by means of an intermediate integer value, but the effect of doing so is at minimum doubly implementation-defined. Under some circumstances it is undefined.
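The integer detour can be sketched as follows. This is only an illustration: the names `return42` and `call_through_object_pointer` are invented, and the code assumes an implementation where `uintptr_t` can hold both kinds of pointer without loss, which is typical of mainstream desktop platforms but not promised by the standard.

```c
#include <stdint.h>

/* Hypothetical function used only for this demonstration. */
int return42(void)
{
    return 42;
}

/* Converts a function pointer to an object pointer and back, each time
 * going through uintptr_t, then calls the result. Every conversion here
 * is implementation-defined, so this is not portable C. */
int call_through_object_pointer(int (*fn)(void))
{
    /* function pointer -> integer -> object pointer */
    void *as_object = (void *)(uintptr_t)fn;

    /* object pointer -> integer -> function pointer */
    int (*as_function)(void) = (int (*)(void))(uintptr_t)as_object;

    return as_function();
}
```

On platforms where the round trip preserves the address, `call_through_object_pointer(return42)` calls the original function. Note that this only changes the pointer's type; it says nothing about whether memory obtained from malloc() may be executed.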

As in other cases, UB can turn out to be precisely the behavior you hoped for, but it is not safe to rely on that. In this particular case, other answers present good reasons to not expect to get the behavior you hoped for.

This is all true, but GCC is lax, and it turned out this wasn't causing the segfault. — A.S, Jul 18 '16 at 21:45
@AndrewSalmon, on the contrary -- a segfault *always* arises from implementation-defined or undefined behavior, and I am pointing out exactly which operations give rise to such behavior in your program. It is possible that you can rely on implementation extensions to get the behavior you want, but doing so is inherently non-portable. Nevertheless, that may be acceptable to you. — John Bollinger, Jul 18 '16 at 21:54

score 0 · Answer 3 · answered Jul 18 '16 at 21:38

As was said in some comments, you need to make the data executable. This requires communicating with the OS to change protections on the data. On Linux, this is the system call int mprotect(void* addr, size_t len, int prot) (see http://man7.org/linux/man-pages/man2/mprotect.2.html).
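A rough Linux sketch of that approach, under several assumptions: the target is x86-64, the system allows anonymous pages to be made executable, and the helper name `run_generated_return0` is invented. Because the size of a compiled function is not known, the sketch writes hand-assembled bytes for `mov eax, 0; ret` instead of copying from `return0`. It also requests its own page with mmap, since mprotect needs a page-aligned address and changing the protection of malloc'd memory would affect whatever else shares the page.

```c
#ifndef _DEFAULT_SOURCE
#define _DEFAULT_SOURCE /* exposes MAP_ANONYMOUS on glibc in strict modes */
#endif
#include <stdint.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

/* Returns the value produced by the generated code (0 on success),
 * -1 if a system call fails, -2 if the architecture is not x86-64. */
int run_generated_return0(void)
{
#if defined(__x86_64__)
    /* x86-64 machine code for: mov eax, 0 ; ret */
    static const unsigned char code[] = { 0xB8, 0x00, 0x00, 0x00, 0x00, 0xC3 };

    long page = sysconf(_SC_PAGESIZE);
    if (page <= 0)
        return -1;

    /* Ask the OS for one page that is writable but not yet executable. */
    void *mem = mmap(NULL, (size_t)page, PROT_READ | PROT_WRITE,
                     MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (mem == MAP_FAILED)
        return -1;

    memcpy(mem, code, sizeof code);

    /* Switch the page to read + execute before jumping into it. */
    if (mprotect(mem, (size_t)page, PROT_READ | PROT_EXEC) != 0) {
        munmap(mem, (size_t)page);
        return -1;
    }

    /* Object-to-function pointer conversion: not defined by ISO C, but
     * it works on the POSIX systems this sketch targets. */
    int (*fn)(void) = (int (*)(void))(uintptr_t)mem;
    int result = fn();

    munmap(mem, (size_t)page);
    return result;
#else
    return -2;
#endif
}
```

Keeping the page writable only while the bytes are copied, then switching it to executable, avoids having memory that is writable and executable at the same time. No administrator rights are involved, although a hardened system may refuse the mprotect call, in which case the helper returns -1.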

Here is a Windows solution using VirtualProtect.

#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <stdint.h>
#ifdef _WIN32
#include <Windows.h>
#endif

int return0()
{
    return 0;
}

int main()
{
    int (*r0c)(void) = malloc(100);
    memcpy((void*) r0c, (void*) return0, 100);
    printf("Address of r0c is: %p\n", (void*) r0c);
#ifdef _WIN32
    DWORD out_protect; /* receives the previous protection flags */
    if(!VirtualProtect((void*) r0c, 100, PAGE_EXECUTE_READWRITE, &out_protect)){
        puts("Failed to mark r0c as executable");
        exit(1);
    }
#endif
    printf("copied is: %d\n", (*r0c)());
    return 0;
}

And it works.

score -1 · Answer 4 · answered Jul 18 '16 at 21:17

Malloc returns a pointer to allocated memory (100 bytes in your case). This memory area is uninitialized. Assuming the memory could be executed by the CPU, for your code to work you would have to fill those 100 bytes with the executable instructions that the function implements (if indeed they fit in 100 bytes). But as has been pointed out, your allocation is on the heap, not in the text (program) segment, and I don't think it can be executed as instructions. Perhaps this would achieve what you want:

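#include <stdio.h>

int return0(void)
{
    return 0;
}

typedef int (*r0c)(void);

int main(void)
{
    r0c pf = return0;
    /* %p expects a void *; casting a function pointer to void * is a common extension */
    printf("Address of r0c is: %p\n", (void *) pf);
    printf("copied is: %d\n", pf());
    return 0;
}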

I appreciate the answer, but I explained that I know that it's possible to call the function by reference; just that I wanted to know if/how it was possible to execute actual data using function pointers, not just using function pointers to call by reference. — A.S, Jul 18 '16 at 21:24
