My question might seem a bit odd, but here it goes: I'm trying to create a program dynamically in C++ (mostly for the fun of it, but also for a practical reason), and it is not as hard as it might sound. To do this, you write machine code into memory at runtime, like this:
byte * buffer = new byte[5];
*buffer = 0xE9; // Opcode for 'jmp' (rel32)
// 'destination' stands for the target address; the operand is relative to the next instruction
*(uint*)(buffer + 1) = destination - (uint)(buffer + 5);
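One detail worth making explicit: the 0xE9 opcode takes a 32-bit displacement measured from the end of the 5-byte instruction, not an absolute address. Below is a minimal sketch of that calculation. The helper name writeJmpRel32 and its parameters are hypothetical, not part of my program, and the addresses are passed as plain 32-bit integers so the arithmetic can be checked without executing anything.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper (not from the original program): encodes 'jmp rel32'.
// `from` is the address the 5-byte instruction will occupy, `to` is the
// jump target; both are 32-bit values because the target is 32-bit x86.
void writeJmpRel32(std::uint8_t* out, std::uint32_t from, std::uint32_t to)
{
    assert(out != nullptr);
    // The CPU adds the displacement to the address of the NEXT instruction,
    // i.e. from + 5. Unsigned arithmetic wraps, so backward jumps work too.
    const std::uint32_t rel = to - (from + 5u);
    out[0] = 0xE9; // opcode for 'jmp rel32'
    for (int i = 0; i < 4; ++i) {
        // x86 stores the displacement least significant byte first
        out[1 + i] = static_cast<std::uint8_t>(rel >> (8 * i));
    }
}
```

The 'call' opcode (0xE8) uses the same displacement rule, so the same arithmetic applies when filling in a call target.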
This is much easier than it might seem, because I target only one platform and compiler: GCC on 32-bit Linux (and only one calling convention, cdecl). I'm trying to create a dynamic assembly function that redirects calls from triggers, so I can use class methods as callbacks (even with C API libraries, as long as they use cdecl). I only need this to support pointers and native types (char, int, short, etc.).
ANYTHING MyRedirect(ANY AMOUNT ARGUMENTS)
{
return MyClassFunc('this', ANY AMOUNT ARGUMENTS);
}
The function above is the one I want to create in pure machine code (in memory, from C++). Since the function is very simple, its ASM is simple as well (depending on the arguments).
55 push %ebp
89 e5 mov %esp,%ebp
83 ec 04 sub $0x4,%esp
8b 45 08 mov 0x8(%ebp),%eax
89 04 24 mov %eax,(%esp)
e8 00 00 00 00 call <address>
c9 leave
c3 ret
So in my program, I have created an ASM pattern generator (since I don't know ASM especially well, I search for patterns). This function generates the machine code bytes for exactly the case above, i.e. a function that redirects and returns, given the number of arguments the function needs. This is a snippet from my C++ code:
std::vector<byte> detourFunc;
detourFunc.reserve(10 + stackSize); // Base is 10 bytes + argument size; reserve only, so push_back starts at offset 0
// This becomes 'push %ebp; mov %esp, %ebp'
detourFunc.push_back(0x55); // push %ebp
detourFunc.push_back(0x89); // mov
detourFunc.push_back(0xE5); // %esp, %ebp
// Check for arguments
if(stackSize != 0)
{
detourFunc.push_back(0x83); // sub
detourFunc.push_back(0xEC); // %esp
detourFunc.push_back(stackSize); // stack size required
// If there are arguments, we want to push them
// in the opposite direction (cdecl convention)
for(int i = (argumentCount - 1); i >= 0; i--)
{
// This is what I'm trying to implement
// ...
}
// Check if we need to add 'this'
if(m_callbackClassPtr)
{
}
}
// This is our call operator
detourFunc.push_back(0xE8); // call
// All nop, this will be replaced by an address
detourFunc.push_back(0x90); // nop
detourFunc.push_back(0x90); // nop
detourFunc.push_back(0x90); // nop
detourFunc.push_back(0x90); // nop
if(stackSize == 0)
{
// In case of no arguments, just 'pop'
detourFunc.push_back(0x5D); // pop %ebp
}
else
{
// Use 'leave' if we have arguments
detourFunc.push_back(0xC9); // leave
}
// Return function
detourFunc.push_back(0xC3); // ret
If I specify zero as the stackSize, this will be the output:
55 push %ebp
89 e5 mov %esp,%ebp
e8 90 90 90 90 call <address>
5d pop %ebp
c3 ret
As you can see, this is completely valid 32-bit ASM, and it will act as 'MyRedirect' would if it had zero arguments and no need for a 'this' pointer. The problem is that I want to implement the part that generates the ASM code depending on the number of arguments the 'redirect' function will receive. I have successfully worked out the pattern in this little C++ program:
#include <stdio.h>
#include <stdlib.h>
int main(int argc, char * argv[])
{
int val = atoi(argv[1]);
printf("\tpush %%ebp\n");
printf("\tmov %%esp,%%ebp\n");
if(val == 0)
{
printf("\tcall <address>\n");
printf("\tpop %%ebp\n");
}
else
{
printf("\tsub $0x%x,%%esp\n", val * sizeof(int));
for(int i = val; i > 0; i--)
{
printf("\tmov 0x%x(%%ebp),%%eax\n", i * sizeof(int) + sizeof(int));
printf("\tmov %%eax,0x%x(%%esp)\n", i * sizeof(int) - sizeof(int));
}
printf("\tcall <address>\n");
printf("\tleave\n");
}
printf("\tret\n");
return 0;
}
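The printed pattern can be turned into bytes for the loop I left open in the generator. This is only a sketch, assuming every argument occupies exactly one 4-byte stack slot; the helper name emitArgCopies and the reservedSlots parameter are hypothetical. For a single argument it produces the same '8b 45 08' and '89 04 24' bytes as the dump near the top.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical helper (not from the original program): emits the mov pairs
// that copy `argCount` 4-byte arguments from the caller's frame into the
// outgoing argument area. `reservedSlots` is the number of leading slots to
// leave free, e.g. 1 when 'this' will be stored at (%esp) separately.
std::vector<std::uint8_t> emitArgCopies(int argCount, int reservedSlots)
{
    // Both displacements are encoded as signed 8-bit values (disp8), so the
    // largest offset used, 4 * (argCount + 1), must not exceed 127.
    assert(argCount >= 0 && argCount <= 30);
    assert(reservedSlots == 0 || reservedSlots == 1);

    std::vector<std::uint8_t> code;
    for (int i = argCount - 1; i >= 0; --i) {
        const int src = 8 + 4 * i;               // incoming argument i at src(%ebp)
        const int dst = 4 * (i + reservedSlots); // outgoing slot at dst(%esp)
        code.push_back(0x8B);                    // mov src(%ebp),%eax
        code.push_back(0x45);
        code.push_back(static_cast<std::uint8_t>(src));
        code.push_back(0x89);                    // mov %eax,dst(%esp)
        if (dst == 0) {
            code.push_back(0x04);                // ModRM: no displacement
            code.push_back(0x24);                // SIB: base = %esp
        } else {
            code.push_back(0x44);                // ModRM: disp8 follows the SIB byte
            code.push_back(0x24);                // SIB: base = %esp
            code.push_back(static_cast<std::uint8_t>(dst));
        }
    }
    return code;
}
```

With these encodings the matching 'sub' would need 4 * (argCount + reservedSlots) bytes, and since '83 ec' also takes a signed 8-bit immediate, the same upper limit on the argument count applies there.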
My pattern-printing program prints exactly the same pattern as the ASM code shown by 'objdump'. So my question is: will this be valid in all cases if I only want a redirect function like the one above, no matter the arguments, as long as it only runs under 32-bit Linux, or are there any pitfalls I need to know about? For example, would the generated ASM be different with 'shorts' or 'chars', or will this work as is (I've only tested with integers)? And what if I call a function that returns 'void' (how would that affect the ASM)?
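To make the 'shorts' and 'chars' part of the question concrete: as far as I understand the i386 System V cdecl convention, each argument takes a stack slot rounded up to a multiple of 4 bytes. Under that assumption, the stack size could be computed as in the sketch below. The helper name stackBytesFor is hypothetical, and the sizes are passed in explicitly rather than taken from sizeof, so the result does not depend on the machine that compiles it.

```cpp
#include <cassert>
#include <cstddef>
#include <initializer_list>

// Hypothetical helper: bytes of stack needed for arguments of the given
// sizes, ASSUMING each argument is rounded up to a multiple of 4 bytes
// (my reading of the i386 cdecl convention).
std::size_t stackBytesFor(std::initializer_list<std::size_t> argSizes)
{
    std::size_t total = 0;
    for (std::size_t size : argSizes) {
        assert(size > 0);
        // Round up to the next multiple of 4: 1 -> 4, 2 -> 4, 4 -> 4, 8 -> 8
        total += (size + 3) & ~static_cast<std::size_t>(3);
    }
    return total;
}
```

If that assumption holds, a char, a short and an int each take one 4-byte slot, so a word-by-word copy would treat them identically, and only 8-byte types would count as two slots.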
I might have explained everything a bit fuzzily, so please ask rather than guess if anything is unclear :)
NOTE: I am not looking for alternatives. I enjoy my current implementation and think it's a very interesting one; I would just highly appreciate your help on the subject.
EDIT: In case of interest, here are some dumps for the above C++ code: link