0

I have run into a problem. My project works at the assembly level, but it is too large to write entirely in assembly. So I decided to write the code in C, compile it with gcc, and extract the resulting machine code (hex codes). But how? How can I get the hex codes of a particular function using gcc?

I have an idea:

int function_name(){
    int a=1;
    return a;
}
write(fd, (char *)function_name, sizeof(function_name))

After doing this I would have the hex codes of function_name. But this is not a good way to solve the problem: it would force me to handle many files when I need many functions as targets.

Is there a better way to solve this problem? The ideal solution, I think, would need only the function name (and an output file name if needed) and would work on the command line. Is such a solution impossible?

Also, I assume the compiler's optimization options are off, so the hex codes I would get from function_name are '\x55\x8B\xEC\x83\xEC\x04\xC7\x45\xFC\x01\x00\x00\x00\x8B\x45\xFC\x8B\xE5\x5D\xC3'. The assembly code of function_name is below.

PUSH EBP
MOV EBP, ESP
SUB ESP, 4
MOV DWORD PTR[EBP-4], 1
MOV EAX, DWORD PTR[EBP-4]
MOV ESP, EBP
POP EBP
RETN
pyutic

3 Answers

2

gcc produces an assembly file from every source file as part of the compilation toolchain. That file is usually temporary, and as such is immediately deleted. If you want it to be saved as myfile.s, you can use this command:

gcc -S -o myfile.s myfile.c
lodo
  • Oh, I'm sorry, I made a mistake in my question. By assembly code I meant the binary code (opcodes). Really sorry, and thanks for your answer. – pyutic Mar 30 '15 at 07:45
  • @pyutic Your compiler already produces machine code. It's unclear what you mean by "get function assembly code". You can find the start and end file offsets of a function with objdump or readelf, and view the executable/object file in a hex editor if you want, but perhaps that's not what you actually want. – nos Mar 30 '15 at 09:13
  • Actually, better try `gcc -O -fverbose-asm -S myfile.c` – Basile Starynkevitch Mar 30 '15 at 09:21
1

Try

objdump -D -Mintel yourfile.o

The dump will look like this (this .o was generated by the Free Pascal compiler, but gcc output will be mostly the same):

   0:   55                      push   ebp
   1:   89 e5                   mov    ebp,esp
   3:   8d 64 24 ec             lea    esp,[esp-0x14]
   7:   53                      push   ebx
   8:   89 45 fc                mov    DWORD PTR [ebp-0x4],eax
   b:   c7 45 f4 00 00 00 00    mov    DWORD PTR [ebp-0xc],0x0
  12:   31 c0                   xor    eax,eax
  14:   68 00 00 00 00          push   0x0
  19:   55                      push   ebp
  1a:   68 00 00 00 00          push   0x0
  1f:   64 ff 30                push   DWORD PTR fs:[eax]
  22:   64 89 20                mov    DWORD PTR fs:[eax],esp
  25:   c7 45 f8 00 00 00 00    mov    DWORD PTR [ebp-0x8],0x0
  2c:   8b 45 fc                mov    eax,DWORD PTR [ebp-0x4]
  2f:   8b 50 04                mov    edx,DWORD PTR [eax+0x4] 

The code is still relocatable (not yet linked), so references to other symbols show up as zero bytes.

To resolve them, run objdump on the linked binary (.exe) instead of the .o, but that output is usually huge, which makes finding a specific location harder.

Marco van de Voort
0

I guess you are on Linux, since you are using gcc.

You might run

 gcc -Wall -O -c myfile.c

to get an object file myfile.o from your C source file myfile.c. That object file is in ELF format, so it contains, notably, the binary code and relocation entries. You could parse that ELF object file (e.g. with commands like objdump(1) or readelf, or through a library like libelf or libbfd).
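As a rough illustration of what parsing the ELF file involves, here is a minimal sketch that looks a function up in the symbol table of an ELF image already loaded into memory. The function name find_function_bytes is hypothetical. The sketch assumes a 64-bit ELF file (a 32-bit x86 object like the one in the question would need the Elf32_* types instead), a Linux system that provides <elf.h>, a suitably aligned buffer (e.g. from malloc or mmap), and a well-formed, unstripped file; it does only minimal bounds checking.

```c
#include <elf.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: return a pointer to the machine code of function
 * `name` inside an in-memory ELF64 image and store its length in *size.
 * Returns NULL if the image is not ELF64 or the symbol is not found.
 * Assumes `image` is aligned for Elf64_Ehdr and the file is well-formed. */
const unsigned char *find_function_bytes(const unsigned char *image, size_t len,
                                         const char *name, size_t *size)
{
    if (len < sizeof(Elf64_Ehdr) || memcmp(image, ELFMAG, SELFMAG) != 0 ||
        image[EI_CLASS] != ELFCLASS64)
        return NULL;

    const Elf64_Ehdr *eh = (const Elf64_Ehdr *)image;
    if (eh->e_shoff == 0 || eh->e_shoff > len ||
        eh->e_shentsize != sizeof(Elf64_Shdr) ||
        eh->e_shnum > (len - eh->e_shoff) / sizeof(Elf64_Shdr))
        return NULL;
    const Elf64_Shdr *sh = (const Elf64_Shdr *)(image + eh->e_shoff);

    for (size_t i = 0; i < eh->e_shnum; i++) {
        /* Only the static symbol table; sh_link names its string table. */
        if (sh[i].sh_type != SHT_SYMTAB || sh[i].sh_link >= eh->e_shnum)
            continue;
        const Elf64_Sym *sym = (const Elf64_Sym *)(image + sh[i].sh_offset);
        const char *strtab = (const char *)(image + sh[sh[i].sh_link].sh_offset);
        size_t count = sh[i].sh_size / sizeof(Elf64_Sym);

        for (size_t j = 0; j < count; j++) {
            if (ELF64_ST_TYPE(sym[j].st_info) != STT_FUNC)
                continue;
            if (sym[j].st_shndx == SHN_UNDEF || sym[j].st_shndx >= eh->e_shnum)
                continue;
            if (strcmp(strtab + sym[j].st_name, name) != 0)
                continue;

            /* In a .o file sh_addr is 0, so st_value is a section offset;
             * in a linked file st_value is an address inside the section. */
            const Elf64_Shdr *code = &sh[sym[j].st_shndx];
            size_t off = code->sh_offset + (sym[j].st_value - code->sh_addr);
            if (off > len || sym[j].st_size > len - off)
                return NULL;
            *size = sym[j].st_size;
            return image + off;
        }
    }
    return NULL;
}
```

To use it, one would read myfile.o into a buffer and call find_function_bytes(buf, len, "function_name", &size). In a .o file any bytes covered by relocations are still placeholders.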

Alternatively, work only with ELF shared objects containing position-independent code and use dlopen(3). See the Program Library HOWTO.

Notice that not every source-level C function corresponds to a function (e.g. an ELF symbol) inside the object file: the names of static functions can be forgotten or stripped, and inline functions may have no machine code of their own because it has been inlined into the caller. Assume an optimizing compiler (e.g. gcc -O2).

Remember that decompilation is an impossible task in general. Be aware of the halting problem which is undecidable.

Look also at this question and the answers about libopcode

BTW,

write(fd, (char *)function_name, sizeof(function_name))

is not valid standard C (you cannot apply sizeof to a function; gcc accepts it only as a GNU extension and yields 1, not the length of the function's code). Maybe you would do

write(fd, (char*)function_name, sizeof(char*))

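which only writes the first sizeof(char*) bytes found at the function's address, not the whole function, so it might not mean much (also be aware that the address itself changes between runs under ASLR).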
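Once the start and length of a function's code are known from some other source (for example, the symbol size recorded in the object file), turning the bytes into the '\x..' form used in the question is straightforward. The helper below is a minimal sketch; the name format_hex_escape is hypothetical and not part of gcc or binutils.

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical helper: write bytes[0..n) into out as "\x55\x8B..."
 * (uppercase hex, four characters per byte, NUL-terminated).
 * Returns the number of characters written, excluding the NUL,
 * or -1 if out_size is too small to hold the result. */
int format_hex_escape(const unsigned char *bytes, size_t n,
                      char *out, size_t out_size)
{
    if (out_size < n * 4 + 1)
        return -1;
    for (size_t i = 0; i < n; i++)
        snprintf(out + i * 4, 5, "\\x%02X", bytes[i]);
    out[n * 4] = '\0';
    return (int)(n * 4);
}
```

Called on the 20 bytes of the question's function_name, it reproduces the string quoted there.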

Maybe you want dladdr(3)? You probably would need to compile your program with the -rdynamic option passed at linking time.

Basile Starynkevitch