I'm writing a small program that traces another one and lists the function calls it makes (near calls, far calls, dynamically linked calls, etc.).
The goal is to generate a call graph with dotty.
I'm currently struggling to decode the opcodes returned by ptrace(PTRACE_PEEKTEXT).
I've read the official Intel documentation (http://www.intel.com/content/www/us/en/architecture-and-technology/64-ia-32-architectures-software-developer-instruction-set-reference-manual-325383.html)
and a dozen other manuals.
Here are the points I don't get:
1) When I search for the 0xE8 opcode in the words returned by ptrace(PTRACE_PEEKTEXT), I find a lot of matches, too many I believe. For instance, "objdump -d | grep call" gives me:
400450: e8 4b 00 00 00 callq 4004a0 <__gmon_start__@plt>
4004d4: e8 b7 ff ff ff callq 400490 <__libc_start_main@plt>
40055d: e8 7e ff ff ff callq 4004e0 <deregister_tm_clones>
40058d: ff d0 callq *%rax
4005ae: e8 cd fe ff ff callq 400480 <printf@plt>
4005b3: e8 b8 fe ff ff callq 400470 <strlen@plt>
4005b8: e8 c3 fe ff ff callq 400480 <printf@plt>
4005c9: 0f 05 syscall
4005fe: e8 3d fe ff ff callq 400440 <_init>
400619: 41 ff 14 dc callq *(%r12,%rbx,8)
So my program should list the same number of calls, right? Instead it finds hundreds. It finds the right ones (those listed above) but many other things too. So I suspect the extra matches are not actually call opcodes but other bytes, such as a prefix, that merely have the value 0xE8. Is that right? If so, how can I tell a real 0xE8 opcode apart from a prefix (or any other byte) that happens to have the value 0xE8?
2) When the offset following the 0xE8 opcode is negative, the only solution I have found is to memset a buffer with 0xff and then copy the offset bytes into it, to force the sign bit to 1. Is there another way?
3) Could someone explain once more what the ModR/M byte is and how to decode it?
4) Also, how should I interpret a REX prefix with the value 0x41? How can I know whether 0x41 0xE8 is a call with a prefix rather than part of another instruction whose bytes happen to have those values?
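To make question 2 concrete, here is a minimal sketch of one alternative to the memset trick, assuming a little-endian host (which holds on x86-64) and assuming the five instruction bytes have already been copied into a buffer. The names read_rel32 and call_target are hypothetical, chosen only for illustration. The four bytes after 0xE8 are copied straight into an int32_t, so the sign is already correct, and the target is the address of the next instruction plus that displacement:

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Read the 4-byte displacement that follows an E8 opcode as a signed value.
 * Assumption: little-endian host, so copying the bytes into an int32_t
 * gives the right value with its sign already in place. */
static int32_t read_rel32(const uint8_t *p)
{
    int32_t rel;
    assert(p != NULL);
    memcpy(&rel, p, sizeof rel);
    return rel;
}

/* Target of a near relative call: address of the next instruction
 * (call address + 5 bytes) plus the signed displacement. */
static uint64_t call_target(uint64_t call_addr, const uint8_t *insn)
{
    assert(insn != NULL && insn[0] == 0xE8);
    return call_addr + 5 + (uint64_t)(int64_t)read_rel32(insn + 1);
}
```

Checked against the listing above: the bytes e8 b7 ff ff ff at 0x4004d4 give a displacement of -73, and 0x4004d4 + 5 - 73 = 0x400490, which matches the __libc_start_main@plt line.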
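To make question 4 concrete, here is a minimal sketch of the REX layout as the Intel manual describes it for 64-bit mode: bytes 0x40 to 0x4F have the form 0100WRXB. The names are hypothetical, and the sketch assumes the byte is read at the start of an instruction; the same value found in the middle of an instruction is just data.

```c
#include <assert.h>
#include <stdint.h>

/* The four flag bits of a REX prefix (names are illustrative). */
struct rex {
    uint8_t w; /* 1 = 64-bit operand size */
    uint8_t r; /* extends the ModR/M reg field */
    uint8_t x; /* extends the SIB index field */
    uint8_t b; /* extends the ModR/M r/m or SIB base field */
};

/* In 64-bit mode, 0x40..0x4F are REX prefixes (upper nibble 0100). */
static int is_rex(uint8_t byte)
{
    return (byte & 0xF0) == 0x40;
}

static struct rex decode_rex(uint8_t byte)
{
    struct rex r;
    assert(is_rex(byte));
    r.w = (uint8_t)((byte >> 3) & 1);
    r.r = (uint8_t)((byte >> 2) & 1);
    r.x = (uint8_t)((byte >> 1) & 1);
    r.b = (uint8_t)(byte & 1);
    return r;
}
```

For 0x41 only the B bit is set. In "41 ff 14 dc" from the listing, that bit extends the SIB base field from 4 to 12, which is why objdump prints r12 as the base register.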
Sorry if I'm unclear. These topics are interesting, but I'm kind of lost in them.