
I'm writing a small program that traces another one and lists its function calls (near calls, far calls, dynamically linked calls, etc.).

The goal is to generate a call graph using dotty.

I'm currently struggling to decode the opcodes returned by ptrace(PTRACE_PEEKTEXT).

I've read the official Intel documentation (http://www.intel.com/content/www/us/en/architecture-and-technology/64-ia-32-architectures-software-developer-instruction-set-reference-manual-325383.html) and a dozen other manuals.

Here are the points I don't get:

1) When I search for the 0xE8 opcode in the word returned by ptrace(PTRACE_PEEKTEXT), I find a lot of matches, too many I believe. For instance, "objdump -d | grep call" gives me:

400450: e8 4b 00 00 00          callq  4004a0 <__gmon_start__@plt>
4004d4: e8 b7 ff ff ff          callq  400490 <__libc_start_main@plt>
40055d: e8 7e ff ff ff          callq  4004e0 <deregister_tm_clones>
40058d: ff d0                   callq  *%rax
4005ae: e8 cd fe ff ff          callq  400480 <printf@plt>
4005b3: e8 b8 fe ff ff          callq  400470 <strlen@plt>
4005b8: e8 c3 fe ff ff          callq  400480 <printf@plt>
4005c9: 0f 05                   syscall 
4005fe: e8 3d fe ff ff          callq  400440 <_init>
400619: 41 ff 14 dc             callq  *(%r12,%rbx,8)

So my program should list the same number of calls, right? Instead I find hundreds of them. I find the right ones, the ones listed above, but other things too. So I suspect the extra matches are not real 0xE8 opcodes. If that's the case, how can I tell a 0xE8 opcode apart from a prefix that happens to have the value 0xE8?

2) When the offset following the 0xE8 opcode is negative, the only solution I found is to memset a buffer with 0xff and then copy the offset into it, to force the sign bit to 1. Is there another way?

3) Could someone explain the ModR/M byte once more, and how to decode it?

4) Also, how should I interpret a REX prefix of value 0x41? How can I know whether 0x41 0xE8 is a call with a prefix rather than another opcode followed by a random value?
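
For point 2), one possible alternative to the memset trick is to copy the four displacement bytes into a signed 32-bit integer and let the compiler sign-extend it. The sketch below is only an illustration: `call_rel32_target` is a hypothetical helper name, and it assumes the tracer itself runs on a little-endian machine (as x86 is), so a plain memcpy yields the right value.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical helper (not from the original post): given the address of
 * an E8 opcode and the 4 displacement bytes that follow it, return the
 * absolute call target. Assumes a little-endian host. */
static uint64_t call_rel32_target(uint64_t insn_addr, const uint8_t rel_bytes[4])
{
    int32_t rel;  /* signed type: 0xffffffb7 is read as -73 */

    memcpy(&rel, rel_bytes, sizeof rel);

    /* The displacement is relative to the NEXT instruction, i.e. the
     * address of the opcode plus 5 (1 opcode byte + 4 displacement bytes).
     * Casting to int64_t sign-extends the negative case. */
    return insn_addr + 5 + (uint64_t)(int64_t)rel;
}
```

With the listing above, `e8 b7 ff ff ff` at 0x4004d4 gives 0x4004d9 - 0x49 = 0x400490, which matches the `__libc_start_main@plt` address printed by objdump.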

Sorry if I'm unclear; these topics are interesting, but I'm kind of lost in them.
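
For points 3) and 4), the bit fields can be pulled apart with shifts and masks. The sketch below follows the field layout given in the Intel manual (ModR/M = mod:2, reg:3, rm:3; SIB = scale:2, index:3, base:3; REX = 0100WRXB). The struct and function names are made up for illustration, and `is_rex` only makes sense in 64-bit mode and only for a byte sitting directly in front of the opcode.

```c
#include <stdint.h>

/* All names here are illustrative, not from any library. */
struct modrm { unsigned mod, reg, rm; };
struct sib   { unsigned scale, index, base; };
struct rex   { unsigned w, r, x, b; };

/* In 64-bit mode, 0x40..0x4F immediately before the opcode is a REX prefix. */
static int is_rex(uint8_t byte) { return (byte & 0xF0) == 0x40; }

static struct rex decode_rex(uint8_t byte)
{
    struct rex r = { (byte >> 3) & 1, (byte >> 2) & 1, (byte >> 1) & 1, byte & 1 };
    return r;
}

static struct modrm decode_modrm(uint8_t byte)
{
    /* mod == 3: rm names a register; otherwise rm describes a memory
     * operand, and rm == 4 means a SIB byte follows. */
    struct modrm m = { byte >> 6, (byte >> 3) & 7, byte & 7 };
    return m;
}

static struct sib decode_sib(uint8_t byte)
{
    /* Effective scale factor is 1 << scale. */
    struct sib s = { byte >> 6, (byte >> 3) & 7, byte & 7 };
    return s;
}

/* REX.R, REX.X and REX.B each supply the fourth (high) bit of the
 * reg, index and base/rm fields respectively. */
static unsigned extend_field(unsigned rex_bit, unsigned field)
{
    return (rex_bit << 3) | field;
}
```

Applied to `41 ff 14 dc` from the listing: REX 0x41 sets only B; ModR/M 0x14 gives mod=0, reg=2, rm=4 (opcode FF with reg=2 is an indirect near call, and rm=4 announces a SIB byte); SIB 0xdc gives scale=3 (factor 8), index=3 (rbx), base=4, which REX.B extends to 12 (r12). That is exactly `callq *(%r12,%rbx,8)`.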

Jean Jung
Thibaud Auzou
  • Sorry, I haven't quite understood what is happening here. Are you ptracing a process and reading its whole code section to find the call instructions? Are you doing this by looking for the opcodes of the (various forms of) calls? Are you single-stepping the program? Because you can't look for opcode signatures inside a code section without writing a recursive descent disassembler. –  Jul 09 '15 at 21:14
  • Not every `0xE8` byte is a `call` instruction; you need to make sure it is the opcode byte. For example, `mov al, 0xE8` is an entirely different instruction which happens to have machine code `0xB0 0xE8`, so it does have a `0xE8` in it, but that's not the opcode, it's the operand. – Jester Jul 09 '15 at 21:44
  • Thanks for your answers. I use ptrace single-stepping on a binary that I launched myself (execve) or on a program already launched (ptrace PTRACE_TRACEME). After every ptrace(PTRACE_SINGLESTEP) I call ptrace(PTRACE_PEEKTEXT) and perform some bitwise operations to detect an opcode in the first byte of the word returned by ptrace. But, as you said, sometimes it comes back positive although it's not the 0xE8 opcode but random data. Isn't the first byte of the PTRACE_PEEKTEXT word always an opcode? – Thibaud Auzou Jul 09 '15 at 21:59
  • If you supply the current `RIP` as the address, it should be the start of the instruction, yes. If that is a `0xE8` then it should be a `call`. I don't understand your question `2)`. For `3)` read chapter 2.1.5 of the manual. For `4)` the `call` doesn't need a REX prefix (although it may be legal). – Jester Jul 10 '15 at 00:00
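
The comments above boil down to this: only inspect bytes that start at `RIP`, skip any prefixes, and then look at the opcode. A rough sketch of that check is shown below. The names `classify_call` and `enum call_kind` are invented for illustration, the buffer is assumed to hold the bytes read at `RIP` in memory order (on x86-64 the byte at the requested address is the least significant byte of the long returned by PTRACE_PEEKTEXT), and 64-bit mode is assumed. It deliberately ignores corner cases such as how a 0x66 prefix affects `E8`.

```c
#include <stddef.h>
#include <stdint.h>

enum call_kind { NOT_CALL, CALL_REL32, CALL_INDIRECT, CALL_FAR_INDIRECT };

/* Legacy prefixes that may precede the opcode. */
static int is_legacy_prefix(uint8_t b)
{
    switch (b) {
    case 0xF0: case 0xF2: case 0xF3:            /* lock, repne, rep      */
    case 0x26: case 0x2E: case 0x36: case 0x3E: /* es, cs, ss, ds        */
    case 0x64: case 0x65:                       /* fs, gs                */
    case 0x66: case 0x67:                       /* operand/address size  */
        return 1;
    default:
        return 0;
    }
}

/* code must point at the first byte of an instruction (i.e. at RIP). */
static enum call_kind classify_call(const uint8_t *code, size_t len)
{
    size_t i = 0;

    while (i < len && is_legacy_prefix(code[i]))
        i++;
    if (i < len && (code[i] & 0xF0) == 0x40)    /* optional REX, 64-bit mode */
        i++;
    if (i >= len)
        return NOT_CALL;

    if (code[i] == 0xE8)                        /* call rel32 */
        return CALL_REL32;

    if (code[i] == 0xFF && i + 1 < len) {       /* FF /2 and FF /3 */
        unsigned reg = (code[i + 1] >> 3) & 7;  /* ModR/M reg field */
        if (reg == 2)
            return CALL_INDIRECT;
        if (reg == 3)
            return CALL_FAR_INDIRECT;
    }
    return NOT_CALL;
}
```

With this approach `b0 e8` (`mov al, 0xE8`) is rejected because the opcode byte is 0xB0, while `ff d0` and `41 ff 14 dc` from the objdump listing are both reported as indirect calls. Since an instruction can be up to 15 bytes long, a tracer would probably need to read two words at `RIP` before calling such a function.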
