While developing my BSc thesis (a rootkit for the Linux 5.4 kernel), I need to find the address of a function, namely do_syscall_64(), in memory. I can't know it in advance because of KASLR. Here is what I'm doing:

  • retrieve the address of the system call handler via MSRs;
  • scan memory starting from the base address of entry_SYSCALL_64, which is the system call handler's code block, until I find the actual call to do_syscall_64();
  • isolate the 4 bytes after the opcode (i.e., e8), which encode the offset used to reach the call target:
e8 c4 bd f8 ff  call   0xffffffff81b8be40 <do_syscall_64>
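
The scanning step in the list above can be sketched roughly as follows. This is only a minimal user-space sketch in C that works on a plain byte buffer; the name `find_call_rel32` is hypothetical and not from the question. It also assumes the first `e8` byte found really is the opcode of the call: a plain byte search can just as well hit an `e8` that belongs to another instruction's operand, so a real scan would have to validate the match.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper: return the index of the first 0xE8 byte that still
 * has 4 operand bytes after it inside the buffer, or -1 if there is none.
 * Assumption: the first 0xE8 found is a real `call rel32` opcode.
 */
static ptrdiff_t find_call_rel32(const uint8_t *code, size_t len)
{
    /* i + 5 <= len keeps the whole 5-byte instruction inside the buffer */
    for (size_t i = 0; i + 5 <= len; i++) {
        if (code[i] == 0xE8)
            return (ptrdiff_t)i;
    }
    return -1;
}
```

The returned index, added to the address the buffer was read from, gives the address of the `e8` opcode itself.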

So, what should I do with the hex offset I retrieved?
I read that the operand following the call opcode is an offset from the base of the code segment. Do I need to convert the offset to decimal and add it to the code segment's base address?
Thanks in advance.

migliio
  • This seems to be a question about understanding machine code. The linux kernel is irrelevant to the question, is that right? – user253751 Mar 17 '21 at 11:28
  • For analysing assembler instructions it would greatly help to know the CPU architecture. – Gerhardh Mar 17 '21 at 11:35
  • @user253751 yes, you're right – migliio Mar 17 '21 at 12:09
  • If I understand you correctly, you want to get the absolute address from this? https://www.felixcloutier.com/x86/call <- E8 + 32 bit x -> x is sign extended to 64-bits and added to the address where the instruction is. –  Mar 17 '21 at 12:17
  • The signed 32-bit offset needs to be sign-extended to 64 bits and added to the address after the current instruction, so the called function will be at the address of the e8 opcode plus 5 plus the sign-extended offset. – Ian Abbott Mar 17 '21 at 13:41
  • Yup, the 4-byte rel32 offset is a native (little) endian `int32_t` relative to the *end* of the 5-byte `call rel32` instruction, like in [How does $ work in NASM, exactly?](https://stackoverflow.com/q/47494744) (manually encoding a `call rel32` with NASM). So yeah, like @IanAbbott said, `call_addr + 5 + rel32`, where call_addr is a `char*` or `uintptr_t`. If you compile your code with standard kernel options (including `-fno-strict-aliasing`), you can just load it like `int32_t rel32 = *(int32_t*)( e8_call_addr + 1 );`. – Peter Cordes Mar 17 '21 at 14:43
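
To make the `call_addr + 5 + rel32` rule from the comments concrete, here is a minimal sketch in C. It is an illustration under stated assumptions, not kernel code: the function name `call_rel32_target` is hypothetical, it takes the instruction's address as a plain `uint64_t` so it can be tried in user space, and it assumes a little-endian host (as on x86). It uses `memcpy` instead of the pointer cast shown in the comment, so it does not depend on `-fno-strict-aliasing`.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical helper: compute the target of a 5-byte `call rel32`.
 *   insn      - pointer to the 5 instruction bytes (e8 + 4 operand bytes)
 *   insn_addr - address at which the e8 opcode is located
 * Returns 0 if the first byte is not the e8 opcode.
 * Assumption: little-endian host, so the operand bytes can be copied as-is.
 */
static uint64_t call_rel32_target(const uint8_t *insn, uint64_t insn_addr)
{
    int32_t rel32;

    assert(insn != NULL);
    if (insn[0] != 0xE8)
        return 0;

    /* Copy the 4 operand bytes into a signed 32-bit integer. */
    memcpy(&rel32, insn + 1, sizeof rel32);

    /*
     * Sign-extend to 64 bits and add to the address of the NEXT
     * instruction, i.e. the address of the e8 opcode plus 5.
     */
    return insn_addr + 5 + (uint64_t)(int64_t)rel32;
}
```

With the bytes from the question (`e8 c4 bd f8 ff`), rel32 is 0xfff8bdc4, which is -0x7423c as a signed value. Working backwards from the disassembled target 0xffffffff81b8be40, the call would sit at 0xffffffff81c00077 (this address is derived from the arithmetic, not stated in the question), and `call_rel32_target(bytes, 0xffffffff81c00077)` gives back 0xffffffff81b8be40. No conversion to decimal and no code segment base is involved.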
