Hi, I am new to assembly language and am confused by the syntax of an lea instruction I came across while studying a piece of code (generated with the gdb command disassemble main):

lea    0xa8e96(%rip),%rsi        # 0x4aa5df

The syntax I have seen for lea is

lea src, dest 

But there appears to be an additional immediate value (# 0x4aa5df) following the %rsi register. How should I interpret this correctly?

Edit: I have checked the value stored in the %rip register, which is

(gdb) p /x $rip 
$1 = 0x401730

Adding this to 0xa8e96 gives 0x4aa5c6, which does not match 0x4aa5df. Am I missing something here?

Peter Cordes
  • 1
    That is just a comment by your friendly disassembler. It tells you what the calculated address is (`rip + 0xa8e96 = 0x4aa5df`). It's not part of the instruction. – Jester Jun 03 '20 at 17:24
  • Hi, thank you for the reply. I checked the value stored in the %rip register, but unfortunately the result does not match and I am not sure why. – Zhezhong Jiang Jun 03 '20 at 17:52
  • 3
    You want to use the address of the `lea` instruction for what the processor will take as `%rip` during that instruction's execution. You don't want any old value in `%rip`, as `%rip` changes at every instruction -- it would only make sense if you are currently stopped at the `lea`, in which case the value will be the address of the `lea` instruction. – Erik Eidt Jun 03 '20 at 17:56
  • Just in general for x86, the operands can get rather fancy with respect to calculating an address. The general syntax is `base(rb, re, n)` (where you can leave out some of these fields if they aren't needed), where the calculated address is `base + rb + n * re`, where `base` and `n` are immediate values and `rb` and `re` are values from those registers. – Unn Jun 03 '20 at 18:20
  • 5
    @ErikEidt actually `rip` already points past the `lea` so the address of the next instruction should be used. – Jester Jun 03 '20 at 18:51

1 Answer

Thanks for the help from Jester, Unn and Erik. The original C code I used is:

#include <stdio.h>

int main(int argc, char** argv)
{    
    int ret = printf("%s\n", argv[argc-1]);
    argv[0] = '\0'; // NOOP to force gcc to generate a callq instead of jmp
    return ret;
}

And the assembler code shown by gdb is:

(gdb) disassemble main
Dump of assembler code for function main:
=> 0x0000000000401730 <+0>:     endbr64
   0x0000000000401734 <+4>:     push   %rbx
   0x0000000000401735 <+5>:     movslq %edi,%rdi
   0x0000000000401738 <+8>:     mov    %rsi,%rbx
   0x000000000040173b <+11>:    xor    %eax,%eax
   0x000000000040173d <+13>:    mov    -0x8(%rsi,%rdi,8),%rdx
   0x0000000000401742 <+18>:    lea    0xa8e96(%rip),%rsi        # 0x4aa5df
   0x0000000000401749 <+25>:    mov    $0x1,%edi
   0x000000000040174e <+30>:    callq  0x44bbe0 <__printf_chk>
   0x0000000000401753 <+35>:    movq   $0x0,(%rbx)
   0x000000000040175a <+42>:    pop    %rbx
   0x000000000040175b <+43>:    retq
End of assembler dump.

So %rip does point past the lea instruction: the address to use in the computation is 0x401749, the address of the next instruction (the mov at <+25>). Adding 0xa8e96 to it gives 0x4aa5df, the address shown in the comment.