Am basically learning how to make my own instruction in the X86 architecture, but to do that I am understanding how they are decoded and and interpreted to a low level language,
By taking an example of a simple mov
instruction and using the .byte
notation I wanted to understand in detail as to how instructions are decoded,
My simple code is as follows:
#include <stdio.h>
#include <iostream>
int main(int argc, char const *argv[])
{
int x{5};
int y{0};
// mov %%eax, %0
asm (".byte 0x8b,0x45,0xf8\n\t" //mov %1, eax
".byte 0x89, 0xC0\n\t"
: "=r" (y)
: "r" (x)
);
printf ("dst value : %d\n", y);
return 0;
}
and when I use objdump
to analyze how it is broken down to machine language, i get the following output:
000000000000078a <main>:
78a: 55 push %ebp
78b: 48 dec %eax
78c: 89 e5 mov %esp,%ebp
78e: 48 dec %eax
78f: 83 ec 20 sub $0x20,%esp
792: 89 7d ec mov %edi,-0x14(%ebp)
795: 48 dec %eax
796: 89 75 e0 mov %esi,-0x20(%ebp)
799: c7 45 f8 05 00 00 00 movl $0x5,-0x8(%ebp)
7a0: c7 45 fc 00 00 00 00 movl $0x0,-0x4(%ebp)
7a7: 8b 45 f8 mov -0x8(%ebp),%eax
7aa: 8b 45 f8 mov -0x8(%ebp),%eax
7ad: 89 c0 mov %eax,%eax
7af: 89 45 fc mov %eax,-0x4(%ebp)
7b2: 8b 45 fc mov -0x4(%ebp),%eax
7b5: 89 c6 mov %eax,%esi
7b7: 48 dec %eax
7b8: 8d 3d f7 00 00 00 lea 0xf7,%edi
7be: b8 00 00 00 00 mov $0x0,%eax
7c3: e8 78 fe ff ff call 640 <printf@plt>
7c8: b8 00 00 00 00 mov $0x0,%eax
7cd: c9 leave
7ce: c3 ret
With regard to this output of objdump
why is the instruction 7aa: 8b 45 f8 mov -0x8(%ebp),%eax
repeated twice, any reason behind it or am I doing something wrong while using the .byte
notation?