
I'm trying to understand how assembly relates to C code. I have the following C program, compiled to an object file only.

#include <stdio.h>
int main()
{
  int i = 10;
  int j = 22 + i;
  return 0;
}

I executed the following command:

objdump -S myprogram.o

The output of the above command is:

objdump -S testelf.o 

testelf.o:     file format elf32-i386


Disassembly of section .text:

00000000 <main>:
#include <stdio.h>

int main()
{
   0:   55                      push   %ebp
   1:   89 e5                   mov    %esp,%ebp
   3:   83 ec 10                sub    $0x10,%esp
  int i = 10;
   6:   c7 45 f8 0a 00 00 00    movl   $0xa,-0x8(%ebp)
  int j = 22 + i;
   d:   8b 45 f8                mov    -0x8(%ebp),%eax
  10:   83 c0 16                add    $0x16,%eax
  13:   89 45 fc                mov    %eax,-0x4(%ebp)

  return 0;
  16:   b8 00 00 00 00          mov    $0x0,%eax
}
  1b:   c9                      leave  
  1c:   c3                      ret  

What do the numbers before the mnemonics mean, e.g. "83 ec 10" before the "sub" instruction or "c7 45 f8 0a 00 00 00" before the "movl" instruction?

I'm using the following platform to compile this code:

$ lscpu 
Architecture:          i686
CPU op-mode(s):        32-bit
Byte Order:            Little Endian
CPU(s):                1
On-line CPU(s) list:   0
Thread(s) per core:    1
Core(s) per socket:    1
Socket(s):             1
Vendor ID:             GenuineIntel
BSalunke
  • Those are the machine code sequences that the assembly instructions were assembled into. See [Intel's manuals](http://www.intel.com/content/www/us/en/processors/architectures-software-developer-manuals.html) for a full instruction set reference and information on how to map instructions to machine code (and vice versa). – Michael Apr 02 '16 at 07:56
  • Note that, if you want to study the assembly output of a little test program like that, the compiler might optimize useless code away. In this case probably everything because j is never used. – Unimportant Apr 02 '16 at 08:03
  • Notice that this is [AT&T syntax](http://www.imada.sdu.dk/Courses/DM18/Litteratur/IntelnATT.htm), so it does not correspond directly with the syntax represented in Intel manuals. – Antti Haapala -- Слава Україні Apr 02 '16 at 08:03
  • Basically it is setup that is positioning the stack-pointer at the top of the stack. There are no `argc` or `argv` values, so you likely have 4-bits for argc, 8-bits for argv and a 4-bit align resulting in your `0x10` offset from the top of the stack. . – David C. Rankin Apr 02 '16 at 08:05
  • The opcodes you can find [here](http://ref.x86asm.net/coder32.html) – Antti Haapala -- Слава Україні Apr 02 '16 at 08:10
  • It is mentioned in the [man page for `objdump`](http://linux.die.net/man/1/objdump): "When disassembling instructions, print the instruction in hex as well as in symbolic form. **This is the default** except when --prefix-addresses is used." – Jongware Apr 02 '16 at 08:34

1 Answer

Those are the machine code bytes of each x86 instruction (the opcode plus its operand bytes), shown in hexadecimal. A detailed reference, other than the ones listed in the comments above, is available here.

For example, the c7 45 f8 0a 00 00 00 before movl $0xa,-0x8(%ebp) are the hexadecimal values of the bytes that encode that instruction. They tell the CPU to move the immediate value 10 decimal (as a 4-byte value) to the stack location 8 bytes below the address held in the frame base pointer %ebp. That is where the variable i from your C source code lives while main is running. The stack grows toward lower addresses, so the top of the stack is at a lower memory address than the bottom, and a negative offset from the base pointer moves toward the top of the stack.

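The first three bytes, c7 45 f8, select the operation and the destination. c7 is the opcode for a mov of a 32-bit immediate to a register or memory operand. 45 is the ModRM byte, which here selects a memory operand addressed as %ebp plus an 8-bit displacement. f8 is that displacement, -8 as a signed byte. Note that mov does not modify any flags in the EFLAGS register. See the reference for more detail.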

The remaining bytes are the immediate value. Since x86 is little endian, the least significant byte of a number comes first: 10 decimal is 0x0a in hexadecimal, which as a 4-byte value is 0x0000000a and is stored as 0a 00 00 00.

T Johnson