While writing a disassembler for 32-bit Linux on x86, I ran into an issue. I saw the following opcode sequence when I disassembled a simple ELF-32 executable using objdump:

dc 82 04 08 0d 00     faddl  0xd0804(%edx)

But when I look at the Intel manual, I don't see an opcode corresponding to this. The fadd instruction starts with 0xDC, but then it requires an m64fp operand, which is "A memory quadword operand in memory."

Now, does this mean that the operand is a 64-bit address (which then means that the fadd instruction is a 64-bit instruction, but isn't prefixed by a REX byte), or is it just a 32-bit address which points to a quadword (64-bit)?

Am I missing something trivial here, or is my understanding of x86 instruction encoding wrong?

phuclv
Hrishikesh Murali

2 Answers

Let's break this down.

> dc 82 04 08 0d 00     faddl  0xd0804(%edx)
  |  |  \____ ____/
  |  |       V
  |  |       |
  |  |       +---------> 32-bit displacement
  |  +-----------------> ModRM byte
  +--------------------> Opcode

Looking at the docs in detail, dc is indeed the opcode for an fadd with an m64real floating-point argument as the source. It adds this 64-bit value to the ST(0) floating-point register.

However, it's the second byte, 82, that decides where that 64-bit value comes from. In binary, this ModRM byte is:

+---+---+---+---+---+---+---+---+
| 1 | 0 | 0 | 0 | 0 | 0 | 1 | 0 |
+---+---+---+---+---+---+---+---+
|  MOD  |  REG/OPCD |    R/M    |
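
The field split in the diagram can be reproduced with a couple of shifts and masks. This is only a minimal sketch; the function name `split_modrm` is made up for illustration and is not taken from any real disassembler.

```python
def split_modrm(byte):
    """Split a ModRM byte into its (mod, reg, rm) bit fields."""
    mod = (byte >> 6) & 0b11    # bits 7-6: addressing mode
    reg = (byte >> 3) & 0b111   # bits 5-3: register or opcode extension
    rm = byte & 0b111           # bits 2-0: register or memory base
    return mod, reg, rm


mod, reg, rm = split_modrm(0x82)
print(f"mod={mod:02b} reg={reg:03b} rm={rm:03b}")  # mod=10 reg=000 rm=010
```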

If you look at table 2.2 in your linked document (the one for 32-bit addressing modes), you'll see that this translates into disp32[EDX].

In other words, it takes the next 32 bits (four bytes) as a displacement, adds the value of the edx register, and uses the resulting address to read the 64-bit value from memory.
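
To make the distinction between the 32-bit address and the 64-bit operand concrete, here is a rough simulation. It assumes a flat `bytearray` standing in for memory, and the values chosen for `edx`, `st0` and the stored double are made up. A Python float also only approximates ST(0), which is 80 bits wide on real hardware.

```python
import struct


def load_m64fp(memory, edx, disp32):
    """Read the 8-byte double at the 32-bit address edx + disp32."""
    address = (edx + disp32) & 0xFFFFFFFF  # 32-bit address arithmetic wraps
    (value,) = struct.unpack_from("<d", memory, address)
    return address, value


# The four displacement bytes from the question, stored little-endian.
disp32 = int.from_bytes(bytes([0x04, 0x08, 0x0D, 0x00]), "little")

# Hypothetical machine state: made-up register value and memory contents.
edx = 0x10
memory = bytearray(disp32 + 0x100)
struct.pack_into("<d", memory, edx + disp32, 1.5)

address, operand = load_m64fp(memory, edx, disp32)
st0 = 2.0
st0 += operand  # what faddl does: ST(0) = ST(0) + 64-bit memory operand
print(hex(disp32), hex(address), st0)  # 0xd0804 0xd0814 3.5
```

The address is 32 bits wide, but eight bytes are read from it; the two sizes are independent.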

paxdiablo

"Quadword operand in memory" means the value takes 64 bits in RAM. The address size depends on whether the code is built for a 32-bit or 64-bit process, not on how big the operands are. Here is a full breakdown of the disassembly.

  • The first byte, DC, is the opcode. Combined with the fact that the next byte is not between C0 and C7 and contains 0 in the register field (bits 3-5), this indicates an fadd instruction with a 64-bit memory operand. The l at the end of faddl is consistent with that: for x87 floating-point instructions, AT&T syntax uses s for a 32-bit single, l for a 64-bit double, and t for an 80-bit extended operand.

  • The second byte contains 3 fields.

    • Bits 6-7 indicate the mode of the last field.
    • Bits 3-5 are the register field. Since a register operand isn't necessary for this instruction, they are used as part of the opcode.
    • Bits 0-2 are the R/M field. It can hold a register or specify a memory operand. The combination of mode 10 and R/M 010 indicates a memory operand whose address is the edx register plus a 32-bit displacement.
  • The last 4 bytes are the displacement of the operand, stored little-endian (least significant byte first).
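
Putting the breakdown together, a toy decoder for just this one encoding form might look like the sketch below. The names `decode_dc_disp32`, `DC_MEMORY_OPS` and `REG32` are made up for illustration. It only handles opcode DC with mode 10 and no SIB byte, and it appends the `l` suffix to match the objdump output quoted in the question.

```python
# Mnemonics selected by the register field when the opcode is DC and the
# operand is in memory (DC /0 through DC /7 in the Intel manual).
DC_MEMORY_OPS = ["fadd", "fmul", "fcom", "fcomp", "fsub", "fsubr", "fdiv", "fdivr"]
REG32 = ["eax", "ecx", "edx", "ebx", "esp", "ebp", "esi", "edi"]


def decode_dc_disp32(code):
    """Decode a DC-opcode instruction of the form disp32(%reg) only."""
    if code[0] != 0xDC:
        raise ValueError("not a DC opcode")
    modrm = code[1]
    mod, reg, rm = modrm >> 6, (modrm >> 3) & 7, modrm & 7
    if mod != 0b10 or rm == 0b100:
        # mod 00/01/11 and the SIB form (rm=100) are left out of this sketch.
        raise NotImplementedError("only mod=10 without a SIB byte is handled")
    # Four displacement bytes, least significant first, read as signed.
    disp = int.from_bytes(code[2:6], "little", signed=True)
    return f"{DC_MEMORY_OPS[reg]}l {disp:#x}(%{REG32[rm]})"


print(decode_dc_disp32(bytes.fromhex("dc8204080d00")))  # faddl 0xd0804(%edx)
```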

ughoavgfhw
  • Thanks, I'll first read the docs in detail. I guess I'm trying to jump steps, which ain't working very well. – Hrishikesh Murali Nov 15 '11 at 06:16
  • Perhaps the `faddl` applies to the 32-bit displacement operand rather than the 64-bit quadword it points to? – paxdiablo Nov 15 '11 at 06:22
  • @paxdiablo No, instruction suffixes always refer to operand size. You can tell the size of the displacement by what range it fits in (0xd0804 fits in 32 bits, but not 8), but it isn't explicitly stated because it usually isn't important. – ughoavgfhw Nov 15 '11 at 18:03