
From the Intel Software Developer Manual (referred to as ISDM in this post) and the x86 Instruction Set Reference (which, I assume, is just a copy of the former), we know that the mov instruction can move data from eax/ax/al to a memory offset and vice versa.

For example, mov moffs8, al moves the contents of the al register to some 8-bit memory offset moffs8.

Now, what is moffs8? Quoting the ISDM (3.1.1.3):

moffs8, moffs16, moffs32, moffs64 — A simple memory variable (memory offset) of type byte, word, or doubleword used by some variants of the MOV instruction. The actual address is given by a simple offset relative to the segment base. No ModR/M byte is used in the instruction. The number shown with moffs indicates its size, which is determined by the address-size attribute of the instruction.

I emphasised the sentences saying that moffs8 is of type byte and is 8 bits in size.

I'm a beginner in assembly, so, immediately after having read this, I started playing around with the mov moffs8, al instruction using NASM. Here's the code I've written:

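; File name: mov_8_bit_al.s
USE32                       ; assemble for 32-bit mode

section .text
    mov BYTE [data], al     ; store al at the absolute address of data

section .bss
    data resb 2             ; reserve 2 uninitialized bytes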

This is what nasm -f bin mov_8_bit_al.s produced (in hex):

A2 08 00 00 00

Here's how I understand this:

  • A2 is the opcode for MOV moffs8, AL
  • 08 is the memory offset itself, of size 1 byte
  • 00 00 00 is some garbage

It looks like 08 00 00 00 is the memory offset, but in this case, it's a moffs32, not moffs8! So, the CPU will read only one byte while executing A2, and treat 00 as an ADD instruction or something else, which was not intended.

At the moment, it seems to me that NASM is generating invalid machine code here, but I guess it's me who's misunderstood something... Maybe NASM doesn't follow the ISDM? If so, its code wouldn't execute properly on Intel CPUs, so it must be following it!

Can you please explain where I'm wrong?

ForceBru

1 Answer


The size suffix after moffs actually refers to the operand size, not the size of the address itself. This mirrors the meaning of the size suffix after r/m.

The manual actually says so in a note:

NOTES:
* The moffs8, moffs16, moffs32 and moffs64 operands specify a simple offset relative to the segment base, where 8, 16, 32 and 64 refer to the size of the data. The address-size attribute of the instruction determines the size of the offset, either 16, 32 or 64 bits.
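To make the note concrete, here is a minimal Python sketch of how a decoder could work out the number of offset bytes that follow an `A0`..`A3` opcode. The function name `decode_moffs` and the two lookup tables are made up for this illustration, and it only understands the `0x67` address-size prefix. Other prefixes (segment overrides, `0x66`, REX) are ignored.

```python
# Sketch only: decode the moffs forms of MOV (opcodes A0..A3).
# Assumption: the only prefix handled is 0x67 (address-size override).

# Default address size in bits for each CPU mode.
DEFAULT_ADDR_BITS = {16: 16, 32: 32, 64: 64}
# Address size in bits once a 0x67 prefix is present.
OVERRIDDEN_ADDR_BITS = {16: 32, 32: 16, 64: 32}

MOFFS_OPCODES = (0xA0, 0xA1, 0xA2, 0xA3)


def decode_moffs(code: bytes, mode: int):
    """Return (opcode, offset, instruction_length) for a MOV moffs form."""
    pos = 0
    addr_bits = DEFAULT_ADDR_BITS[mode]
    if code[pos] == 0x67:
        # The prefix switches the address size, and so the offset width.
        addr_bits = OVERRIDDEN_ADDR_BITS[mode]
        pos += 1

    opcode = code[pos]
    if opcode not in MOFFS_OPCODES:
        raise ValueError("not a MOV moffs opcode")
    pos += 1

    # The offset width comes from the address size, not from the 8 in moffs8.
    width = addr_bits // 8
    if len(code) < pos + width:
        raise ValueError("truncated instruction")
    offset = int.from_bytes(code[pos:pos + width], "little")
    return opcode, offset, pos + width


# The bytes from the question, decoded in 32-bit mode:
print(decode_moffs(bytes.fromhex("A208000000"), mode=32))  # (162, 8, 5)
# The same store with a 0x67 prefix uses a 2-byte offset instead:
print(decode_moffs(bytes.fromhex("67A20800"), mode=32))    # (162, 8, 4)
```

So in 32-bit code without a prefix, all four bytes `08 00 00 00` belong to the instruction, and the 8 in moffs8 only says that one byte of data (al) is stored at that address.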

harold
  • Where do I get the address-size attribute of an instruction? – ForceBru Sep 18 '17 at 15:47
  • @ForceBru it depends on the mode and presence of the address size override prefix – harold Sep 18 '17 at 15:50
  • ah, so, if I'm running in 32-bit mode and there's no `0x67` prefix, the addresses would occupy 4 bytes, right? – ForceBru Sep 18 '17 at 15:53
  • @ForceBru usually an assembly programmer doesn't need to care about memory operand encoding (unless you are coding for size, like 256-byte intros). NASM picks the shortest encoding of an instruction when there is some ambiguity, so most of the time you just write symbolic source like `mov [some_memory],al` and keep only the 8-bit size of the data in your head, leaving the encoding up to the assembler. If you are coding for size, then checking the machine code and trying out multiple variants of the code is common, unless you can memorize all that stuff from the ISDM (I can't :/ ). – Ped7g Sep 18 '17 at 19:51