What is the effect of the displacement value on the Mod field of the ModRegRm byte?

Question

I'm writing an 8086 assembler that takes instructions and produces 8086 machine code. I use the "Intel 8086 User Manual" as a reference.

To make it clear, I will explain the situation. Say I want to assemble the instruction mov ax, bx. The manual says that when the operands of mov are two 16-bit registers, the opcode is 0x89, and it is followed by a ModRegRm byte that specifies the source and the destination, which in this case is 0xd8. In binary, this byte is 11011000.

The Mod field is 2 bits, and the Reg and Rm fields are 3 bits each, so Mod = 11, Reg = 011, Rm = 000. That's straightforward, but there is something I don't understand: the addressing modes and the displacement.
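The field split described above can be checked with a minimal Python sketch. The helper name split_modrm is hypothetical (it does not come from the manual); it only assumes the 2/3/3 bit layout stated above.

```python
def split_modrm(byte):
    """Split a ModRegRm byte into its (mod, reg, rm) fields."""
    mod = (byte >> 6) & 0b11    # top 2 bits
    reg = (byte >> 3) & 0b111   # middle 3 bits
    rm = byte & 0b111           # low 3 bits
    return mod, reg, rm


mod, reg, rm = split_modrm(0xD8)
print(f"Mod={mod:02b} Reg={reg:03b} Rm={rm:03b}")  # Mod=11 Reg=011 Rm=000
```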

Look at the table and the three following instructions and their machine code.

mov [bx+0x6], ax ;894706

mov [bx+0xbf],ax ;8987BF00

mov [bx+0xffff],ax ;8947FF

Am I wrong in assuming that the displacement length in each instruction is 8bit, 8bit, 16bit, respectively?

I think I'm right because it's obvious, 0x6 and 0xbf are 1 byte and 0xffff is two bytes.

The question is: why is the MOD field in the second instruction 10b (0x02) instead of 01b (0x01)? It should be 0x01 because the displacement is an 8-bit displacement, shouldn't it? And why is the MOD 0x01 in the third instruction even though the displacement is 16 bits? Why did the assembler ignore the rest of the displacement and keep only 1 byte?

To represent 0x00BF instead of 0xFFBF you need 16 bits (+D16). To represent -1 (0xFFFF) you only need 0xFF (+D8). — old_timer, May 10 '20 at 06:14
The manual I am looking at shows that mod 01 is disp-lo sign-extended. So, as mentioned above, 0xBF would become 0xFFBF, which is not what you want (you wanted 0x00BF), but 0xFF becomes 0xFFFF when sign-extended, so mod 01 can be used. For 0x00BF, mod 01 can't be used; mod 10 must be used. Do you see that in your manual? — old_timer, May 10 '20 at 06:23
google 210200_iAPX88_Book_1981.pdf bitsavers has a copy, semi searchable even though it is a scanned document. I believe it was later than this they split it into separate hardware and software manuals if you will, I have paper copies somewhere, dont remember the names on the covers off hand. This one has the answer to your question though (and the one I use when referring to the instruction set). — old_timer, May 10 '20 at 06:32
Intel's current vol.2 manual (https://software.intel.com/en-us/articles/intel-sdm) has a modern PDF version of this table for 16-bit addressing modes. It might explain it better if for some reason the manual you were reading wasn't clear that displacements are *sign*-extended to 16-bit. — Peter Cordes, May 10 '20 at 10:10

score 5 · Accepted Answer · answered May 10 '20 at 10:51

The size of the displacement depends on the MOD field (8 bits if MOD=01b, 16 bits if MOD=10b), and an 8-bit displacement is sign-extended to 16 bits before it is used.

This means that an instruction like mov [bx+6], ax could be encoded as mov [bx+0x0006], ax (with MOD=10b and a 16-bit displacement) or as mov [bx+0x06], ax (with MOD=01b and an 8-bit displacement).

In the same way, mov [bx+65535],ax could be encoded either way (with 8 bit displacement or 16 bit displacement); because 0xFF can be sign extended to 0xFFFF.

However, mov [bx+191],ax can't be encoded with an 8-bit displacement, because when 191 (0xBF) is sign-extended it becomes 0xFFBF, which is not equal to 191 (0x00BF). It must use a 16-bit displacement.
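To make the three cases concrete, here is a small Python sketch of the sign extension applied to an 8-bit displacement. The function name sign_extend_8_to_16 is made up for illustration.

```python
def sign_extend_8_to_16(byte):
    """Return the 16-bit value that an 8-bit displacement stands for."""
    byte &= 0xFF
    # If bit 7 is set, the upper byte is filled with ones; otherwise zeros.
    return (byte | 0xFF00) if (byte & 0x80) else byte


for disp8 in (0x06, 0xFF, 0xBF):
    print(f"0x{disp8:02X} -> 0x{sign_extend_8_to_16(disp8):04X}")
# 0x06 -> 0x0006
# 0xFF -> 0xFFFF
# 0xBF -> 0xFFBF
```

Only the first two results match the displacement that was asked for (0x0006 and 0xFFFF); 0xFFBF is not 0x00BF, which is why that case needs the 16-bit form.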

Essentially, if the highest 9 bits of the full 16-bit displacement are all the same (all clear for values 0x0000 to 0x007F, or all set for values 0xFF80 to 0xFFFF), it can be encoded as either an 8-bit or a 16-bit displacement; otherwise it must use a 16-bit displacement.
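As a rough sketch of how an assembler might apply this rule, the Python below encodes only the one instruction shape from the question, mov [bx+disp], ax. The function name is hypothetical, it always prefers the shorter form, and it deliberately ignores the MOD=00b (no displacement) case.

```python
def encode_mov_bx_disp_ax(disp):
    """Encode `mov [bx+disp], ax`, preferring the 8-bit displacement form."""
    disp &= 0xFFFF                  # treat disp as a 16-bit value
    opcode = 0x89                   # mov r/m16, r16
    reg, rm = 0b000, 0b111          # reg = ax, r/m = [bx + disp]
    top9 = disp >> 7                # bits 15..7 of the displacement
    if top9 in (0, 0x1FF):          # all clear or all set: disp8 is enough
        mod, tail = 0b01, bytes([disp & 0xFF])
    else:                           # otherwise a full disp16, low byte first
        mod, tail = 0b10, bytes([disp & 0xFF, disp >> 8])
    modrm = (mod << 6) | (reg << 3) | rm
    return bytes([opcode, modrm]) + tail


for disp in (0x6, 0xBF, 0xFFFF):
    print(f"0x{disp:X}: {encode_mov_bx_disp_ax(disp).hex().upper()}")
# 0x6: 894706
# 0xBF: 8987BF00
# 0xFFFF: 8947FF
```

The three outputs match the machine code listed in the question.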

When there's a choice between different encodings, a good assembler will choose the smallest one (and use an 8-bit displacement, because it makes the instruction 1 byte shorter). An even better assembler may use the larger version if it avoids the need for padding (when following instructions need to be aligned on a certain boundary). For example, consider .align 2, then mov [bx+6], ax, then .align 2, then clc. With the smaller (3-byte) mov you have to insert an extra nop instruction as padding before the clc to ensure that the clc is aligned on a 2-byte boundary (as requested by the .align 2 directive); with the larger (4-byte) mov you don't. The result is one less instruction for the same number of bytes of code.
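The byte counts in that example can be checked with a short Python sketch. It assumes the mov starts at offset 0 (already aligned) and that nop and clc are 1 byte each; the helper name padding_to_align is made up for illustration.

```python
def padding_to_align(offset, boundary):
    """Bytes of padding needed to round offset up to a multiple of boundary."""
    return (-offset) % boundary


CLC_LEN = 1  # clc is a single-byte instruction

for mov_len in (3, 4):  # disp8 form vs. disp16 form of mov [bx+6], ax
    pad = padding_to_align(mov_len, 2)   # 1-byte nops needed before clc
    total = mov_len + pad + CLC_LEN
    print(f"mov={mov_len} bytes, padding={pad}, total={total}")
# mov=3 bytes, padding=1, total=5
# mov=4 bytes, padding=0, total=5
```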
