The 66 is part of the instruction's encoding: it differentiates the instruction from the MMX version, MASKMOVQ. The 66 doesn't cancel the 67, so just add the 67 at the beginning. Note that the VEX-encoded version doesn't even have the 66 0F, since those bytes are folded into the VEX prefix itself; see section 2.3.1, Instruction Format:
Elimination of escape opcode byte (0FH), SIMD prefix byte (66H, F2H,
F3H) via a compact bit field representation within the VEX prefix.
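To make the byte layout concrete, here is a minimal sketch that assembles the legacy (non-VEX) register-register forms by hand. The helper names (`modrm_reg_reg`, `encode_sse2`, `encode_mmx`) are hypothetical and exist only for illustration; the opcode bytes are the documented ones (0F F7 /r for MASKMOVQ, 66 0F F7 /r for MASKMOVDQU).

```python
ADDR_SIZE_67 = 0x67   # address-size override prefix
MANDATORY_66 = 0x66   # mandatory prefix that selects the XMM form
ESCAPE_0F = 0x0F      # two-byte opcode escape
OPCODE_F7 = 0xF7      # shared by MASKMOVQ and MASKMOVDQU


def modrm_reg_reg(reg, rm):
    """Build a ModRM byte for a register-register form (mod = 11b)."""
    return 0xC0 | ((reg & 7) << 3) | (rm & 7)


def encode_mmx(reg, rm, addr_override=False):
    """MASKMOVQ mm(reg), mm(rm): 0F F7 /r, optionally preceded by 67."""
    body = [ESCAPE_0F, OPCODE_F7, modrm_reg_reg(reg, rm)]
    return bytes(([ADDR_SIZE_67] if addr_override else []) + body)


def encode_sse2(reg, rm, addr_override=False):
    """MASKMOVDQU xmm(reg), xmm(rm): 66 0F F7 /r, optionally preceded by 67."""
    body = [MANDATORY_66, ESCAPE_0F, OPCODE_F7, modrm_reg_reg(reg, rm)]
    # The 67 is simply placed in front; the 66 stays where it is.
    return bytes(([ADDR_SIZE_67] if addr_override else []) + body)


if __name__ == "__main__":
    print(encode_mmx(0, 1).hex(" "))                       # MMX form
    print(encode_sse2(0, 1).hex(" "))                      # SSE2 form
    print(encode_sse2(0, 1, addr_override=True).hex(" "))  # SSE2 form with 67
```

The only difference between the MMX and SSE2 byte sequences is the 66, which is why it cannot be dropped when the 67 is added.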
Also, section 2.3.5 The VEX Prefix:
Compaction of SIMD prefix: Legacy SSE instructions effectively use
SIMD prefixes (66H, F2H, F3H) as an opcode extension field. VEX prefix
encoding allows the functional capability of such legacy SSE
instructions (operating on XMM registers, bits 255:128 of
corresponding YMM unmodified) to be encoded using the VEX.pp field
without the presence of any SIMD prefix. The VEX-encoded 128-bit
instruction will zero-out bits 255:128 of the destination register.
VEX-encoded instruction may have 128 bit vector length or 256 bits
length.
Compaction of two-byte and three-byte opcode: More recently introduced
legacy SSE instructions employ two and three-byte opcode. The one or
two leading bytes are: 0FH, and 0FH 3AH/0FH 38H. The one-byte escape
(0FH) and two-byte escape (0FH 3AH, 0FH 38H) can also be interpreted
as an opcode extension field. The VEX.mmmmm field provides compaction
to allow many legacy instruction to be encoded without the constant
byte sequence, 0FH, 0FH 3AH, 0FH 38H. These VEX-encoded instruction
may have 128 bit vector length or 256 bits length.
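The two compaction rules quoted above can be sketched as a small decoder that recovers the implied SIMD prefix and escape bytes from a VEX prefix. This is an illustrative sketch only: `decode_vex` is a hypothetical helper, it handles just the pp, L and mmmmm fields, and it ignores the fact that outside 64-bit mode the bytes C4/C5 can also be LES/LDS.

```python
# VEX.pp -> implied legacy SIMD prefix
SIMD_PREFIX = {0b00: None, 0b01: 0x66, 0b10: 0xF3, 0b11: 0xF2}

# VEX.mmmmm -> implied legacy escape bytes
ESCAPE = {0b00001: (0x0F,), 0b00010: (0x0F, 0x38), 0b00011: (0x0F, 0x3A)}


def decode_vex(code):
    """Extract the implied prefix/escape from a VEX-encoded instruction."""
    if code[0] == 0xC5:
        # Two-byte VEX: the 0F escape is implied, fields live in byte 1.
        mmmmm = 0b00001
        payload = code[1]
        opcode_index = 2
    elif code[0] == 0xC4:
        # Three-byte VEX: mmmmm is the low 5 bits of byte 1,
        # pp and L are in byte 2.
        mmmmm = code[1] & 0x1F
        payload = code[2]
        opcode_index = 3
    else:
        raise ValueError("not a VEX prefix")
    return {
        "simd_prefix": SIMD_PREFIX[payload & 0b11],       # VEX.pp
        "escape": ESCAPE[mmmmm],                          # VEX.mmmmm
        "vector_length": 256 if payload & 0b100 else 128, # VEX.L
        "opcode": code[opcode_index],
    }


if __name__ == "__main__":
    # VMASKMOVDQU xmm0, xmm1 in its two-byte VEX form
    print(decode_vex(bytes.fromhex("c5f9f7c1")))
```

For C5 F9 F7 C1, the decoder reports an implied 66 prefix and an implied 0F escape in front of opcode F7, which matches the legacy 66 0F F7 sequence without any of those bytes appearing literally.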