SSE2 Instruction, PMULUDQ Multiplication Question

Question

In the code I am debugging, there's an assembly instruction as shown below:

pmuludq xmm6, xmm1

xmm6 = 0x3736353433323130
xmm1 = 0x7D35343332313938

If I multiply the above 2 numbers using Python, I get the result as shown below:

>>> hex(0x3736353433323130 * 0x7D35343332313938)
'0x1b00f1758e3c83508a9f69982a1e7280L'

However, when I am debugging the code, the value of xmm6 register after the multiply operation is: 0x0A09A5A82A1E7280

Why is the result different? And how can I simulate this instruction using Python?

Python uses arbitrary precision integers, that is, there's never any overflow. You'd need to handle the overflow case yourself. — Collin, Dec 03 '18 at 02:32

Peter Cordes · Accepted Answer · 2018-12-03T02:49:53.137


Look at the Operation section of the manual entry for the pseudocode: http://felixcloutier.com/x86/PMULUDQ.html

`pmuludq` does two unsigned 32x32 => 64-bit (dword x dword => qword) multiplies, one in each 64-bit half of the 16-byte register. It reads only the even dword elements (0 and 2) of each input and ignores the odd ones (1 and 3). You only showed 16 hex digits for the inputs, so I think you're only looking at the low qword of the input registers.

If you only care about the low 64 bits, then the equivalent operation is simply

result = (a & 0xFFFFFFFF) * (b & 0xFFFFFFFF)

It repeats the same thing for the high 64 bits.
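
As a rough sketch of how this could be simulated in Python 3 for the whole 128-bit register: the function name `pmuludq` and the choice to pass each XMM register as a single Python integer are my own assumptions for illustration, not anything the manual prescribes.

```python
MASK32 = 0xFFFFFFFF

def pmuludq(dst, src):
    """Hypothetical helper: simulate PMULUDQ on two 128-bit register values.

    Each register is passed as one Python int (bit 0 = lowest bit of the XMM register).
    """
    # Low qword of the result: dword 0 (bits 31:0) of each operand.
    low = (dst & MASK32) * (src & MASK32)
    # High qword of the result: dword 2 (bits 95:64) of each operand.
    high = ((dst >> 64) & MASK32) * ((src >> 64) & MASK32)
    # A 32x32-bit unsigned product always fits in 64 bits, so no truncation is needed.
    return (high << 64) | low

# Values from the question; the upper qword of each register is zero.
xmm6 = 0x3736353433323130
xmm1 = 0x7D35343332313938
print("0x%016X" % pmuludq(xmm6, xmm1))  # 0x0A09A5A82A1E7280
```

This reproduces the value seen in the debugger. Note that masking the full Python product to 64 bits would not, because the upper dwords of the inputs would still contribute to the result.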

edited Dec 03 '18 at 02:49

answered Dec 03 '18 at 02:32

Peter Cordes


Do you mean that I showed only 16 hex digits for the xmm6 and xmm1 registers? The binary uses a movq instruction to load a QWORD from a memory address into xmm1 and xmm6. I think that's why the upper QWORD of the XMM registers is 0. – Neon Flash Dec 03 '18 at 02:40
@NeonFlash: yes, edited to clarify. And yes, using `movq` does zero-extend, filling the high half of the XMM register with zero. (And `pmuludq` doesn't change that.) Usually not much point using `pmuludq` for scalar operations when you can do scalar `imul`, like `mov ecx, [rdi]` / `mov eax, [rsi]` / `imul rcx, rax`. – Peter Cordes Dec 03 '18 at 02:52
