I assume truncation is required, same as if one writes i = (int)f
in "C".
If you have SSE3, you can use:
int convert(float x)
{
int n;
__asm {
fld x
fisttp n // the extra 't' means truncate
}
return n;
}
Alternately, with SSE2 (or in x64 where inline assembly might not be available), you can use almost as fast:
#include <xmmintrin.h>
int convert(float x)
{
return _mm_cvtt_ss2si(_mm_load_ss(&x)); // extra 't' means truncate
}
On older computers there is an option to set the rounding mode manually and perform conversion using the ordinary fistp
instruction. That will probably only work for arrays of floats, otherwise care must be taken to not use any constructs that would make the compiler change rounding mode (such as casting). It is done like this:
void Set_Trunc()
{
// cw is a 16-bit register [_ _ _ ic rc1 rc0 pc1 pc0 iem _ pm um om zm dm im]
__asm {
push ax // use stack to store the control word
fnstcw word ptr [esp]
fwait // needed to make sure the control word is there
mov ax, word ptr [esp] // or pop ax ...
or ax, 0xc00 // set both rc bits (alternately "or ah, 0xc")
mov word ptr [esp], ax // ... and push ax
fldcw word ptr [esp]
pop ax
}
}
void convertArray(int *dest, const float *src, int n)
{
Set_Trunc();
__asm {
mov eax, src
mov edx, dest
mov ecx, n // load loop variables
cmp ecx, 0
je bottom // handle zero-length arrays
top:
fld dword ptr [eax]
fistp dword ptr [edx]
loop top // decrement ecx, jump to top
bottom:
}
}
Note that the inline assembly only works with Microsoft's Visual Studio compilers (and maybe Borland), it would have to be rewritten to GNU assembly in order to compile with gcc.
The SSE2 solution with intrinsics should be quite portable, however.
Other rounding modes are possible by different SSE2 intrinsics or by manually setting the FPU control word to a different rounding mode.