__m256 dst = _mm256_cmp_ps(value1, value2, _CMP_LE_OQ);
If dst is [0,0,0,-nan, 0,0,0,-nan], I want to know the index of the first -nan (in this case 3) without doing a for loop with 8 iterations.
Is this possible?
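I would use movmskps on the result of the comparison, which packs the sign bit of each of the 8 lanes into the low 8 bits of an integer, and then do a bit scan forward (bsf) on that mask.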
Using intrinsics (this works with gcc/clang, see here for alternatives):
int pos = __builtin_ctz(_mm256_movemask_ps(dst));
Note that the result of bsf is undefined if no bit is set (the same holds for __builtin_ctz(0)). To work around this you can, e.g., OR in a ninth bit so that you get 8 if no other bit is set:
int pos = __builtin_ctz(_mm256_movemask_ps(dst) | 0x100);
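For completeness, here is a minimal self-contained sketch of the approach above, assuming gcc or clang on an AVX-capable x86 CPU. The helper name first_le_index and the pointer-based interface are hypothetical, chosen only for illustration; the target attribute is one way to enable AVX for this function without passing -mavx for the whole file.

```c
#include <immintrin.h>

/* Hypothetical helper (not from the question): compares 8 floats of a
 * against 8 floats of b and returns the index of the first lane where
 * a[i] <= b[i], or 8 if no lane matches. */
__attribute__((target("avx")))
int first_le_index(const float *a, const float *b)
{
    __m256 va  = _mm256_loadu_ps(a);   /* a[0] goes to lane 0 */
    __m256 vb  = _mm256_loadu_ps(b);
    __m256 dst = _mm256_cmp_ps(va, vb, _CMP_LE_OQ);

    /* Bit i of mask is the sign bit of lane i (set where the compare is true). */
    int mask = _mm256_movemask_ps(dst);

    /* OR in bit 8 so the argument is never zero; the result is then 8
     * when none of the 8 lanes matched. */
    return __builtin_ctz(mask | 0x100);
}
```

Calling it with a = {5,5,5,1,5,5,5,1} and b filled with 2 should give 3, matching the example in the question, and 8 when no element of a is less than or equal to its counterpart in b.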