I am porting an application from Altivec to Neon.
I see a lot of intrinsics in Altivec which return scalar values.
Are there any such intrinsics on ARM?
For instance, an equivalent of vec_all_gt.
There are no intrinsics that give scalar comparison results. This is because the common pattern for SIMD comparisons is to use branchless lane-masking and conditional selects to multiplex results, not branch-based control flow.
You can build them if you need them though ...
// Do a comparison of e.g. two vectors of floats (each lane becomes all-ones or zero)
uint32x4_t compare = vcgeq_f32(a, b);
// Shift all compares down to a single bit in the LSB of each lane, other bits zero
uint32x4_t tmp = vshrq_n_u32(compare, 31);
// Shift compare results up so lane 0 = bit 0, lane 1 = bit 1, etc.
static const int32_t shifta[4] = { 0, 1, 2, 3 };
const int32x4_t shift = vld1q_s32(shifta);
tmp = vshlq_u32(tmp, shift);
// Horizontal add across the vector to merge the result into a scalar (vaddvq_u32 is AArch64-only)
return vaddvq_u32(tmp);
... at which point you can define any() (mask is non-zero) and all() (mask is 0xF) comparisons if you need branchy logic.
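Putting the pieces together, a rough sketch of a vec_all_gt-style helper might look like the following. The names gt_mask_f32, any_gt_f32 and all_gt_f32 are made up for illustration and are not part of any ARM header. The sketch uses vcgtq_f32 (strict greater-than) to match vec_all_gt, assumes AArch64 for vaddvq_u32, and includes a plain scalar fallback only so that it also compiles on other hosts.

```c
#include <stdbool.h>
#include <stdint.h>

#if defined(__aarch64__)
#include <arm_neon.h>

/* Hypothetical helper: pack the per-lane result of a > b into bits 0..3
 * (lane 0 = bit 0, lane 1 = bit 1, and so on). */
static inline uint32_t gt_mask_f32(const float a[4], const float b[4])
{
    static const int32_t lane_shift[4] = { 0, 1, 2, 3 };

    /* Each lane becomes all-ones where a > b, zero otherwise. */
    uint32x4_t cmp = vcgtq_f32(vld1q_f32(a), vld1q_f32(b));
    /* Reduce each lane to a single bit: 1 or 0. */
    uint32x4_t bits = vshrq_n_u32(cmp, 31);
    /* Move lane i's bit up to bit position i. */
    bits = vshlq_u32(bits, vld1q_s32(lane_shift));
    /* Horizontal add merges the four disjoint bits into one scalar. */
    return vaddvq_u32(bits);
}
#else
/* Scalar stand-in (an assumption for portability, not NEON code) that
 * produces the same bit mask on non-AArch64 hosts. */
static inline uint32_t gt_mask_f32(const float a[4], const float b[4])
{
    uint32_t mask = 0;
    for (int i = 0; i < 4; ++i)
        mask |= (uint32_t)(a[i] > b[i]) << i;
    return mask;
}
#endif

/* True if at least one lane satisfies a > b. */
static inline bool any_gt_f32(const float a[4], const float b[4])
{
    return gt_mask_f32(a, b) != 0;
}

/* True only if all four lanes satisfy a > b. */
static inline bool all_gt_f32(const float a[4], const float b[4])
{
    return gt_mask_f32(a, b) == 0xFu;
}
```

If the vectors are already in registers you would pass float32x4_t values instead of pointers; the pointer interface here just keeps the sketch easy to call from scalar code.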