4

I have the code below:

if (value == 0)
{
    value = 1;
}

I need to do the above using NEON vectorized instructions. How do I compare the elements of a NEON register against 0, four at a time, and set each element that is zero to 1?

Dmitriy
ravi
  • Also see [ARM NEON: comparing 128 bit values](http://stackoverflow.com/q/9068959). Also see [How to use NEON comparison (greater than or equal to) instruction?](http://stackoverflow.com/q/3788380) – jww Jul 28 '16 at 11:05

3 Answers

4

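If you want to check whether any element of a vector is non-zero and branch on the result, you can take the maximum (or minimum) across the vector lanes. Note that the across-lanes intrinsics used here (vmaxvq_u32 and vmaxv_u32) are only available on AArch64.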

if(vmaxvq_u32(value) == 0) { // Max value across quad vector, equals zero?
    value = vmovq_n_u32(1); // Set all lanes to 1
}

For 64-bit (double-word) vectors:

if(vmaxv_u32(value) == 0) { // Max value across double vector, equals zero?
    value = vmov_n_u32(1); // Set all lanes to 1
}

Notice that the only difference is the 'q', which indicates a 128-bit quad-word vector; without it, the intrinsic operates on a 64-bit double-word vector. The compiler will use a mov instruction to transfer the reduced value from a NEON register to an ARM general-purpose register for the comparison.
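As a rough sketch of how the same "are all lanes zero?" check could be written so that it also builds on 32-bit ARM, where the across-lanes maximum is not available, the snippet below folds the two halves of the vector together with OR and a pairwise max. The helper name all_lanes_zero_u32 and the scalar fallback are assumptions added for illustration; they are not part of the answer above.

```c
#include <stdint.h>

#if defined(__ARM_NEON) || defined(__ARM_NEON__)
#include <arm_neon.h>
#endif

/* Hypothetical helper: returns 1 if all four values are zero, otherwise 0. */
int all_lanes_zero_u32(const uint32_t lanes[4])
{
#if defined(__ARM_NEON) || defined(__ARM_NEON__)
    uint32x4_t q = vld1q_u32(lanes);
#if defined(__aarch64__)
    /* AArch64: maximum across all four lanes in one intrinsic. */
    return vmaxvq_u32(q) == 0;
#else
    /* 32-bit ARM: OR the low and high halves, then reduce the two
       remaining lanes with a pairwise max and read lane 0. */
    uint32x2_t folded = vorr_u32(vget_low_u32(q), vget_high_u32(q));
    folded = vpmax_u32(folded, folded);
    return vget_lane_u32(folded, 0) == 0;
#endif
#else
    /* Scalar fallback so the sketch still builds without NEON. */
    return (lanes[0] | lanes[1] | lanes[2] | lanes[3]) == 0;
#endif
}
```

A caller would branch on the result, for example `if (all_lanes_zero_u32(buf)) { ... }`, which mirrors the `vmaxvq_u32(value) == 0` test shown above.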

Peter Cordes
  • This question is not about checking if *all* elements are zero, it's really just trivial how-to-use-`vceq` to get a mask of 0 / -1. This would be an answer to [Neon 64 bit aarch: compare vector to zero](//stackoverflow.com/q/48016909). I'm not sure if there's a 32-bit ARM version of that question anywhere. It's somewhat useful here, if people happen to find this question when actually looking for what you answered (scalar branch if *any* elements of a vector match a condition). – Peter Cordes Apr 08 '19 at 18:36
3

Assuming integer data, then thanks to NEON having specific "compare against zero" instructions, and the bitwise way comparison results work, there's a really cheeky way to do this using just one spare register. In generalised pseudo-assembly:

VCEQ.type  mask, data, #0    @ Generate bitmask vector with all bits set in elements
                             @  corresponding to zero elements in the data
VSUB.type  data, data, mask  @ Treat "mask" as a vector of 0s and -1s: subtracting it
                             @  increments just the zero elements of "data", since
                             @  x - (-1) = x + 1 (two's complement wraparound)

This trick doesn't work for floating-point data as the bit-patterns for nonzero values are more complicated, and neither does it work if the replacement value is to be anything other than 1 (or -1), so in those cases you would need to construct a separate vector containing the appropriate replacement elements and do a conditional select using the comparison mask as per @Ermlg's answer.
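For reference, the two-instruction integer trick might look like the following with C intrinsics. This is only a sketch: the function name zeros_to_ones_u32, the array-based interface, and the scalar fallback (used for leftover elements and for builds without NEON) are assumptions made for illustration.

```c
#include <stddef.h>
#include <stdint.h>

#if defined(__ARM_NEON) || defined(__ARM_NEON__)
#include <arm_neon.h>
#endif

/* Hypothetical helper: replaces every 0 in data[0..n) with 1, in place. */
void zeros_to_ones_u32(uint32_t *data, size_t n)
{
    size_t i = 0;
#if defined(__ARM_NEON) || defined(__ARM_NEON__)
    const uint32x4_t zero = vdupq_n_u32(0);
    for (; i + 4 <= n; i += 4) {
        uint32x4_t v = vld1q_u32(data + i);
        /* mask lane is 0xFFFFFFFF (i.e. -1) where v == 0, otherwise 0 */
        uint32x4_t mask = vceqq_u32(v, zero);
        /* subtracting -1 adds 1 to the zero lanes; other lanes subtract 0 */
        v = vsubq_u32(v, mask);
        vst1q_u32(data + i, v);
    }
#endif
    /* Scalar path for the tail, or for everything when NEON is unavailable. */
    for (; i < n; ++i) {
        if (data[i] == 0) {
            data[i] = 1;
        }
    }
}
```

Comparing against an explicit zero vector with vceqq_u32 keeps the sketch buildable on both 32-bit and 64-bit ARM; the compiler is free to emit the compare-against-zero form of the instruction.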

Community
Notlikethat
  • Since the bitwise representation of `(float)0` is the same as `(int)0` in ieee754-floats, you could just bitwise-and the mask with the floating-point representation of `1`(or whatever other value you want to set the zero-element to) and then add instead of subtracting. – EOF Dec 31 '15 at 16:01
  • @EOF ...but once you have a vector of replacement values to hand, bitwise-and then add is two instructions vs. just one for the direct conditional select with that vector with the result mask ;) – Notlikethat Dec 31 '15 at 16:40
2

Maybe it will look something like this:

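uint32x4_t value = {7, 0, 0, 3};   // input lanes
uint32x4_t zero = {0, 0, 0, 0};
uint32x4_t one = {1, 1, 1, 1};     // replacement for each zero lane

// mask lane is 0xFFFFFFFF where value == 0, otherwise 0
uint32x4_t mask = vceqq_u32(value, zero);

// take bits from "one" where mask is set, from "value" elsewhere
value = vbslq_u32(mask, one, value);   // value is now {7, 1, 1, 3}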

To get more information see here.
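The same compare-and-select pattern should carry over to floating-point data and to replacement values other than 1. Below is a minimal sketch under those assumptions; the function name replace_zeros_f32, its parameters, and the scalar fallback are hypothetical and only serve the illustration.

```c
#include <stddef.h>

#if defined(__ARM_NEON) || defined(__ARM_NEON__)
#include <arm_neon.h>
#endif

/* Hypothetical helper: replaces every element equal to 0.0f in
   data[0..n) with "replacement", in place. */
void replace_zeros_f32(float *data, size_t n, float replacement)
{
    size_t i = 0;
#if defined(__ARM_NEON) || defined(__ARM_NEON__)
    const float32x4_t zero = vdupq_n_f32(0.0f);
    const float32x4_t repl = vdupq_n_f32(replacement);
    for (; i + 4 <= n; i += 4) {
        float32x4_t v = vld1q_f32(data + i);
        /* The float compare yields an integer mask: all ones where the
           lane compares equal to zero (this includes -0.0f). */
        uint32x4_t mask = vceqq_f32(v, zero);
        /* Bitwise select: "repl" where the mask is set, "v" elsewhere. */
        v = vbslq_f32(mask, repl, v);
        vst1q_f32(data + i, v);
    }
#endif
    /* Scalar path for the tail, or for everything when NEON is unavailable. */
    for (; i < n; ++i) {
        if (data[i] == 0.0f) {
            data[i] = replacement;
        }
    }
}
```

Because the select works on whole lanes, any replacement value can be used here, at the cost of keeping one extra vector of replacement elements in a register.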

ErmIg