I've got a NEON register filled with float32 values. I'd like to round them to the nearest integer without having to transfer back to the main CPU. The NEON instructions to convert float32 to uint32 simply truncate, so e.g. 39.7 becomes 39, not 40. I don't care much about how 0.5 gets handled -- round away from zero or round to even both work for me.
The best path I can see to implement rounding is to:
- convert to int32 (thus truncating)
- convert back to float32
- add 1 to the int32, convert back to float32, and set aside in case we're rounding up
- subtract the truncated float32 from the original value
- compare to 0.5 (no need for abs value since I know in my case they'll all be positive)
- select truncated or truncated + 1 based on the comparison outcome
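For concreteness, here is a rough sketch of that sequence using the intrinsics from arm_neon.h. The function names round_positive_f32x4 and round_positive_scalar are made up for illustration. The sketch assumes every lane is non-negative and small enough to fit in an int32, and it rounds ties (exactly 0.5) upward. The scalar version models what a single lane does, so the logic can be checked on a machine without NEON.

```c
#include <stdint.h>

#if defined(__ARM_NEON) || defined(__ARM_NEON__)
#include <arm_neon.h>

/* Hypothetical helper: rounds four non-negative floats to the nearest
 * integer value (ties round up), staying in NEON registers throughout. */
static inline float32x4_t round_positive_f32x4(float32x4_t v)
{
    int32x4_t   truncated   = vcvtq_s32_f32(v);          /* float -> int32, truncating */
    float32x4_t truncated_f = vcvtq_f32_s32(truncated);  /* back to float32 */
    float32x4_t plus_one_f  = vcvtq_f32_s32(
        vaddq_s32(truncated, vdupq_n_s32(1)));           /* truncated + 1, as float32 */
    float32x4_t frac        = vsubq_f32(v, truncated_f); /* fractional part */
    uint32x4_t  round_up    = vcgeq_f32(frac, vdupq_n_f32(0.5f)); /* all-ones where frac >= 0.5 */
    return vbslq_f32(round_up, plus_one_f, truncated_f); /* per-lane select */
}
#endif

/* Scalar model of one lane, following the same steps.
 * Assumes x is non-negative and fits in an int32. */
static inline float round_positive_scalar(float x)
{
    int32_t truncated   = (int32_t)x;             /* truncating conversion */
    float   truncated_f = (float)truncated;
    float   plus_one_f  = (float)(truncated + 1);
    float   frac        = x - truncated_f;
    return (frac >= 0.5f) ? plus_one_f : truncated_f;
}
```

That is two conversions to float32, one to int32, an integer add, a subtract, a compare, and a select for every vector, which is where the "ugly and slow" impression comes from.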
That seems ugly, slow, and complicated.
Is there a cleaner, faster, simpler, saner way?