Vectorized floating point rounding using NEON

Question

I've got a NEON register filled with float32. I'd like to round them to the nearest integer without having to transfer back to the main CPU. The NEON instructions to convert float32 to uint32 simply truncate, so e.g. 39.7 becomes 39, not 40. I don't care much about how 0.5 gets handled -- round away from zero or round to even both work for me.

The best path I can see to implement rounding is to

1. convert to int32 (thus truncating)
2. convert back to float32
3. add 1 to the int32, convert that back to float32 as well, and set it aside in case we're rounding up
4. subtract the truncated float32 from the original to get the fractional part
5. compare the fractional part to 0.5 (no need for abs value since I know in my case they'll all be positive)
6. select truncated or truncated + 1 based on the comparison outcome

That seems ugly, slow, and complicated.

Is there a cleaner, faster, simpler, saner way?

Eric Postpischil · Accepted Answer · 2020-10-15T00:03:22.260

score 7

Add .5 and convert to integer. If you want the result in floating-point format, convert back.
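A minimal sketch of this first suggestion, assuming non-negative inputs as in the question. The helper name `round4_add_half` and the array-based interface are hypothetical, chosen only for illustration; the scalar branch is a fallback so the sketch also builds on non-ARM hosts.

```c
#include <stdint.h>
#if defined(__ARM_NEON) || defined(__ARM_NEON__)
#include <arm_neon.h>
#endif

/* Hypothetical helper: rounds four non-negative floats to the nearest
   integer by adding 0.5 and truncating. Exact halves round upward. */
static void round4_add_half(const float in[4], uint32_t out[4])
{
#if defined(__ARM_NEON) || defined(__ARM_NEON__)
    float32x4_t v = vld1q_f32(in);
    float32x4_t biased = vaddq_f32(v, vdupq_n_f32(0.5f));
    vst1q_u32(out, vcvtq_u32_f32(biased));   /* truncating conversion */
#else
    /* Scalar fallback with the same arithmetic, one lane at a time. */
    for (int i = 0; i < 4; ++i)
        out[i] = (uint32_t)(in[i] + 0.5f);
#endif
}
```

With this sketch, 39.7 becomes 40 and 39.2 becomes 39. Note that the addition itself rounds, so inputs a hair below a half (for example the largest float below 0.5) can land on the next integer.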

Since you know the numbers are all positive, another option is to add 0x1p23 and subtract 0x1p23. The result of adding 0x1p23 is at least 0x1p23, so the float result has no bits with value less than one, so it must have been rounded to an integer. Then subtracting 0x1p23 subtracts the value that was added, leaving only the effect of rounding.
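The add-and-subtract idea can be sketched as follows, again assuming non-negative inputs and the default round-to-nearest-even mode. The name `round4_magic` is hypothetical, and the scalar branch is only a fallback for non-ARM builds.

```c
#if defined(__ARM_NEON) || defined(__ARM_NEON__)
#include <arm_neon.h>
#endif

/* Hypothetical helper: rounds four non-negative floats to integral values
   (still stored as float) by adding and then subtracting 0x1p23. */
static void round4_magic(const float in[4], float out[4])
{
#if defined(__ARM_NEON) || defined(__ARM_NEON__)
    const float32x4_t magic = vdupq_n_f32(0x1p23f);
    float32x4_t v = vld1q_f32(in);
    /* The sum has no fraction bits left, so the add performs the rounding. */
    vst1q_f32(out, vsubq_f32(vaddq_f32(v, magic), magic));
#else
    for (int i = 0; i < 4; ++i) {
        volatile float sum = in[i] + 0x1p23f;  /* volatile forces float rounding */
        out[i] = sum - 0x1p23f;
    }
#endif
}
```

Because the addition rounds ties to even, 0.5 and 2.5 come out as 0.0 and 2.0 here, while 39.7 comes out as 40.0.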

Update: This second method fails if the input is in [0x1p47, 0x1p48) and its low bit is one. Then 0x1p23 is half the ULP of the input, so the addition causes rounding upward (to even), and the subtraction has no effect. I think there is a modification to fix that, but I do not have it at hand.
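One possible guard, which is not from the answer itself: every float32 at or above 0x1p23 is already an integer, so such lanes can be passed through unchanged and the trick applied only to smaller values. The sketch below assumes non-negative inputs; `round4_magic_guarded` is a hypothetical name.

```c
#if defined(__ARM_NEON) || defined(__ARM_NEON__)
#include <arm_neon.h>
#endif

/* Hypothetical helper: like the plain add/subtract trick, but lanes with
   v >= 0x1p23 are returned unchanged since they are already integral. */
static void round4_magic_guarded(const float in[4], float out[4])
{
#if defined(__ARM_NEON) || defined(__ARM_NEON__)
    const float32x4_t magic = vdupq_n_f32(0x1p23f);
    float32x4_t v = vld1q_f32(in);
    float32x4_t rounded = vsubq_f32(vaddq_f32(v, magic), magic);
    uint32x4_t is_small = vcltq_f32(v, magic);      /* all-ones where v < 2^23 */
    vst1q_f32(out, vbslq_f32(is_small, rounded, v)); /* pick rounded or v per lane */
#else
    for (int i = 0; i < 4; ++i) {
        volatile float sum = in[i] + 0x1p23f;  /* volatile forces float rounding */
        float rounded = sum - 0x1p23f;
        out[i] = (in[i] < 0x1p23f) ? rounded : in[i];
    }
#endif
}
```

The compare and select add two instructions per vector, which may or may not be acceptable depending on whether such large inputs can occur at all.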

edited Oct 15 '20 at 00:03

answered May 26 '12 at 01:12

Eric Postpischil

Hahahaha. I am an idiot. :) And thanks for the 0x1p23 tip, that's interesting. – Josh Bleecher Snyder May 26 '12 at 06:21
Actually, 0x1p23 results in an incorrect result if the input is in [0x1p47, 0x1p48) and its low bit is one. Then 0x1p23 is half the ULP of the input, so the addition causes rounding upward (to even), and the subtraction has no effect. I think there is a modification to fix that, but I do not have it at hand. – Eric Postpischil May 26 '12 at 19:00
I think the value that is to be added and subtracted is 0x1.8p+23 – kanna Mar 14 '14 at 16:52
@kanna: 0x1.8p+23 does not work with x = 0x1p22 + 1 (4,194,305). Then x + 0x1.8p+23 would be 0x1p24 + 1 (16,777,217) with real arithmetic, but that is not representable, so 0x1p24 (16,777,216) is produced. Then subtracting 0x1.8p+23 gives 0x1p22 (4,194,304). – Eric Postpischil Oct 15 '20 at 00:01
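The arithmetic in this last comment can be checked with a small scalar sketch. The helper `add_then_subtract` is hypothetical and assumes the default round-to-nearest-even mode.

```c
/* Hypothetical helper: adds and then subtracts `magic` in float precision. */
static float add_then_subtract(float x, float magic)
{
    volatile float sum = x + magic;   /* volatile forces rounding to float */
    return sum - magic;
}
```

Calling it with x = 4194305.0f and magic = 0x1.8p+23f loses the low bit of x, whereas magic = 0x1p23f leaves this particular x intact, because the sum stays below 0x1p24 and is exactly representable.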

score 0 · Answer 2 · answered Aug 16 '18 at 10:00

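Rounding a float to an integer means adding 0.5 to positive values and subtracting 0.5 from negative values before the truncating conversion. In NEON this can be done in three steps: 1. extract the sign bit of the value; 2. bitwise-OR it into 0.5, so that 0.5 takes the sign of the input; 3. add the signed 0.5 to the original value: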

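#include <arm_neon.h>
#include <stdint.h>

// addSignedHalf is a wrapper name added here so the snippet compiles.
// It returns inputFloat with a same-signed 0.5 added; a truncating conversion
// such as vcvtq_s32_f32 on the result then completes the rounding.
float32x4_t addSignedHalf(float32x4_t inputFloat)
{
    // 1. extract the sign bit of the original value
    int32x4_t reinterpretInt = vreinterpretq_s32_f32(inputFloat);
    int32x4_t signExtract = vdupq_n_s32(INT32_MIN);  // 0x80000000, sign bit only
    int32x4_t signSignal = vandq_s32(reinterpretInt, signExtract);

    // 2. bitwise-OR the sign into 0.5 so that 0.5 takes the sign of the input
    float32x4_t roundValue = vdupq_n_f32(0.5f);
    float32x4_t plusValue = vreinterpretq_f32_s32(vorrq_s32(vreinterpretq_s32_f32(roundValue), signSignal));

    // 3. add the signed 0.5 to the original value
    return vaddq_f32(inputFloat, plusValue);
}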
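For comparison, a compact variant of the same idea, offered as a sketch rather than as this answer's method: `vbslq_f32` with a sign-bit mask copies the sign from the input and all other bits from 0.5, and `vcvtq_s32_f32` then truncates toward zero. The name `round4_half_away` and the array interface are hypothetical; the scalar branch is a fallback for non-ARM builds.

```c
#include <stdint.h>
#if defined(__ARM_NEON) || defined(__ARM_NEON__)
#include <arm_neon.h>
#endif

/* Hypothetical helper: rounds four floats to int32, ties away from zero. */
static void round4_half_away(const float in[4], int32_t out[4])
{
#if defined(__ARM_NEON) || defined(__ARM_NEON__)
    float32x4_t v = vld1q_f32(in);
    /* Take the sign bit from v and every other bit from 0.5f. */
    uint32x4_t sign_mask = vdupq_n_u32(0x80000000u);
    float32x4_t signed_half = vbslq_f32(sign_mask, v, vdupq_n_f32(0.5f));
    /* vcvtq_s32_f32 truncates toward zero, completing the rounding. */
    vst1q_s32(out, vcvtq_s32_f32(vaddq_f32(v, signed_half)));
#else
    for (int i = 0; i < 4; ++i) {
        float half = (in[i] < 0.0f) ? -0.5f : 0.5f;
        out[i] = (int32_t)(in[i] + half);
    }
#endif
}
```

With inputs 39.7, -39.7, -0.4 and -2.5 this sketch yields 40, -40, 0 and -3.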
