Convert uint32 floating point representation to uint8

Question

Without going into too many details, I have two embedded systems, neither of which can use the floating point library. In between them is a mobile application, where some calculations are performed. One calculation needs to keep its precision. This value is sent to the client via Bluetooth as a byte array. When it's received, I store it as a uint32. This value is then shared again with the mobile application, where the precision is needed. There is no problem there because I am able to use Java's ByteBuffer class. The problem is I also need to share this value with a Z80 microprocessor as a uint8, because it is transmitted over a UART and is used as an ADC count (0-255), so it loses the precision (but it's not needed on this end).

So I am doing this in my mobile app using Java:

int bits = Float.floatToIntBits(fAdcCount);
byte[] b = new byte[4];
b[0] = (byte)(bits & 0xff);
b[1] = (byte)((bits >> 8) & 0xff);
b[2] = (byte)((bits >> 16) & 0xff);
b[3] = (byte)((bits >> 24) & 0xff);

b is then being sent in a BLE characteristic write to the BLE microcontroller. The BLE microcontroller then reads this buffer as a little-endian 32-bit word and stores it in a uint32. Looking at this uint32 in the debugger shows the correct values, and like I said above I am able to put this uint32 back into a byte array and send it to the mobile app and read it using Java's ByteBuffer class. This works fine, I get the correct floating point value.
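
The little-endian read on the BLE side can be sketched in C roughly as follows. The helper name `load_le32` is hypothetical, and the sketch assumes the four bytes arrive in the order the Java code wrote them (least significant byte first).

```c
#include <stdint.h>

/* Hypothetical helper: rebuild the 32-bit word from the 4 bytes of the
   BLE write, least significant byte first (b[0]..b[3] as in the Java code). */
uint32_t load_le32(const uint8_t b[4])
{
    return  (uint32_t)b[0]
         | ((uint32_t)b[1] << 8)
         | ((uint32_t)b[2] << 16)
         | ((uint32_t)b[3] << 24);
}
```

For 164.417..., the bytes DD 6A 24 43 give 0x43246ADD. Building the word with shifts, rather than casting the buffer pointer, gives the same result whatever the byte order of the receiving CPU.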

The problem is I need the integer part of this floating point representation to send to a Z80 microprocessor over UART as a uint8 because it is used as an ADC count from 0-255.

So, how can I convert a floating point value that's been wrapped up into a uint32 (little endian byte order) to a uint8, losing the precision? I also know that the range is 0-255, meaning there will never be anything greater than 255.0 to convert.

For example if I have this:

uint32 fValue = 0x43246ADD; // = 164.417...

How can I get this without using float?:

uint8 result = 164;

You would have to know the specific floating point encoding format, which is not mandated by C. Here is [one such format](https://en.wikipedia.org/wiki/Single-precision_floating-point_format). And the integral part may very likely not fit 8 bits anyway. — Weather Vane, Nov 04 '16 at 18:51
Can you explain why you have a floating-point value stored in a `uint32`? And why you're using `uint32` rather than the standard `uint32_t`, defined in `<stdint.h>`? — Keith Thompson, Nov 04 '16 at 19:17
@KeithThompson Sure, I am working with two embedded systems, one being a Z80 architecture that needs the result as a uint8 (0-255), and the other which is storing the data as uint32; this cannot be changed. I ran into an issue where I needed floating point precision from a calculation and needed to store it, but was unable to transmit it as a float, so that's why it's been wrapped up into a uint32. However, when sharing this data back with the Z80 microprocessor, it cannot use the floating point library or a uint32; it must be sent as a uint8 (0-255). — DigitalNinja, Nov 04 '16 at 19:33
And how is the floating-point value represented in a `uint32`? Which bits represent the sign, exponent, and mantissa, and how? — Keith Thompson, Nov 04 '16 at 19:34
I need to update my question before I waste more people's time. — DigitalNinja, Nov 04 '16 at 20:01
Note that `float` to `uint8_t` is well defined over a slightly wider range than `[0 to 255.0]`. Instead `[-0.9999... to +255.999...]` — chux - Reinstate Monica, Nov 04 '16 at 21:15
If code is starting from a 32-bit unsigned type, endian is no concern. If code is starting from a 4-byte array, then endian issues apply. Endian of `float` is _typically_ in the same endian as `uint32_t`, but not specified to be so. Of course the whole format of `float` is not C specific. Best to state the FP format like [binary32](https://en.wikipedia.org/wiki/Single-precision_floating-point_format) — chux - Reinstate Monica, Nov 04 '16 at 21:21
perhaps you could use: float fvalue; int ivalue = (int)floor(fvalue); — user3629249, Nov 06 '16 at 01:09

chux - Reinstate Monica · Accepted Answer · 2016-11-04T22:32:55.630

Some lightly tested code to get OP started. It uses lots of constants to allow customization and various degrees of error checking. OP has not addressed rounding: round to nearest, toward 0, or something else? The following truncates toward 0.

#include <stdint.h>
#define MANTISSA_BIT_WIDTH 23
#define BIASED_EXPO_MAX 255
#define EXPO_BIAS 127
#define SIGN_MASK 0x80000000

unsigned DN_float_to_uint8(uint32_t x) {
  if (x & SIGN_MASK) return 0; // negative
  int expo = (int) (x >> MANTISSA_BIT_WIDTH);
  if (expo == 0) return 0;  // sub-normal
  if (expo == BIASED_EXPO_MAX) return 255;  // Infinity, NaN
  expo -= EXPO_BIAS;
  if (expo > 7) return 255; // too big
  if (expo < 0) return 0; // wee number
  uint32_t mask = ((uint32_t)1 << MANTISSA_BIT_WIDTH) - 1;
  uint32_t mantissa = mask & x;
  mantissa |= mask + 1;
  mantissa >>= (MANTISSA_BIT_WIDTH - expo);
  return mantissa;
}

#include <stdio.h>
int main() {
  printf("%u\n", DN_float_to_uint8(0x43246a00)); // 164
  printf("%u\n", DN_float_to_uint8(0x437e0000)); // 254
  printf("%u\n", DN_float_to_uint8(0x437f0000)); // 255
  printf("%u\n", DN_float_to_uint8(0x43800000)); // 256
  printf("%u\n", DN_float_to_uint8(0x3f7fffff)); // 0.99999994
  printf("%u\n", DN_float_to_uint8(0x3f800000)); // 1
  printf("%u\n", DN_float_to_uint8(0x40000000)); // 2
  return 0;
}

Output

164
254
255
255
0
1
2

Useful IEEE 754 Converter
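
As a worked example of what the function above does, the sketch below splits a bit pattern into its three fields. The struct and function names are hypothetical; the layout (1 sign bit, 8 exponent bits biased by 127, 23 fraction bits) is the IEEE 754 binary32 layout this answer assumes.

```c
#include <stdint.h>

/* Hypothetical container for the three binary32 fields. */
struct binary32_fields {
    unsigned sign;        /* 1 bit */
    int      exponent;    /* unbiased: stored value minus 127 */
    uint32_t significand; /* 23 fraction bits plus the implicit leading 1 */
};

/* Only meaningful for normal numbers (stored exponent 1..254). */
struct binary32_fields split_binary32(uint32_t x)
{
    struct binary32_fields f;
    f.sign        = (unsigned)(x >> 31);
    f.exponent    = (int)((x >> 23) & 0xFF) - 127;
    f.significand = (x & 0x7FFFFFu) | 0x800000u;
    return f;
}
```

For 0x43246ADD this gives sign 0, exponent 7, and significand 0xA46ADD. The value is 0xA46ADD × 2^(7 − 23), so shifting the significand right by 23 − 7 = 16 bits leaves the integer part, 0xA4 = 164.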

To round positive values to nearest (many ways to cope), ties away from zero, just look at the last bit to be shifted out.

  // mantissa >>= (MANTISSA_BIT_WIDTH - expo);
  // return mantissa;

  // shift as needed except for 1 bit
  mantissa >>= (MANTISSA_BIT_WIDTH - expo - 1);
  // now shift last bit
  mantissa = (mantissa + 1) >> 1;
  // Handle special case
  if (mantissa >= 256) mantissa = 255;
  return mantissa;
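
Put together as one function, the rounding variant might look like the sketch below. The function name is hypothetical and the masks are written out as literals for brevity. One thing here goes beyond the fragment above: values in [0.5, 1.0) are rounded up to 1, whereas keeping the earlier `expo < 0` check would return 0 for them.

```c
#include <stdint.h>

/* Hypothetical name. Rounds to nearest, ties away from zero, and clamps
   the result to 0..255. Assumes x holds IEEE 754 binary32 bits. */
uint8_t binary32_bits_to_uint8_rounded(uint32_t x)
{
    if (x & 0x80000000u) return 0;            /* negative values and -0.0 */
    int expo = (int)(x >> 23);                /* biased exponent; sign bit is 0 here */
    if (expo == 255) return 255;              /* infinity, NaN */
    expo -= 127;
    if (expo > 7) return 255;                 /* 256.0 and above */
    if (expo < -1) return 0;                  /* below 0.5, including zero and subnormals */
    if (expo == -1) return 1;                 /* [0.5, 1.0) rounds up to 1 */
    uint32_t m = (x & 0x7FFFFFu) | 0x800000u; /* restore the implicit leading 1 */
    m >>= (23 - expo - 1);                    /* keep one extra fractional bit */
    m = (m + 1) >> 1;                         /* add one half, then drop that bit */
    return (uint8_t)(m >= 256 ? 255 : m);     /* 255.5 and up would round to 256 */
}
```

With this sketch, 0x43246ADD (164.417...) yields 164, 0x43248000 (164.5) yields 165, and 0x437F8000 (255.5) is clamped to 255.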

Thanks! You bring up a good point, and at first I thought it doesn't really matter, but on second thought I would like for it to round to the nearest whole integer. I have tried figuring out how to modify your code to do that but my brain feels like mush and I'm not seeing it. — DigitalNinja, Nov 04 '16 at 22:15
@DigitalNinja "round to the nearest whole integer" and what about when equidistant? like 3.5 or 4.5 or -0.5? — chux - Reinstate Monica, Nov 04 '16 at 22:18
Rounding up on equidistant values would work out fine, but I'm not super concerned with which direction it goes if one way is easier than the other. — DigitalNinja, Nov 04 '16 at 22:22
Dang, I was close with my modifications. Thanks again! You're a smart dude ;) — DigitalNinja, Nov 04 '16 at 22:33

score 2 · Answer 2 · answered Nov 04 '16 at 21:00

So you want to extract the integer part of an IEEE 754 single precision float, with no floating point operations available.

All you need is a few bitwise operations to extract the exponent and mantissa. Lightly tested code. This code supports signed values and returns a value between -255 and 255, or INT_MIN if the absolute value is outside the permitted range; adjust your support for sign, behavior in case of overflow, etc. as needed.

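#include <limits.h>  // INT_MIN
#include <stdint.h>  // uint32_t, uint8_t

int integer_part_of_single_float(uint32_t f)
{
    uint32_t mantissa = (f & 0x7fffff) | 0x800000;  // restore the implicit leading 1
    uint8_t exponent = f >> 23;                     // biased exponent; conversion to uint8_t drops the sign bit
    if (exponent < 127) {
        return 0;                                   // absolute value below 1
    } else if (exponent >= 135) {
        return INT_MIN;                             // absolute value 256 or more, infinity, or NaN
    } else {
        int absolute_value = (int)(mantissa >> (22 - (exponent - 128)));
        // Test the sign bit of f itself; it has already been masked out of mantissa.
        return (f & 0x80000000) ? -absolute_value : absolute_value;
    }
}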

Paolo · Answer 3 · 2016-11-04T19:54:21.063

Assuming the float stored inside the uint32_t has the same representation as the float type on your architecture, and that your floats are 32 bits wide (sizeof(float) == 4, which is very common), you can do something like this:

float *f;

f = (float *)&fValue;

if( *f <= 255 && *f >= 0 )
{
    result = (uint8_t) (*f);
}
else
{
    // overflow / underflow
}

You declare a float pointer and point it to the location of the uint32_t.

Then take the value pointed to by the float pointer and cast it to uint8_t.

I tried the code and can tell the assumptions above are true on macOS.

For example:

#include <stdio.h>
#include <stdint.h>  // uint32_t, uint8_t
#include <stdlib.h>

int main()
{
    uint32_t fValue = 0x43246ADD;
    uint8_t result=0;

    float *f;

    f = (float *)&fValue;

    if( *f <= 255 && *f >= 0 )
    {
        result = (uint8_t) (*f);
    }
    else
    {
        // overflow / underflow
    }

    printf("%d\n", result);
}

Outputs:

164
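
One caveat about the pointer cast in this answer: it reads a uint32_t object through a float lvalue, which the C standard's aliasing rules do not allow, so an optimizing compiler is not obliged to make it work. A sketch of the same idea using memcpy follows. The function name and the clamping policy are hypothetical choices, and it still relies on the same assumption that float is 32-bit IEEE 754 with the same byte order as uint32_t.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical helper; assumes float is IEEE 754 binary32 with the same
   byte order as uint32_t, the same assumption the pointer cast relies on. */
uint8_t float_bits_to_uint8_via_memcpy(uint32_t bits)
{
    float f;
    memcpy(&f, &bits, sizeof f);   /* copy the representation instead of aliasing it */
    if (f >= 0.0f && f < 256.0f) {
        return (uint8_t)f;         /* conversion truncates toward zero */
    }
    return f >= 256.0f ? 255 : 0;  /* clamp; NaN and negative values map to 0 */
}
```

Like the original, this needs floating point support on the target, so it only applies where a float type is actually usable.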

I really appreciate your time and effort, and I'm sure this would work, but I failed to realize that I cannot use floating point types on either side. — DigitalNinja, Nov 04 '16 at 19:59
