Changing Mantissa's Width in Non-IEEE Floating Point implementation

Question

I have a GCC cross compiler for an 18-bit soft-core processor target that has the following data types defined: integer 18 bits, long 36 bits, and float 36 bits (single precision). Right now my focus is on floating-point operations. Since the width is non-standard (36 bits), I have the following scheme: 27 bits for the mantissa (significand), 8 bits for the exponent, and 1 sign bit. I can see the widths are defined in the float.h file. Of interest to me are FLT_MANT_DIG and FLT_DIG. They are defined as:

FLT_MANT_DIG 24
FLT_DIG 6

I have changed them to

FLT_MANT_DIG 28
FLT_DIG 9

I made this change in float.h as per my requirements and then rebuilt the GCC compiler. But I still get 32-bit floating-point output. Does anyone have experience implementing non-standard single-precision floating-point numbers, and/or know a workaround?

`FLT_DIG 9` is incorrect. With 28 (27 explicit + 1 implied) bits of binary significand, use `FLT_DIG 8`. "⎣(p − 1) log10 b⎦" C11dr §5.2.4.2.2 11 — chux - Reinstate Monica, Sep 11 '14 at 23:32
Please elaborate on "still I get 32 bit floating point output." — chux - Reinstate Monica, Sep 11 '14 at 23:33
BTW: If your FP is really non-standard, it might not use an implied MSBit, in which case use `FLT_MANT_DIG 27 FLT_DIG 7`. Do you have a reference for that 36-bit FP type? — chux - Reinstate Monica, Sep 11 '14 at 23:49
You have it backward. The `float.h` constants describe the floating point format that the rest of the compiler assumes so that _your_ programs can use this information. Changing `float.h` will have absolutely no effect on the compiler itself. — Gene, Sep 11 '14 at 23:51
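
The formula quoted in the comments, ⎣(p − 1) log10 b⎦, can be checked with a few lines of C. This is only a sketch: it assumes radix b = 2, and the function name `decimal_digits` is made up for illustration.

```c
/* Decimal digits guaranteed to survive a round trip for a binary
   significand of p bits: floor((p - 1) * log10(2)).
   Assumes radix 2; decimal_digits is a hypothetical helper name. */
int decimal_digits(int p)
{
    const double log10_2 = 0.30102999566398120;
    /* The product is non-negative for p >= 1, so the cast truncates
       toward zero exactly like floor() would. */
    return (int)((p - 1) * log10_2);
}
```

For p = 24 this gives 6 (the stock `FLT_DIG`), for p = 28 it gives 8, and for p = 27 it gives 7, which matches the values suggested in the comments.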

score 1 · Answer 1 · answered Sep 11 '14 at 23:01

Efficient floating-point math requires hardware which is designed to support the exact floating-point formats which are being used. In the absence of such hardware, routines which are designed around a particular floating-point format will be much more efficient than routines which are readily adaptable to other formats. The GCC compiler and supplied libraries are designed to operate efficiently with IEEE-754 floating-point types and are not particularly adaptable to any others. The aforementioned headers exist not to allow a programmer to request a particular floating-point format, but merely to notify code about what format is going to be used.
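
To see what the compiler itself believes about `float`, a program can report the `float.h` values next to the actual storage size. The following is a minimal sketch; the function name `describe_float` is hypothetical, and the output depends entirely on the compiler and target it is built with.

```c
#include <float.h>
#include <limits.h>
#include <stdio.h>

/* Writes the compiler's own description of float into buf.
   Returns the value of snprintf (characters needed, negative on error).
   describe_float is a hypothetical helper name. */
int describe_float(char *buf, size_t n)
{
    return snprintf(buf, n,
                    "FLT_MANT_DIG=%d FLT_DIG=%d storage_bits=%d",
                    FLT_MANT_DIG, FLT_DIG,
                    (int)(sizeof(float) * CHAR_BIT));
}
```

If the reported storage size and significand width do not agree with the edited header, that suggests the header no longer describes what the compiler actually generates.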

If you don't need 72-bit floating-point types, and if the compiler's double type will perform 64-bit math in something resembling sensible fashion even though long is 36 bits rather than 32, you might be able to arrange things so that float values get unpacked into a four-word double, perform computations using that, and then rearrange the bits of the double to yield a float. Alternatively, you could write or find 36-bit floating-point libraries. I would not particularly expect GCC or its libraries to include such a thing, since 36-bit processors are rather rare these days.
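
The unpack, compute, repack idea could look roughly like the sketch below. Everything about the 36-bit layout here is an assumption, not something GCC provides: sign in bit 35, an 8-bit exponent with bias 127, a 27-bit fraction with an implied leading 1, and the value held in the low 36 bits of a `uint64_t`. It also assumes `double` is IEEE-754 binary64, truncates instead of rounding, and ignores subnormals, infinities and NaNs. The names `f36_t`, `f36_to_double`, `double_to_f36` and `f36_add` are hypothetical.

```c
#include <stdint.h>
#include <string.h>

typedef uint64_t f36_t;            /* hypothetical: low 36 bits used */

#define F36_FRAC_BITS 27
#define F36_EXP_BIAS  127          /* assumed bias */
#define DBL_FRAC_BITS 52
#define DBL_EXP_BIAS  1023

/* Widen a 36-bit value to double; exponent field 0 is treated as zero. */
double f36_to_double(f36_t x)
{
    uint64_t sign = (x >> 35) & 1u;
    uint64_t exp  = (x >> F36_FRAC_BITS) & 0xFFu;
    uint64_t frac = x & ((UINT64_C(1) << F36_FRAC_BITS) - 1);
    uint64_t bits = sign << 63;
    double d;

    if (exp != 0) {
        bits |= (exp + (DBL_EXP_BIAS - F36_EXP_BIAS)) << DBL_FRAC_BITS;
        bits |= frac << (DBL_FRAC_BITS - F36_FRAC_BITS);
    }
    memcpy(&d, &bits, sizeof d);   /* reinterpret the bit pattern */
    return d;
}

/* Narrow a double to 36 bits by truncating the low fraction bits. */
f36_t double_to_f36(double d)
{
    uint64_t bits;
    memcpy(&bits, &d, sizeof bits);

    uint64_t sign = bits >> 63;
    int64_t  exp  = (int64_t)((bits >> DBL_FRAC_BITS) & 0x7FF)
                    - DBL_EXP_BIAS + F36_EXP_BIAS;
    uint64_t frac = (bits & ((UINT64_C(1) << DBL_FRAC_BITS) - 1))
                    >> (DBL_FRAC_BITS - F36_FRAC_BITS);

    if (exp <= 0)                  /* zero or underflow: flush to zero */
        return sign << 35;
    if (exp > 255) {               /* overflow: saturate to largest value */
        exp  = 255;
        frac = (UINT64_C(1) << F36_FRAC_BITS) - 1;
    }
    return (sign << 35) | ((uint64_t)exp << F36_FRAC_BITS) | frac;
}

/* Add two 36-bit values by doing the arithmetic in double. */
f36_t f36_add(f36_t a, f36_t b)
{
    return double_to_f36(f36_to_double(a) + f36_to_double(b));
}
```

Under these assumptions, 1.5 encodes as 0x3FC000000, and adding the encodings of 1.5 and 2.25 decodes back to 3.75. On the real 18-bit target a 64-bit integer type may not exist, so the same bit shuffling would have to be done on whatever word types the port offers.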
