Only the _FloatN_t types (e.g. _Float32_t) are aliases, and they come from the <math.h> header. All the other types are required to be distinct, and their names are keywords. (See H.5.1 [Keywords])
All of the types fall into one of four categories (see below). Choose between them as follows:
- float, double, and long double, if you are satisfied with the very lenient requirements of these types
  - alternatively, check whether __STDC_IEC_60559_BFP__ is defined, which makes them stricter
  - also, use float and double if you are okay with them being the same size 1)
  - also, you must use these types for compatibility with pre-C23 compilers
- _FloatN if you need a specific IEC 60559 type with exactly N bits
- _FloatNx if you need an extended IEC 60559 type with at least N bits of precision
  - especially if you want to store N-bit integers in a floating-point number with no loss
- _FloatN_t if you don't need IEC 60559 types, and you are not satisfied with the minimum requirements for float and double
1) On architectures without a double-precision FPU (e.g. Arduino), float and double might be the same size. Use other types (e.g. _Float64_t over double) if you want software emulation of double precision instead.
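The choice above can be made at compile time. The following is a minimal sketch, assuming the implementation follows Annex H and exposes FLT64_MANT_DIG in <float.h> once __STDC_WANT_IEC_60559_TYPES_EXT__ is defined. The names real64 and real64_mant_dig are hypothetical and exist only for illustration.

```c
/* Request the Annex H macros; this must precede the first #include. */
#define __STDC_WANT_IEC_60559_TYPES_EXT__ 1
#include <assert.h>
#include <float.h>

#ifdef FLT64_MANT_DIG
/* _Float64 is available: exactly IEC 60559 binary64. */
typedef _Float64 real64; /* hypothetical name */
#else
/* Fall back to double, but refuse to build where double is too narrow
   (for example, a target where double has the same format as float). */
static_assert(FLT_RADIX == 2 && DBL_MANT_DIG >= 53,
              "double has fewer than 53 significand bits");
typedef double real64; /* hypothetical name */
#endif

/* Number of significand bits that real64 provides on this target. */
int real64_mant_dig(void)
{
#ifdef FLT64_MANT_DIG
    return FLT64_MANT_DIG;
#else
    return DBL_MANT_DIG;
#endif
}
```

On a pre-C23 compiler the #else branch is taken, so the same source still builds there as long as double is wide enough.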
Standard floating types
float, double, and long double are collectively called standard floating types. Their representation is implementation-defined, but there are some requirements nonetheless:
- double must be able to represent any float, and long double must be able to represent any double
- if __STDC_IEC_60559_BFP__ is defined, float and double are represented like _Float32 and _Float64
- they must be able to represent a minimum number of decimal digits with no loss, and must cover a minimum range of values:
| Type | Minimum Decimal Digits | Minimum | Maximum |
|---|---|---|---|
| float | FLT_DECIMAL_DIG ≥ 6 | FLT_MIN ≤ 10^-37 | FLT_MAX ≥ 10^37 |
| double | DBL_DECIMAL_DIG ≥ 10 | DBL_MIN ≤ 10^-37 | DBL_MAX ≥ 10^37 |
| long double | LDBL_DECIMAL_DIG ≥ 10 | LDBL_MIN ≤ 10^-37 | LDBL_MAX ≥ 10^37 |
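The limits in the table can be checked against a given implementation. This is a small sketch rather than anything the standard prescribes: the digit counts are integer constants and can go into static_assert, while the range limits are floating-point values and are compared at run time here. The function name standard_ranges_hold is hypothetical.

```c
#include <assert.h>
#include <float.h>

/* Integer-valued limits can be verified at compile time. */
static_assert(FLT_DECIMAL_DIG >= 6, "float: too few decimal digits");
static_assert(DBL_DECIMAL_DIG >= 10, "double: too few decimal digits");
static_assert(LDBL_DECIMAL_DIG >= 10, "long double: too few decimal digits");

/* Range limits are floating-point values, so they are compared at run time.
   Returns 1 if every range requirement from the table holds, 0 otherwise. */
int standard_ranges_hold(void)
{
    return FLT_MIN <= 1e-37F && FLT_MAX >= 1e37F
        && DBL_MIN <= 1e-37 && DBL_MAX >= 1e37
        && LDBL_MIN <= 1e-37L && LDBL_MAX >= 1e37L;
}
```

A conforming implementation should pass all of these, so the checks are mostly useful as documentation of what portable code may rely on.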
Usually, float and double are binary32 and binary64 types respectively, and long double is either binary128, an x87 80-bit extended floating-point number, or represented the same as double.
See C23 Standard - E [Implementation limits]
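Which of the usual long double formats a target uses can be read off the <float.h> macros. The sketch below assumes the common parameters (64 significand bits for the x87 format, 113 for binary128, both with a maximum exponent of 16384). The enum and the function name are hypothetical.

```c
#include <assert.h>
#include <float.h>

/* long double must represent every double, so it cannot be less precise. */
static_assert(LDBL_MANT_DIG >= DBL_MANT_DIG,
              "long double is less precise than double");

enum ldbl_format {
    LDBL_SAME_AS_DOUBLE,
    LDBL_X87_EXTENDED,
    LDBL_BINARY128,
    LDBL_OTHER
};

/* Classifies long double by its significand width and exponent range. */
enum ldbl_format long_double_format(void)
{
    if (FLT_RADIX != 2)
        return LDBL_OTHER;
    if (LDBL_MANT_DIG == DBL_MANT_DIG && LDBL_MAX_EXP == DBL_MAX_EXP)
        return LDBL_SAME_AS_DOUBLE;
    if (LDBL_MANT_DIG == 64 && LDBL_MAX_EXP == 16384)
        return LDBL_X87_EXTENDED;
    if (LDBL_MANT_DIG == 113 && LDBL_MAX_EXP == 16384)
        return LDBL_BINARY128;
    return LDBL_OTHER; /* any other representation */
}
```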
Interchange floating types
_Float32, _Float64, etc. are so-called interchange floating types. Their representation must follow an IEC 60559 interchange format for binary floating-point numbers, such as binary32 or binary64. Any _FloatN type must be exactly N bits wide.
The types _Float32 and _Float64 are only guaranteed to exist if the implementation defines both __STDC_IEC_60559_BFP__ and __STDC_IEC_60559_TYPES__. If so:
- _Float32 exists, and float has the same size and alignment as it (but is a distinct type)
- _Float64 exists, and double has the same size and alignment as it (but is a distinct type)
- a wider _FloatN (typically _Float128) exists if long double is a binaryN type with N > 64
See C23 Standard - H.2.1 [Interchange floating types].
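The "same layout, distinct type" relationship can be made visible with _Generic, which rejects two associations naming the same type. This is a sketch under the assumption that <float.h> defines FLT32_MANT_DIG exactly when _Float32 is available. KIND_OF and enum kind are hypothetical names.

```c
/* Request the Annex H macros; this must precede the first #include. */
#define __STDC_WANT_IEC_60559_TYPES_EXT__ 1
#include <assert.h>
#include <float.h>
#include <limits.h>

enum kind { KIND_FLOAT, KIND_FLOAT32, KIND_OTHER };

#ifdef FLT32_MANT_DIG
/* An interchange type is exactly N bits wide and uses the binaryN format. */
static_assert(sizeof(_Float32) * CHAR_BIT == 32, "_Float32 is not 32 bits");
static_assert(FLT32_MANT_DIG == 24, "binary32 has 24 significand bits");
#ifdef __STDC_IEC_60559_BFP__
/* With Annex F support, float shares size and alignment with _Float32. */
static_assert(sizeof(float) == sizeof(_Float32), "size differs");
static_assert(_Alignof(float) == _Alignof(_Float32), "alignment differs");
#endif
/* Listing float and _Float32 together compiles only because they are
   distinct types. */
#define KIND_OF(x) _Generic((x), float: KIND_FLOAT, \
                                 _Float32: KIND_FLOAT32, \
                                 default: KIND_OTHER)
#else
/* No _Float32 on this implementation: only float can be recognized. */
#define KIND_OF(x) _Generic((x), float: KIND_FLOAT, default: KIND_OTHER)
#endif
```

With this in place, KIND_OF(1.0f) evaluates to KIND_FLOAT, and KIND_OF((_Float32)1.0f) evaluates to KIND_FLOAT32 where the type exists.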
Extended floating types
_Float32x, _Float64x, etc. are so-called extended floating types (named after IEC 60559 extended precision). Unlike their interchange counterparts, they only have minimum requirements for their representation, not exact requirements. A _FloatNx type must have at least N bits of precision, which makes it able to represent every N-bit integer with no loss.
These types are only guaranteed to exist if the implementation defines both __STDC_IEC_60559_TYPES__ and __STDC_IEC_60559_BFP__. If so:
- _Float32x exists, and may have the same format as double (but is a distinct type)
- _Float64x optionally exists, and may have the same format as long double (but is a distinct type)
- _Float128x optionally exists
The extra precision and range often mitigate round-off error and eliminate overflow and underflow in intermediate computations.
See C23 Standard - H.2.3 [Extended floating types]
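The integer-storage property of extended types can be illustrated as follows. This is a sketch, assuming FLT32X_MANT_DIG is defined in <float.h> exactly when _Float32x is available. The fallback to double and the names wide32, wide32_holds_exactly, and float_holds_exactly are choices made for this example only.

```c
/* Request the Annex H macros; this must precede the first #include. */
#define __STDC_WANT_IEC_60559_TYPES_EXT__ 1
#include <assert.h>
#include <float.h>
#include <stdint.h>

#ifdef FLT32X_MANT_DIG
typedef _Float32x wide32; /* hypothetical name */
static_assert(FLT32X_MANT_DIG >= 32, "_Float32x needs >= 32 bits of precision");
#else
/* Assumption: without _Float32x, use double if it is precise enough. */
typedef double wide32;
static_assert(FLT_RADIX == 2 && DBL_MANT_DIG >= 32, "double is too narrow");
#endif

/* Returns 1 if v survives a round trip through wide32 unchanged. */
int wide32_holds_exactly(uint32_t v)
{
    volatile wide32 w = (wide32)v; /* exact: at least 32 bits of precision */
    return (uint32_t)w == v;
}

/* Returns 1 if v is exactly representable as a float. The comparison is
   done in wide32 so that no out-of-range conversion back to an integer
   can occur. */
int float_holds_exactly(uint32_t v)
{
    volatile float f = (float)v; /* may round to a nearby value */
    return (wide32)f == (wide32)v;
}
```

Where float is binary32, float_holds_exactly(16777217u) returns 0, because 2^24 + 1 does not fit in a 24-bit significand, whereas wide32_holds_exactly returns 1 for every uint32_t value.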
Aliases
_Float32_t, _Float64_t, etc. are aliases for other floating types, so that:
- _FloatN_t has at least the range and precision of the corresponding real floating type (e.g. _Float32_t has at least the range and precision of _Float32 if it exists)
- a wider type can represent all values of a narrower one (e.g. _Float64_t can represent every _Float32_t value)
See C23 Standard - H.11 [Mathematics <math.h>].
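Because these names are typedefs, their presence cannot be tested with #ifdef. The sketch below therefore relies on a hypothetical HAVE_FLOATN_T macro that a build system would have to define, and otherwise falls back to the C99 types float_t and double_t, which give analogous guarantees relative to float and double. The names eval32, eval64, and widening_is_exact are also hypothetical.

```c
/* Request the Annex H declarations; this must precede the first #include. */
#define __STDC_WANT_IEC_60559_TYPES_EXT__ 1
#include <assert.h>
#include <math.h>

#ifdef HAVE_FLOATN_T /* hypothetical macro, set by the build system */
typedef _Float32_t eval32;
typedef _Float64_t eval64;
#else
/* Assumption: float_t and double_t (C99) serve as stand-ins. double_t is
   at least as wide as float_t, mirroring the rule for the aliases. */
typedef float_t eval32;
typedef double_t eval64;
#endif

/* Returns 1 if widening an eval32 value to eval64 preserves it. */
int widening_is_exact(float x)
{
    assert(!isnan(x)); /* NaN never compares equal, so it is excluded */
    eval32 narrow = x;
    eval64 wide = narrow; /* the wider type must represent every value */
    return (eval32)wide == narrow;
}
```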