How to get the lowest representable floating point value in C++

Question

I have a program where I need to set a variable to the lowest representable (non infinite) double-precision floating point number in C++. How am I able to set a variable to the lowest double-precision floating point value?

I tried using std::numeric_limits. I am not using C++11 so I am unable to try to use the lowest() function. I tried to use max(), but when I tried it, it returned infinity. I also tried subtracting a value from max() in the hope that I would then get a representable number.

double max_value = std::numeric_limits<double>::max();
cout << "Test 1: " << max_value << endl;    
max_value = max_value - 1;
cout << "Test 2: " << max_value << endl;
double low_value = - std::numeric_limits<double>::max();
cout << "Test 3: " << low_value << endl;
cout << "Test 4: " << low_value + 1 << endl;

Output:

Test 1: inf
Test 2: inf
Test 3: -inf
Test 4: -inf

How can I set low_value in the example above to the lowest representable double?

18.3.2.4 of the (C++11) standard says that max() is finite. So I guess the double or the ostream implementation is buggy. Or you can try to fiddle with compiler options like "exact float computations" equivalent. — Peter - Reinstate Monica, Jul 10 '14 at 18:46
@PeterSchneider OP: "... I am not using C++11 ..." ........... — Theolodis, Jul 10 '14 at 18:52
Just out of curiosity, what would `cout << std::numeric_limits<double>::max()` print? The gcc wiki has some information about issues arising from 80 bit precision hardware vs. memory layout of doubles but I'm not sure it really applies here: https://gcc.gnu.org/wiki/FloatingPointMath — Peter - Reinstate Monica, Jul 10 '14 at 18:53
@Theolodis it's the one I have here; I cannot imagine that `max()`s definition has changed between standards. It's not `max_plus_epsilon()` after all ;-) — Peter - Reinstate Monica, Jul 10 '14 at 18:54
'cout << std::numeric_limits<double>::max()' printed infinity also. There must be something wrong with my iostream. — Zachary Miller, Jul 10 '14 at 19:02
You could assign it to long double and print then! -- Ah, just read your answer. Nice. -- or not. — Peter - Reinstate Monica, Jul 10 '14 at 19:23

vinc17 · Accepted Answer · 2019-07-30T15:16:46.580

Once you have -inf (which you already do), you can get the lowest finite value by calling the nextafter function with the arguments (-inf, 0).

EDIT: Depending on the context, this may be better than -DBL_MAX in case DBL_MAX is represented in decimal (thus in an inexact way). However the C standard requires that floating constants be evaluated in the default rounding mode (i.e. to nearest). In the particular case of GCC, DBL_MAX is a long double value cast to double; however the long double value seems to have enough digits so that, once converted from decimal to long double, the value is exactly representable as a double, so that the cast is exact and the active rounding mode doesn't affect it. As you can see, this is rather tricky, and one may want to check on various platforms that it is correct under any context. In a similar way, I have serious doubts about the correctness of the definition of DBL_EPSILON by GCC on PowerPC (where the long double type is implemented as a double-double arithmetic) since there are many long double values extremely close to a power of two.

score 0 · Answer 2 · answered Jul 10 '14 at 19:00

The standard library <cfloat>/<float.h> provides macros defining floating point implementation parameters.

The question is somewhat ambiguous - it is not clear whether you mean the smallest magnitude representable non-zero value (which would be DBL_MIN) or the lowest representable value (given by -DBL_MAX). Either way - take your pick as necessary.

answered Jul 10 '14 at 19:00

Clifford


In general (e.g. on IEEE 754 systems), `DBL_MIN` is not the number that has the smallest magnitude, but the smallest positive _normal_ number. The positive number with the smallest magnitude is `DBL_TRUE_MIN` (well, in C11, I'm not sure about C++), which is different from `DBL_MIN` when you have _subnormals_. – vinc17 Jul 11 '14 at 01:55
@vinc17 : It seems that -DBL_MAX is what is required in any case. – Clifford Jul 11 '14 at 08:09
Yes, however there may be implementation issues with `DBL_MAX`: under some conditions, it may yield an incorrect value. – vinc17 Jul 11 '14 at 08:39
@vinc17 : As you say in your answer, depending on the purpose that may not matter - floating point is not simple to get right in very subtle ways in many cases - up-voted your answer. – Clifford Jul 11 '14 at 09:55
I've edited my answer again because the definition of GCC actually appears to be correct, though it is rather fragile (for instance, I'm not sure that it is correct with any compile flag, and it may easily break in the future). – vinc17 Jul 11 '14 at 11:12

score 0 · Answer 3 · edited Jul 30 '19 at 18:17

It turned out that there was a bug in the iostream that I was using to print the values. I switched to using cstdio instead of iostream. The values were then printed as expected.

double low_value = - std::numeric_limits<double>::max();
cout <<"cout: " << low_value << endl;
printf("printf: %f\n",low_value);

Output:

cout: inf
printf: 179769...

edited Jul 30 '19 at 18:17

genpfault


answered Jul 10 '14 at 19:00

Zachary Miller


Oh. Printing the double as int doesn't mean much (what's the inf bit pattern of a double? 179...?) – Peter - Reinstate Monica Jul 10 '14 at 19:25
Use `%f` to print a double – M.M Jul 11 '14 at 02:05
Sorry, that was a typo on my part. I had used %f in my code but wrote %d out of habit here without noticing. – Zachary Miller Jul 11 '14 at 10:50
