17

I'm currently working on a C++ project that does numerical calculations. The vast, vast majority of the code uses single precision floating point values and works perfectly fine with that. Because of this I use a compiler flag (-fsingle-precision-constant) to make unsuffixed floating point literals single precision instead of double precision, which is the default. I find that this makes expressions easier to read, and I don't have to worry about forgetting an 'f' somewhere. However, every now and then I need the extra precision offered by double precision calculations, and my question is how I can get a double precision literal into such an expression. Every way I've tried so far first stores the value in a single precision variable and then converts the truncated value to double precision. Not what I want.

Some of the ways I've tried so far are given below.

#include <iostream>

int main()
{
  std::cout << sizeof(1.0E200) << std::endl;
  std::cout << 1.0E200 << std::endl;

  std::cout << sizeof(1.0E200L) << std::endl;
  std::cout << 1.0E200L << std::endl;

  std::cout << sizeof(double(1.0E200)) << std::endl;
  std::cout << double(1.0E200) << std::endl;

  std::cout << sizeof(static_cast<double>(1.0E200)) << std::endl;
  std::cout << static_cast<double>(1.0E200) << std::endl;

  return 0;
}

A run with single precision constants gives the following results.

~/path$ g++ test.cpp -fsingle-precision-constant && ./a.out
test.cpp:6:3: warning: floating constant exceeds range of ‘float’ [-Woverflow]
test.cpp:7:3: warning: floating constant exceeds range of ‘float’ [-Woverflow]
test.cpp:12:3: warning: floating constant exceeds range of ‘float’ [-Woverflow]
test.cpp:13:3: warning: floating constant exceeds range of ‘float’ [-Woverflow]
test.cpp:15:3: warning: floating constant exceeds range of ‘float’ [-Woverflow]
test.cpp:16:3: warning: floating constant exceeds range of ‘float’ [-Woverflow]
4
inf
16
1e+200
8
inf
8
inf

It is my understanding that the 8 bytes provided by the last two cases should be enough to hold 1.0E200, a theory supported by the following output, where the same program is compiled without -fsingle-precision-constant.

~/path$ g++ test.cpp  && ./a.out
8
1e+200
16
1e+200
8
1e+200
8
1e+200

A possible workaround suggested by the above examples is to use long double literals (the L suffix) everywhere I originally intended to use double precision, and cast to double whenever required by libraries and such. However, this feels a bit wasteful.

What else can I do?

user1637052
  • Untried, but `strtod("1e+200", NULL)` just might be optimized to the double-precision floating-point constant you desire. – Pascal Cuoq Aug 30 '12 at 20:44
  • 4
    I dunno, this sounds a bit like you're creating a problem for yourself. Why not leave it as-is and append `f` to everything that doesn't need double precision? – Mysticial Aug 30 '12 at 20:45
  • 1
    You can put your double constants in a separate file and compile it without the `-fsingle-precision-constant` flag. – Keith Randall Aug 30 '12 at 20:47
  • 2
    What is the point of enforcing single precision constants globally anyway? – Michał Górny Aug 30 '12 at 20:50
  • 2
    Well, you could append "L" to the floating point literal, which denotes a "long double" -- this is not the same if one is pedantic, but it's identical to "double" on every compiler I know. – Damon Aug 30 '12 at 20:52
  • @Damon As you see in the example the size of long double is 16 bytes, while the size of double is 8 bytes. – Michał Wróbel Aug 30 '12 at 21:21
  • @OP - what's wasteful about going with the default (`double`) anyway? Do you have millions of hardcoded floating-point literals to squeeze into an embedded system or something? – Useless Aug 30 '12 at 21:24
  • Is there any reason why a compiler should regard `float f=0.1;` and `float f=0.1f;` any differently? There's no plausible action the programmer could have intended for the first which wouldn't match the second, and if the value is a #define, a habit of attaching a `f` suffix would likely lead to expressions like `double d=0.1f;`, which compilers will accept perfectly happily even though in most cases it's just plain wrong (in a code review, if the behavior of that expression was intentional, I would want it written as either `double d=(double)(float)0.1;` or `double d=(double)(float)0.1;`). – supercat Nov 19 '13 at 23:03

4 Answers

19

Like Mark said, the standard says that a floating point literal is a double unless it's followed by an f.

There are good reasons behind the standard and using compiler flags to get around it for convenience is bad practice.

So, the correct approach would be:

  1. Remove the compiler flag.
  2. Fix all the warnings about loss of precision when storing double values in float variables (add in all the f suffixes).
  3. When you need double, omit the f suffix.

It's probably not the answer you were looking for, but it is the approach you should use if you care about the longevity of your code base.

Carl
  • What are those "good reasons" exactly? It looks more like a limitation which makes that particular flag hard to use. Also, what's wrong with using `double(1.0E200L)`? How does that impact the longevity? – Dmitry Grigoryev Nov 11 '15 at 07:09
  • A code base should not be limited to a specific compiler. A compiler should conform to a standard. A standard should be the one source of truth. It impacts the longevity because not conforming to the standard introduces inconsistencies over time as various programmers work on the code base and each does things in the way that they prefer. This is the reason standards exist. – Carl Nov 15 '15 at 20:17
11

If you read 2.13.3/1 of the C++03 standard you'll see:

The type of a floating literal is double unless explicitly specified by a suffix. The suffixes f and F specify float, the suffixes l and L specify long double.

In other words there is no suffix to specify double for a literal floating point constant if you change the default to float. Unfortunately you can't have the best of both worlds in this case.

Mark B
8

If you can use GCC 4.7 or Clang 3.1 (or newer) with C++11 enabled, use a user-defined literal:

double operator "" _d(long double v) { return v; }

Usage:

std::cout << sizeof(1.0E200_d) << std::endl;
std::cout << 1.0E200_d << std::endl;

Result:

8
1e+200
Michał Wróbel
5

Before C++11 you can't define your own suffix, but maybe a macro like

#define D(x) (double(x##L))

would work for you. The compiler ought to just emit a double constant, and appears to with -O2 on my system.

Geoff Reedy
  • +1 just for checking that the compiler emits a double constant, which I was too lazy to do when I commented on the question to say it would :-) – Steve Jessop Aug 30 '12 at 23:09