
I have a C++ program that can be compiled for single or double precision floating point numbers. Similar to what is explained here (Switching between float and double precision at compile time), I have a header file which defines:

typedef double dtype;

or:

typedef float dtype;

depending on whether single or double precision is required by the user. When declaring variables and arrays I always use the data type dtype, so the correct precision is used throughout the code.

My question is how can I, in a similar fashion, set the data type of hard-coded numbers in the code, like for instance in this example:

dtype var1 = min(var0, 3.65);

As far as I know, the literal 3.65 is double precision by default, and it becomes single precision if I write:

dtype var1 = min(var0, 3.65f);

But is there a way to define a literal, for instance like this:

dtype var1 = min(var0, 3.65_dt);

that can be defined as either float or double at compile time, to ensure that hard-coded numbers in the code also have the right precision?

Currently, I cast the number to dtype like this:

dtype var1 = min(var0, (dtype)3.65);

but I am concerned that this might create overhead in the single-precision case, since the program might actually create a double precision number which is then cast to a single precision number. Is this indeed the case?

  • `constexpr dtype x = 3.65;` `x` will be calculated at compile time. But I would expect that to be the case with your code as well. – john Mar 03 '23 at 20:14
  • Macros really shouldn't be entering the scene at all. You should be able to express all of this with fairly straight forward templates – Brian61354270 Mar 03 '23 at 20:16
  • Concerning the literal: https://en.cppreference.com/w/cpp/language/user_literal – joergbrech Mar 03 '23 at 20:35

1 Answer


You can do this with a macro that appends an `f` suffix for float, as with `#define foo(x) x##f`, and does not for double, as with `#define foo(x) x`.

While you can also coerce constants to become float values with casts or various induced conversions, this creates a double-rounding process: The literal in source text is first converted to double and then converted to float. In about one instance in 2^29, this produces a different result than if the literal is directly converted to float.

(2^29 is due to the difference in the numbers of bits in the significands of the formats commonly used for float and double, 24 and 53. This assumes a uniform distribution for the bit patterns in the representation. Practical data may have a different distribution.)

Eric Postpischil