What's a reasonable strategy for preserving the accuracy of a floating point literal in Rust when using generics? As an example, consider the following code:

// External functions
use std::fmt::LowerExp;

// Some random struct parameterized on a float type
struct Foo<Float> {
    x: Float,
    y: Float,
}

// Some random, generic function
fn bar<Float>(x: Float)
where
    Float: LowerExp + From<f32>,
{
    let foo = Foo::<Float> { x, y: 0.45.into() };
    println!("x = {:.16e} y = {:.16e}", foo.x, foo.y);
}

fn main() {
    bar::<f32>(0.45);
    bar::<f64>(0.45);
}

Here, we get the output:

x = 4.4999998807907104e-1 y = 4.4999998807907104e-1
x = 4.5000000000000001e-1 y = 4.4999998807907104e-1

This makes sense. The first 0.45 is parsed with the concrete type already known, so it is rounded directly to an f32 or an f64. The second, however, is parsed as an f32 in both cases because of the From<f32> bound, and is only widened afterwards. Now, I'd like the output to be

x = 4.4999998807907104e-1 y = 4.4999998807907104e-1
x = 4.5000000000000001e-1 y = 4.5000000000000001e-1

where y is parsed and set using the precision of the generic Float. One attempt is to add a From<f64> bound. However, f32 does not implement From<f64>, so the f32 call fails to compile. If we remove that call and run the code:

// External functions
use std::fmt::LowerExp;

// Some random struct parameterized on a float type
struct Foo<Float> {
    x: Float,
    y: Float,
}

// Some random, generic function
fn bar<Float>(x: Float)
where
    //Float: LowerExp + From<f32>,
    Float: LowerExp + From<f32> + From<f64>,
{
    let foo = Foo::<Float> { x, y: 0.45.into() };
    println!("x = {:.16e} y = {:.16e}", foo.x, foo.y);
}

fn main() {
    //bar::<f32> (0.45);
    bar::<f64>(0.45);
}

we get what we want:

x = 4.5000000000000001e-1 y = 4.5000000000000001e-1

at the expense of no longer working for f32. Alternatively, it's been suggested that we can use the num crate. Unfortunately, unless I'm missing something, this appears to run into an issue with double rounding. The program:

// External functions
use num::FromPrimitive;
use std::fmt::LowerExp;

// Some random struct parameterized on a float type
struct Foo<Float> {
    x: Float,
    y: Float,
}

// Some random, generic function
fn bar<Float>(x: Float)
where
    Float: LowerExp + FromPrimitive,
{
    let foo = Foo::<Float> {
        x,
        y: <Float as FromPrimitive>::from_f64(
            0.5000000894069671353303618843710864894092082977294921875,
        )
        .unwrap(),
    };
    println!("x = {:.16e} y = {:.16e}", foo.x, foo.y);
}

fn main() {
    bar::<f32>(0.5000000894069671353303618843710864894092082977294921875f32);
    bar::<f64>(0.5000000894069671353303618843710864894092082977294921875f64);
}

produces

x = 5.0000005960464478e-1 y = 5.0000011920928955e-1
x = 5.0000008940696716e-1 y = 5.0000008940696716e-1

This is problematic because x and y differ in the f32 case.
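
To spell out where the two f32 values come from, here is a minimal sketch (the function names `parse_direct` and `parse_via_f64` are mine, purely for illustration). The literal sits just below the midpoint between two adjacent f32 values, so parsing it directly as an f32 rounds down. Parsing it as an f64 first lands exactly on that midpoint, and narrowing to f32 then rounds the tie to even, which is the upper neighbour. I'm assuming that `from_f64` for f32 behaves like an `as` cast here; I have not checked the num source.

```rust
// The literal from the example above, parsed directly at f32 precision.
fn parse_direct() -> f32 {
    0.5000000894069671353303618843710864894092082977294921875_f32
}

// The same literal parsed at f64 precision first, then narrowed to f32.
// The `as` cast rounds to nearest, with ties going to the even mantissa.
fn parse_via_f64() -> f32 {
    0.5000000894069671353303618843710864894092082977294921875_f64 as f32
}

fn main() {
    println!("direct  = {:.16e}", parse_direct());
    println!("via f64 = {:.16e}", parse_via_f64());
    println!("equal?    {}", parse_direct() == parse_via_f64());
}
```

The two results are adjacent f32 values, which matches the x and y printed in the f32 case above.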

Anyway, is there a reasonable strategy for ensuring that the floating point literal is rounded directly to the target type in both cases?
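
For concreteness, this is the kind of thing I have in mind, as a rough sketch only. The trait `FromLiteral` and the macro `lit!` are hypothetical names I made up, not part of std or num. The idea is to emit the literal tokens twice so the compiler parses them once as an f32 and once as an f64, and each type picks the one parsed at its own precision. I don't know whether this is considered reasonable or whether there is a cleaner way.

```rust
// Hypothetical helper trait (not from any crate): each float type keeps the
// argument that was parsed at its own precision and ignores the other.
trait FromLiteral {
    fn from_literal(as_f32: f32, as_f64: f64) -> Self;
}

impl FromLiteral for f32 {
    fn from_literal(as_f32: f32, _as_f64: f64) -> Self {
        as_f32
    }
}

impl FromLiteral for f64 {
    fn from_literal(_as_f32: f32, as_f64: f64) -> Self {
        as_f64
    }
}

// Hypothetical macro: the unsuffixed literal is pasted into both argument
// positions, so type inference makes one copy an f32 and the other an f64.
macro_rules! lit {
    ($x:literal) => {
        FromLiteral::from_literal($x, $x)
    };
}

// Same struct as in the question.
struct Foo<Float> {
    x: Float,
    y: Float,
}

// Generic constructor: y comes from the literal, at the precision of Float.
fn make_foo<Float: FromLiteral>(x: Float) -> Foo<Float> {
    Foo {
        x,
        y: lit!(0.5000000894069671353303618843710864894092082977294921875),
    }
}

fn main() {
    let a = make_foo(0.5000000894069671353303618843710864894092082977294921875_f32);
    let b = make_foo(0.5000000894069671353303618843710864894092082977294921875_f64);
    println!("x = {:.16e} y = {:.16e}", a.x, a.y);
    println!("x = {:.16e} y = {:.16e}", b.x, b.y);
}
```

With this, x and y should agree in both the f32 and the f64 case, since neither value goes through an intermediate type. The cost is that every new float type needs its own impl and the macro has to be used at every literal.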

Shepmaster
wyer33
  • Your question might be answered by the answers of [How do I use floating point number literals when using generic types?](https://stackoverflow.com/q/50767912/155423). If not, please **[edit]** your question to explain the differences. Otherwise, we can mark this question as already answered. – Shepmaster Oct 09 '19 at 18:33
  • @Shepmaster Unless I'm missing something, I think the num crate doesn't deal with issues caused by double rounding. I updated the questions with some code that demonstrates this. – wyer33 Oct 09 '19 at 18:56

0 Answers