What's a reasonable strategy for preserving the accuracy of a floating point literal in Rust when using generics? As an example, consider the following code:
// External functions
use std::fmt::LowerExp;
// Some random struct parameterized on a float type
struct Foo<Float> {
    x: Float,
    y: Float,
}
// Some random, generic function
fn bar<Float>(x: Float)
where
    Float: LowerExp + From<f32>,
{
    let foo = Foo::<Float> { x, y: 0.45.into() };
    println!("x = {:.16e} y = {:.16e}", foo.x, foo.y);
}
fn main() {
    bar::<f32>(0.45);
    bar::<f64>(0.45);
}
Here, we get the output:
x = 4.4999998807907104e-1 y = 4.4999998807907104e-1
x = 4.5000000000000001e-1 y = 4.4999998807907104e-1
This makes sense. The first 0.45 was parsed with knowledge of whether Float was f32 or f64. The second, however, was parsed as an f32 in both cases because of the From<f32> bound: the literal is typed as f32, rounded at f32 precision, and only then widened to Float, so the f32 rounding error carries over even when Float is f64. Now, I'd like the output to be
x = 4.4999998807907104e-1 y = 4.4999998807907104e-1
x = 4.5000000000000001e-1 y = 4.5000000000000001e-1
where y is parsed and set at the precision of the generic Float. One attempt is to add an additional From<f64> bound. However, f32 does not implement From<f64>, so the f32 call no longer compiles. If we remove that call and run the code:
// External functions
use std::fmt::LowerExp;
// Some random struct parameterized on a float type
struct Foo<Float> {
    x: Float,
    y: Float,
}
// Some random, generic function
fn bar<Float>(x: Float)
where
    // Float: LowerExp + From<f32>,
    Float: LowerExp + From<f32> + From<f64>,
{
    let foo = Foo::<Float> { x, y: 0.45.into() };
    println!("x = {:.16e} y = {:.16e}", foo.x, foo.y);
}
fn main() {
    // bar::<f32>(0.45);
    bar::<f64>(0.45);
}
we get what we want:
x = 4.5000000000000001e-1 y = 4.5000000000000001e-1
at the expense of no longer working for f32. Alternatively, it's been suggested that we can use the num crate. Unfortunately, unless I'm missing something, this appears to run into an issue with double rounding. The program:
// External functions
use num::FromPrimitive;
use std::fmt::LowerExp;
// Some random struct parameterized on a float type
struct Foo<Float> {
    x: Float,
    y: Float,
}
// Some random, generic function
fn bar<Float>(x: Float)
where
    Float: LowerExp + FromPrimitive,
{
    let foo = Foo::<Float> {
        x,
        y: <Float as FromPrimitive>::from_f64(
            0.5000000894069671353303618843710864894092082977294921875,
        )
        .unwrap(),
    };
    println!("x = {:.16e} y = {:.16e}", foo.x, foo.y);
}
fn main() {
    bar::<f32>(0.5000000894069671353303618843710864894092082977294921875f32);
    bar::<f64>(0.5000000894069671353303618843710864894092082977294921875f64);
}
produces:
x = 5.0000005960464478e-1 y = 5.0000011920928955e-1
x = 5.0000008940696716e-1 y = 5.0000008940696716e-1
This is problematic because x and y differ in the f32 case: the literal is rounded once to f64 and then a second time to f32, and that double rounding can land on a different f32 than rounding the original decimal literal directly.
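To isolate the double rounding, here is a small standalone sketch (my own illustration, separate from the generic code above) comparing a single rounding of the decimal literal to f32 against rounding it to f64 first and then narrowing with as f32:

fn main() {
    // Round the decimal literal once, directly to f32.
    let direct = 0.5000000894069671353303618843710864894092082977294921875f32;
    // Round the same literal to f64 first, then narrow to f32 (two roundings).
    let via_f64 = 0.5000000894069671353303618843710864894092082977294921875f64 as f32;
    println!("direct  = {:.16e}", direct);  // 5.0000005960464478e-1
    println!("via f64 = {:.16e}", via_f64); // 5.0000011920928955e-1
    assert_ne!(direct, via_f64);
}

The first value matches x in the f32 run above and the second matches y, which is exactly the discrepancy in question.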
Anyway, is there a reasonable strategy for ensuring the floating point literal is converted the same way in both cases?