2

I am currently learning Rust because I want to use it in a project that requires very high performance. I initially fell in love with enums, but then I started to evaluate their performance and found something that really baffles me. Here is an example:

use std::time::Instant;

pub enum MyEnum<'a> {
    V1,
    V2(&'a MyEnum<'a>),
    V3,
}

impl MyEnum<'_> {
    pub fn eval(&self) -> i64 {
        match self {
            MyEnum::V1 => 1,
            MyEnum::V2(_) => 2,
            MyEnum::V3 => 3,
        }
    }
    pub fn eval2(&self) -> i64 {
        match self {
            MyEnum::V1 => 1,
            MyEnum::V2(a) => a.eval2(),
            MyEnum::V3 => 3,
        }
    }
}


fn main() {
    const EXAMPLES: usize = 10000000;
    let en = MyEnum::V1;

    let start = Instant::now();
    let mut sum = 0;
    for _ in 0..EXAMPLES {
        sum += en.eval()
    }
    println!("enum without fields func call sum: {} at {:?}", sum, start.elapsed());

    let start = Instant::now();
    let mut sum = 0;
    for _ in 0..EXAMPLES {
        sum += en.eval2()
    }
    println!("enum with field func call sum: {} at {:?}", sum, start.elapsed());
}

Results I get:

enum without fields func call sum: 10000000 at 100ns
enum with field func call sum: 10000000 at 6.3425ms

For a V1 value, eval2 should execute exactly the same instructions as eval, yet it takes about 6.3 ms where eval takes 100 ns. Why is this happening?

  • 1
    Welcome to Stack Overflow! It's hard to answer your question because it doesn't include a [MRE]. We can't tell **exactly how you are running this code**. Please [edit] your question to include the additional info. Thanks! – Shepmaster May 20 '20 at 20:59
  • Specifically, I'm guessing this is a duplicate of [Why is my Rust program slower than the equivalent Java program?](https://stackoverflow.com/q/25255736/155423) – Shepmaster May 20 '20 at 21:00
  • It may also be a duplicate of [What's the difference between var and _var in Rust?](https://stackoverflow.com/q/47664704/155423) – Shepmaster May 20 '20 at 21:02
  • 1
    Also, your second function performs a recursive call, which is completely different from the first function. Why do you believe that these should be the same at all? – Shepmaster May 20 '20 at 21:03
  • @Shepmaster In both cases the object is `MyEnum::V1`. I believe the OP expects that the `MyEnum::V2` case should have no bearing on performance. – Schwern May 20 '20 at 21:11
  • 3
    The first function most likely optimizes to `return discriminant + 1;`. The second function needs a branch. – mcarton May 20 '20 at 22:43
  • 4
    Furthermore, in the first function, the compiler inlines the function call and is smart enough to transform the loop to a single `sum += EXAMPLES`. This benchmark is broken. – mcarton May 20 '20 at 22:47
  • 1
    tip: 10 million loop iterations in 100ns should tell you that something isn't right with the benchmark given that a single cpu cycle is around 1/3 of a nanosecond. – Paul Hankin May 21 '20 at 09:38

2 Answers

6

Viewing the assembly, your first loop is optimized entirely into a single `mov` of the constant 10000000 (that is, the compiler does something equivalent to `sum += EXAMPLES`), while the second is not. I do not know why the second loop is not constant-folded as aggressively.
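One way to keep the first loop from being folded into a constant is to hide the enum value from the optimizer on every iteration. The sketch below is an illustration, not the asker's code: it assumes Rust 1.66 or newer for `std::hint::black_box`, and the helper name `sum_eval_opaque` is hypothetical. Note that `black_box` is documented as best-effort, so it is still worth checking the generated assembly.

```rust
use std::hint::black_box;
use std::time::Instant;

// V2 and V3 are not constructed in `main`, so silence the dead-code warning.
#[allow(dead_code)]
pub enum MyEnum<'a> {
    V1,
    V2(&'a MyEnum<'a>),
    V3,
}

impl MyEnum<'_> {
    pub fn eval(&self) -> i64 {
        match self {
            MyEnum::V1 => 1,
            MyEnum::V2(_) => 2,
            MyEnum::V3 => 3,
        }
    }
}

/// Hypothetical helper: sums `eval()` over `n` iterations while asking the
/// optimizer to treat the enum reference as unknown each time.
pub fn sum_eval_opaque(en: &MyEnum<'_>, n: usize) -> i64 {
    let mut sum = 0i64;
    for _ in 0..n {
        // black_box is an identity function that discourages the compiler
        // from assuming anything about the value passed through it.
        sum += black_box(en).eval();
    }
    sum
}

fn main() {
    const EXAMPLES: usize = 10_000_000;
    let en = MyEnum::V1;

    let start = Instant::now();
    let sum = sum_eval_opaque(&en, EXAMPLES);
    println!("eval with black_box sum: {} at {:?}", sum, start.elapsed());
}
```

With the value hidden this way, the measured time should reflect actual loop iterations rather than a precomputed result, which makes the comparison with `eval2` more meaningful.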

mcarton
kmdreko
2

I see no difference in performance, as one would expect.

$ ./test
enum without fields func call sum: 10000000 at 307.543596ms
enum with field func call sum: 10000000 at 312.196195ms

$ rustc --version
rustc 1.43.1 (8d69840ab 2020-05-04)

$ uname -a
Darwin Windhund.local 18.7.0 Darwin Kernel Version 18.7.0: Mon Feb 10 21:08:45 PST 2020; root:xnu-4903.278.28~1/RELEASE_X86_64 x86_64 i386 MacBookPro15,2 Darwin

One problem might be the use of simple "wall clock" time for benchmarking. A simple count of elapsed time is vulnerable to anything else running that might consume resources: anti-virus software, a web browser, or any other program. Instead, use benchmark tests.
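If a full benchmark harness is not at hand, a rough way to reduce that noise with only the standard library is to time the same work several times and keep the fastest run, since interference from other programs can only make a run slower. This is a minimal sketch under that assumption, not a substitute for proper benchmark tests; the helper name `min_elapsed` and the workload in `main` are hypothetical.

```rust
use std::hint::black_box;
use std::time::{Duration, Instant};

/// Hypothetical helper: runs `work` `samples` times and returns the fastest
/// observed duration. Returns `Duration::MAX` if `samples` is zero.
pub fn min_elapsed<F: FnMut()>(samples: usize, mut work: F) -> Duration {
    let mut best = Duration::MAX;
    for _ in 0..samples {
        let start = Instant::now();
        work();
        // Keep the smallest elapsed time seen so far.
        best = best.min(start.elapsed());
    }
    best
}

fn main() {
    let mut total: u64 = 0;
    let fastest = min_elapsed(5, || {
        for i in 0..1_000_000u64 {
            // black_box keeps the loop from being reduced to a constant.
            total = total.wrapping_add(black_box(i));
        }
    });
    println!("fastest of 5 runs: {:?} (total = {})", fastest, total);
}
```

Taking the minimum rather than the mean is a design choice: it estimates the best-case cost of the code and discards runs that were disturbed by other activity on the machine.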

Schwern