Why is enum value binding in Rust so slow?

Question

I am currently learning Rust because I want to use it in a project that requires very high performance. I initially fell in love with enums, but when I started to evaluate their performance I found something that really puzzles me. Here is an example:

use std::time::{Instant};

pub enum MyEnum<'a> {
    V1,
    V2(&'a MyEnum<'a>),
    V3,
}

impl MyEnum<'_> {
    pub fn eval(&self) -> i64 {
        match self {
            MyEnum::V1 => 1,
            MyEnum::V2(_) => 2,
            MyEnum::V3 => 3,
        }
    }
    pub fn eval2(&self) -> i64 {
        match self {
            MyEnum::V1 => 1,
            MyEnum::V2(a) => a.eval2(),
            MyEnum::V3 => 3,
        }
    }
}


fn main() {
    const EXAMPLES: usize = 10000000;
    let en = MyEnum::V1;

    let start = Instant::now();
    let mut sum = 0;
    for _ in 0..EXAMPLES {
        sum += en.eval()
    }
    println!("enum without fields func call sum: {} at {:?}", sum, start.elapsed());

    let start = Instant::now();
    let mut sum = 0;
    for _ in 0..EXAMPLES {
        sum += en.eval2()
    }
    println!("enum with field func call sum: {} at {:?}", sum, start.elapsed());
}

Results I get:

enum without fields func call sum: 10000000 at 100ns
enum with field func call sum: 10000000 at 6.3425ms

For a V1 value, eval2 should execute exactly the same instructions as eval, but it runs about 60,000x slower (6.3425ms versus 100ns). Why is this happening?

Welcome to Stack Overflow! It's hard to answer your question because it doesn't include a [MRE]. We can't tell **exactly how you are running this code**. Please [edit] your question to include the additional info. Thanks! — Shepmaster, May 20 '20 at 20:59
Specifically, I'm guessing this is a duplicate of [Why is my Rust program slower than the equivalent Java program?](https://stackoverflow.com/q/25255736/155423) — Shepmaster, May 20 '20 at 21:00
It may also be a duplicate of [What's the difference between var and _var in Rust?](https://stackoverflow.com/q/47664704/155423) — Shepmaster, May 20 '20 at 21:02
Also, your second function performs a recursive call, which is completely different from the first function. Why do you believe that these should be the same at all? — Shepmaster, May 20 '20 at 21:03
@Shepmaster In both cases the object is `MyEnum::V1`. I believe the OP expects that the `MyEnum::V2` case should have no bearing on performance. — Schwern, May 20 '20 at 21:11
The first function most likely optimizes to `return discriminant + 1;`. The second function needs a branch. — mcarton, May 20 '20 at 22:43
Furthermore, in the first function, the compiler inlines the function call and is smart enough to transform the loop to a single `sum += EXAMPLES`. This benchmark is broken. — mcarton, May 20 '20 at 22:47
tip: 10 million loop iterations in 100ns should tell you that something isn't right with the benchmark given that a single cpu cycle is around 1/3 of a nanosecond. — Paul Hankin, May 21 '20 at 09:38

score 6 · Answer 1 · edited May 20 '20 at 22:48

Viewing the assembly, your first loop is optimized entirely into a single mov 10000000 instruction (that is, the compiler does something equivalent to sum += EXAMPLES), while the second is not. I do not know why the second loop is not constant-folded as aggressively.
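
One way to keep the optimizer from collapsing either loop is to pass the value through std::hint::black_box on every iteration, so the compiler should have to treat the variant as unknown. The sketch below is an assumption about how the benchmark could be restructured, not the asker's code: the helpers sum_eval and sum_eval2 are hypothetical names, black_box needs a reasonably recent stable toolchain (1.66 or later, to my knowledge), and the standard library documents it only as a best-effort hint.

```rust
use std::hint::black_box;
use std::time::Instant;

pub enum MyEnum<'a> {
    V1,
    V2(&'a MyEnum<'a>),
    V3,
}

impl MyEnum<'_> {
    pub fn eval(&self) -> i64 {
        match self {
            MyEnum::V1 => 1,
            MyEnum::V2(_) => 2,
            MyEnum::V3 => 3,
        }
    }

    pub fn eval2(&self) -> i64 {
        match self {
            MyEnum::V1 => 1,
            MyEnum::V2(inner) => inner.eval2(),
            MyEnum::V3 => 3,
        }
    }
}

/// Hypothetical helper: sums `eval()` over `iterations` calls.
/// `black_box` hides the reference from the optimizer on each iteration.
fn sum_eval(en: &MyEnum, iterations: usize) -> i64 {
    let mut sum = 0;
    for _ in 0..iterations {
        sum += black_box(en).eval();
    }
    sum
}

/// Hypothetical helper: same loop, but calling `eval2()`.
fn sum_eval2(en: &MyEnum, iterations: usize) -> i64 {
    let mut sum = 0;
    for _ in 0..iterations {
        sum += black_box(en).eval2();
    }
    sum
}

fn main() {
    const EXAMPLES: usize = 10_000_000;
    let en = MyEnum::V1;

    let start = Instant::now();
    let sum = sum_eval(&en, EXAMPLES);
    println!("eval  sum: {} at {:?}", sum, start.elapsed());

    let start = Instant::now();
    let sum = sum_eval2(&en, EXAMPLES);
    println!("eval2 sum: {} at {:?}", sum, start.elapsed());
}
```

Build with optimizations (for example rustc -O or cargo run --release) before comparing the two timings. With the input hidden, neither loop should be reducible to a single constant.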

answered May 20 '20 at 21:12 by kmdreko · edited May 20 '20 at 22:48 by mcarton

Schwern · Answer 2 · 2020-05-20T21:30:23.243

I see no difference in performance, as one would expect.

$ ./test
enum without fields func call sum: 10000000 at 307.543596ms
enum with field func call sum: 10000000 at 312.196195ms

$ rustc --version
rustc 1.43.1 (8d69840ab 2020-05-04)

$ uname -a
Darwin Windhund.local 18.7.0 Darwin Kernel Version 18.7.0: Mon Feb 10 21:08:45 PST 2020; root:xnu-4903.278.28~1/RELEASE_X86_64 x86_64 i386 MacBookPro15,2 Darwin

One problem might be the use of simple "wall clock" time for benchmarking. A plain measurement of elapsed time is vulnerable to anything else running that consumes resources: anti-virus software, a web browser, or any other program. Instead, use benchmark tests.
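
To illustrate why a benchmark harness is less sensitive to that kind of noise, here is a minimal std-only sketch of the core idea: take several samples and look at the minimum and median instead of a single reading. The helper time_samples and the stand-in workload are hypothetical and exist only for illustration. Real harnesses (the nightly-only #[bench] attribute, or the criterion crate) do considerably more, such as warm-up and statistical analysis.

```rust
use std::hint::black_box;
use std::time::{Duration, Instant};

/// Hypothetical helper: runs `work` `samples` times and returns the
/// per-sample durations sorted in ascending order.
fn time_samples<F: FnMut()>(samples: usize, mut work: F) -> Vec<Duration> {
    let mut times = Vec::with_capacity(samples);
    for _ in 0..samples {
        let start = Instant::now();
        work();
        times.push(start.elapsed());
    }
    times.sort();
    times
}

/// Stand-in workload (an assumption for illustration): sums 0..n, with
/// `black_box` discouraging the compiler from folding the loop away.
fn workload(n: u64) -> u64 {
    let mut sum = 0u64;
    for i in 0..n {
        sum += black_box(i);
    }
    sum
}

fn main() {
    let times = time_samples(15, || {
        black_box(workload(1_000_000));
    });
    // The minimum is the sample least disturbed by other processes;
    // the median gives a sense of the typical run.
    let min = times[0];
    let median = times[times.len() / 2];
    println!("min: {:?}, median: {:?}", min, median);
}
```

A single slow sample caused by another process shows up at the high end of the sorted list and leaves the minimum and median largely unaffected.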
