
I have a Rust program that implements a brute-force parity check for 64-bit unsigned integers:

use std::io;
use std::io::BufRead;

fn parity(mut num: u64) -> u8 {
    let mut result: u8 = 0;
    while num > 0 {
        result ^= (num & 1) as u8;
        num >>= 1;
    }
    result
}

fn main() {
    let stdin = io::stdin();
    let mut num: u64;
    let mut it = stdin.lock().lines();
    // skip 1st line with number of test cases
    it.next();
    for line in it {
        num = line.unwrap().parse().unwrap();
        println!("{}", parity(num));
    }
}

When I feed it an input file containing 1,000,000 unsigned integers:

$ rustc parity.rs
$ time cat input.txt | ./parity &> /dev/null
cat input.txt  0.00s user 0.02s system 0% cpu 4.178 total
./parity &> /dev/null  3.87s user 0.32s system 99% cpu 4.195 total

And here comes the surprise: the effectively identical program in Go runs 4x faster:

$ go build parity.go
$ time cat input.txt | ./parity &> /dev/null
cat input.txt  0.00s user 0.03s system 3% cpu 0.952 total
./parity &> /dev/null  0.63s user 0.32s system 99% cpu 0.955 total

Here's the code in Go:

package main

import (
    "bufio"
    "fmt"
    "os"
    "strconv"
)

func parity(line string) uint64 {
    var parity uint64
    u, err := strconv.ParseUint(line, 10, 64)
    if err != nil {
        panic(err)
    }
    for u > 0 {
        parity ^= u & 1
        u >>= 1
    }
    return parity
}

func main() {
    scanner := bufio.NewScanner(os.Stdin)
    // skip line with number of cases
    if !scanner.Scan() {
        // panic if there's no number of test cases
        panic("missing number of test cases")
    }
    for scanner.Scan() {
        fmt.Println(parity(scanner.Text()))
    }
    if err := scanner.Err(); err != nil {
        fmt.Fprintln(os.Stderr, "reading standard input:", err)
    }
}

Versions:

$ rustc --version
rustc 1.7.0
$ go version
go version go1.6 darwin/amd64

Sample of input file, first line contains number of input values in the file:

8
7727369244898783789
2444477357490019411
4038350233697550492
8106226119927945594
1538904728446207070
0
1
18446744073709551615

Why do the Rust and Go programs I've written have such a dramatic difference in performance? I expected Rust to be a bit faster than Go in this case. Am I doing something wrong in my Rust code?

Shepmaster
Alex Chekunkov
  • The Rust code will be significantly faster if you compile with optimisations: `rustc -O parity.rs`. – huon Mar 20 '16 at 23:23
  • Compiling with optimizations is described in the [Getting Started](http://doc.rust-lang.org/stable/book/getting-started.html) section of [*The Rust Programming Language*](http://doc.rust-lang.org/stable/book/README.html), the section immediately after the 1-page introduction. – Shepmaster Mar 20 '16 at 23:58
  • I get it taking 3 seconds unoptimised and 0.45 seconds optimised in Rust, and 0.6 seconds in Go. – Chris Morgan Mar 21 '16 at 01:05
  • For what it's worth, I'd recommend using the built-in [`count_ones`](http://doc.rust-lang.org/std/primitive.i64.html#method.count_ones) method. Looks [like this](http://is.gd/tjLsyF). – Shepmaster Mar 21 '16 at 01:26
  • @Shepmaster thanks for the suggestion, good to know! – Alex Chekunkov Mar 21 '16 at 12:26
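
Following up on Shepmaster's suggestion, here is a minimal sketch of what a `count_ones`-based version could look like. The note about `popcnt` is an assumption about codegen on typical x86-64 targets, not something guaranteed by the language:

```rust
// Parity of a u64 via the bit-count method instead of a shift loop.
// On targets with a popcount instruction (e.g. x86-64 with popcnt),
// count_ones() typically compiles to a single instruction, so this
// is constant-time rather than proportional to the highest set bit.
fn parity(num: u64) -> u8 {
    (num.count_ones() & 1) as u8
}

fn main() {
    println!("{}", parity(7));        // 3 set bits -> odd parity -> 1
    println!("{}", parity(u64::MAX)); // 64 set bits -> even parity -> 0
}
```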

2 Answers


I think you're not compiling with optimisations. Try:

$ rustc -O parity.rs
tafia
  • You're so right, thanks! For others who will read this: there's another way to set the optimization level - `rustc -C opt-level=3 parity.rs` - and `-O` is a shortcut for `-C opt-level=2`. – Alex Chekunkov Mar 21 '16 at 12:29
  • You can also use `cargo build --release`; I find it more convenient than `rustc`. – tafia Mar 24 '16 at 03:06

Your benchmark doesn't measure the parity check. It measures input plus parity check plus output. For example, in Go, you measure `scanner.Scan` and `strconv.ParseUint` and `fmt.Println` as well as the parity check.

Here's a Go benchmark that just measures 1000000 parity checks.

parity_test.go:

package parity

import (
    "math/rand"
    "runtime"
    "testing"
)

func parity(n uint64) uint64 {
    var parity uint64
    for n > 0 {
        parity ^= n & 1
        n >>= 1
    }
    return parity
}

func init() { runtime.GOMAXPROCS(1) }

// Benchmark 1000000 parity checks.
func BenchmarkParity1000000(b *testing.B) {
    n := make([]uint64, 1000000)
    for i := range n {
        r := uint64(rand.Uint32())
        n[i] = (r << 32) | r
    }
    p := parity(42)
    b.ResetTimer()
    for i := 0; i < b.N; i++ {
        for _, n := range n {
            p = parity(n)
        }
    }
    b.StopTimer()
    _ = p
}

Output:

$ go test -bench=.
BenchmarkParity1000000        50      34586769 ns/op
$
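
For comparison, an equivalent Rust micro-benchmark can be sketched with nothing but the standard library, using `std::time::Instant` (stabilised in Rust 1.8, so slightly newer than the 1.7 used in the question); the xorshift generator below is just a crate-free stand-in for `math/rand`:

```rust
use std::time::Instant;

// Same brute-force parity as in the question.
fn parity(mut num: u64) -> u8 {
    let mut result = 0u8;
    while num > 0 {
        result ^= (num & 1) as u8;
        num >>= 1;
    }
    result
}

fn main() {
    // Fill a vector with 1,000,000 pseudo-random values; a tiny xorshift
    // generator keeps the sketch free of external crates.
    let mut state: u64 = 0x9E37_79B9_7F4A_7C15;
    let nums: Vec<u64> = (0..1_000_000)
        .map(|_| {
            state ^= state << 13;
            state ^= state >> 7;
            state ^= state << 17;
            state
        })
        .collect();

    // Accumulate into `acc` and print it so the optimiser cannot
    // discard the loop as dead code.
    let start = Instant::now();
    let mut acc = 0u8;
    for &n in &nums {
        acc ^= parity(n);
    }
    println!("xor of parities: {}; 1000000 checks took {:?}", acc, start.elapsed());
}
```

Compile with `rustc -O` here too, for the same reason as in the other answer.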
peterSO
  • While interesting, this doesn't answer the question OP asked: *Why do the Rust and Go programs I've written have such a dramatic difference in performance*. This seems like it would be better as a comment on the question about the methodology of the benchmarking. Both Go and Rust versions are measuring code that performs the same functions. – Shepmaster Mar 21 '16 at 01:11
  • @Shepmaster: The OP's implicit question was why a "brute-force parity check for 64-bit unsigned integers" has such a dramatic difference in performance between Rust and Go. To paraphrase: there are lies, damned lies, and benchmarks. – peterSO Mar 21 '16 at 01:25
  • @peterSO "The OP's implicit question was why a 'brute-force parity check for 64-bit unsigned integers' has such a dramatic difference in performance between Rust and Go" - just no. – Alex Chekunkov Mar 21 '16 at 12:22