I am initialising an array and then reversing it a number of times to see the performance.

I want to understand whether my two programs are simply not comparable, or whether Rust really is slow enough to take this long.

Here is how I build and time the Rust version:

rustc main.rs
time ./main

This runs on and on without finishing, which is surprising.

Rust

fn reverse(mylist: &mut Vec<u16>) {
    let length = mylist.len();

    let mid_length = length / 2;

    for number in 0..mid_length {
        let a = mylist[number];
        let b = mylist[length - number - 1];

        mylist[number] = b;

        mylist[length - number - 1] = a;
    }
}

fn main() {
    let array_size = 100000;

    let iterations = 100000;

    let mut v = vec![0u16; array_size];

    for _ in 0..iterations {
        reverse(&mut v);
    }
}

Go

The Go code does the same thing as the Rust code above. The important thing to note is that Go has garbage collection, whereas Rust does not. Surprisingly, Go does the job in less than 6 seconds:

go build main.go
time ./main 100000 100000

real    0m5.932s
user    0m5.928s
sys     0m0.004s

Go

package main

import (
    "os"
    "strconv"
)

func reverse(mylist []int) []int {

    length := len(mylist)
    half := int(length / 2)

    for i := 0; i < half; i++ {

        mylist[i], mylist[length-i-1] = mylist[length-i-1], mylist[i]

    }   

    return mylist

}

func main() {

    array_size, _ := strconv.Atoi(os.Args[1])
    iterations, _ := strconv.Atoi(os.Args[2])

    mylist := make([]int, array_size)

    for i := 0; i < iterations; i++ {
        reverse(mylist)

    }

}
Shepmaster
Tahseen
  • Why reinvent the wheel? [`Vec::reverse`](https://doc.rust-lang.org/stable/std/vec/struct.Vec.html#method.reverse). – ljedrz Jul 05 '18 at 11:44
  • The first issue I can think of is that by default `rustc` compiles in Debug mode as far as I know. Repeat the experiment with the `-O` flag (for optimizations). – Matthieu M. Jul 05 '18 at 11:44
  • @ljedrz: Basic algorithm is similar (see https://doc.rust-lang.org/1.7.0/src/core/slice.rs.html#432), it simply uses `unsafe` to avoid bounds checking. – Matthieu M. Jul 05 '18 at 11:48
  • @MatthieuM. in any case you were right - it runs pretty fast with optimizations on. – ljedrz Jul 05 '18 at 11:49
  • @ljedrz: How fast? If the optimizer was clever enough, this should be a no-op. Go taking 6s for a no-op is slow as hell. – Matthieu M. Jul 05 '18 at 11:51
  • @ljedrz wanted to test on fundamental approach rather than using built-in functions – Tahseen Jul 05 '18 at 11:51
  • @Tahseen: This is not a built-in, I linked to the code so you can see its implementation. The only difference with your approach is the use of `unsafe` and `get_unchecked_mut` to avoid bounds-checking should the optimizer not be smart enough to realize it's unnecessary. – Matthieu M. Jul 05 '18 at 11:52
  • @MatthieuM. You were right about the -O flag. If you reply, I will mark it as an Answer :-) Also, I wanted to say that somehow Java manages the same thing in under 6 seconds – Tahseen Jul 05 '18 at 11:53
  • @MatthieuM. fast, but not noop-fast, perhaps due to bounds checking. – ljedrz Jul 05 '18 at 11:53
  • @MatthieuM. I mean Java is fastest even when it has GC. Faster than Go and Rust – Tahseen Jul 05 '18 at 11:55
  • @Tahseen: I think `rustc` could be improved to teach `LLVM` which alloc/dealloc functions it uses, and then LLVM would be able to optimize this to `fn main() {}` which would be faster than Java. It should also somehow indicate to LLVM that the content is all 0, as this would allow optimizing out the `reverse` itself. – Matthieu M. Jul 05 '18 at 12:20

1 Answer

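By default, rustc compiles without optimizations, and cargo builds in Debug mode. This was judged the most useful default, since most compilation happens during development.

An unoptimized build, especially of your code, carries a lot of checks: each access to mylist[..] is guarded by a bounds check that the optimizer has had no chance to remove, and integer arithmetic is checked for overflow, among other things.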
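To make the bounds check concrete, here is a small sketch of the three ways to read an element from a slice. The helper names `checked` and `unchecked` are made up for this illustration; `get`, `copied` and `get_unchecked` are standard slice and `Option` methods.

```rust
/// Checked access: returns `None` instead of panicking when `index` is out of range.
fn checked(list: &[u16], index: usize) -> Option<u16> {
    list.get(index).copied()
}

/// Unchecked access: `get_unchecked` performs no bounds check of its own.
/// This hypothetical helper validates the index once with an explicit assert.
fn unchecked(list: &[u16], index: usize) -> u16 {
    assert!(index < list.len());
    // SAFETY: the assert above guarantees that `index` is in bounds.
    unsafe { *list.get_unchecked(index) }
}

fn main() {
    let v = vec![10u16, 20, 30];

    // Plain indexing is bounds-checked and panics when out of range.
    println!("{}", v[1]);

    // `get` reports an out-of-range index as `None`.
    println!("{:?}", checked(&v, 3));

    // Unchecked read of a known-valid index.
    println!("{}", unchecked(&v, 2));
}
```

In an unoptimized build, every `mylist[i]` in the question pays for the check shown in the first case.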

Instead, for benchmarking purposes, you want to compile with optimizations. If using rustc directly, this is as simple as passing the -O flag.
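As a rough sketch of a benchmark that guards against this mistake, the program below times the reversal loop itself and prints a warning when it was built without optimizations. It assumes a build with `rustc -O main.rs` (or `cargo build --release` in a cargo project). The function `time_reversals` is a name invented for this example, the sizes are smaller than in the question so that an unoptimized run still finishes quickly, and `std::hint::black_box` needs a reasonably recent toolchain.

```rust
use std::hint::black_box;
use std::time::{Duration, Instant};

/// Same algorithm as the question's `reverse`, written against a slice.
fn reverse(list: &mut [u16]) {
    let length = list.len();
    for i in 0..length / 2 {
        list.swap(i, length - i - 1);
    }
}

/// Reverses a zeroed buffer of `array_size` elements `iterations` times
/// and returns the elapsed wall-clock time.
fn time_reversals(array_size: usize, iterations: usize) -> Duration {
    let mut v = vec![0u16; array_size];
    let start = Instant::now();
    for _ in 0..iterations {
        // `black_box` discourages the optimizer from discarding the work.
        reverse(black_box(v.as_mut_slice()));
    }
    black_box(&v);
    start.elapsed()
}

fn main() {
    // `debug_assertions` is enabled by default in unoptimized builds.
    if cfg!(debug_assertions) {
        eprintln!("warning: built without optimizations; timings are not representative");
    }

    // Smaller than the question's 100000 x 100000 (an assumption of this sketch).
    let elapsed = time_reversals(10_000, 10_000);
    println!("elapsed: {:?}", elapsed);
}
```

Measuring inside the program also keeps process startup out of the number, which `time ./main` includes.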


As noted by @ljedrz, there is a [T]::reverse method; it is even more efficient than your implementation, as it internally uses unsafe to elide bounds-checking.
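For comparison, a minimal sketch that puts the hand-written loop next to the standard method and checks that they agree. `reverse_manual` is a name chosen for this example; it uses `swap` in place of the question's two temporaries, but the algorithm is the same.

```rust
/// Manual reversal by swapping mirrored elements, as in the question.
fn reverse_manual(list: &mut [u16]) {
    let length = list.len();
    for i in 0..length / 2 {
        list.swap(i, length - i - 1);
    }
}

fn main() {
    let mut a: Vec<u16> = (0..10).collect();
    let mut b = a.clone();

    reverse_manual(&mut a);
    // Calling `reverse` on a `Vec` resolves to the slice method `[T]::reverse`.
    b.reverse();

    assert_eq!(a, b);
    println!("{:?}", a);
}
```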

A quick check on the playground, however, does not reveal any outlandish difference between your reverse and Vec::reverse in your case; the optimizer was already smart enough to elide the bounds checks.

Unfortunately it is not smart enough to elide the loop, likely because it does not realize that __rust_alloc_zeroed and __rust_dealloc are memory allocation/deallocation routines and have no observable side-effect.

Matthieu M.
  • I tested Vec::reverse as mentioned by @ljedrz and am getting a 2x improvement in performance. But again, as I said, the intent was to test performance using fundamentals. And your -O suggestion was the right one – Tahseen Jul 05 '18 at 12:18
  • @Tahseen Vec is not a fundamental building block of Rust. It has overhead when you use it the way you do. – Stargateur Jul 05 '18 at 12:25