-3

After migrating one of my computing heavy backend programs from Java to Go, I find that the performance degraded instead of improving. I tested around some and it seems the array sorting code is the culprit (which I used heavily in my program). I have written the below two simplified programs to do a comparison, and somehow it seems Go's built-in sort function is a lot slower than Java's Arrays.sort method?

package main

import (
    "fmt"
    "math/rand"
    "sort"
    "time"
)

func main() {
    fmt.Println("Starting")
    const x = 1000000
    const y = x * 10
    var s [y]float64
    s1 := rand.NewSource(time.Now().UnixNano())
    r1 := rand.New(s1)
    start1 := time.Now()
    for i := 0; i < y; i++ {
        s[i] = r1.Float64()
    }
    end1 := time.Since(start1)
    ss := s[:]
    start2 := time.Now()
    sort.Float64s(ss)
    end2 := time.Since(start2)
    fmt.Println(end1)
    fmt.Println(end2)
    fmt.Println("Number: ", ss[x])
}

and it produces results like this:

Starting
136.6331ms  // The time taken to generate 10,000,000 random numbers
3.456781s   // The time taken to sort the 10,000,000 random numbers
Number:  0.10000285497001288

While with the Java program here

import java.util.*;

class RSTest {
    public static void main(String[] args) {
        System.out.println("Starting");
        int x = 1000000;
        int y = x * 10;
        Random gen = new Random(System.currentTimeMillis());
        double[] s = new double[y];
        long start1 = System.nanoTime();
        for (int i = 0; i < y; i++) {
            s[i] = gen.nextDouble();
        }
        long end1 = System.nanoTime();
        long start2 = System.nanoTime();
        Arrays.sort(s);
        long end2 = System.nanoTime();
        System.out.println((end1 - start1) / (1000000000.0));
        System.out.println((end2 - start2) / (1000000000.0));
        System.out.println(s[x]);
    }
}

the results are like this

Starting
0.2252634  // The time taken to generate 10,000,000 random numbers
1.0303157  // The time taken to sort the 10,000,000 random numbers
0.0999513608326642

the Go program takes around 130ms to generate 10 million random numbers and assign them to an array while Jave takes around 230ms to generate 10 million random numbers and assign them to an array, this part I think is the improvement I expect from going from Java to Go.

But for the sorting part, it took Java only around 1s to sort the 10 million random numbers but it took Go around 3.5s to do the 10 million random number sort? And this is quite consistent from multiple runs of the test.

So does that mean Go's built-in sort function is really this much inferior to Java's Arrays.sort method? Or did I use Go's sort function wrong? Or something is wrong with my programs?

Thanks.

Note: this is from Go 1.11 and Java 8, the current versions I'm running on my server. Also, please note that the two programs I posted here are purely for testing purposes that I have written in a couple minutes so may (or rather, most certainly do) contain some code that doesn't make much sense for real production systems.

Some Update:

Thanks to @nussjustin's suggestion, I tried sort.Slice with some promising results.

As currently I'm out of office and using a slower notebook, the baseline results for the two above tests are like this now:

For the Java Arrays.sort test

Starting
0.3590694
1.6030528 // The time taken to sort the 10,000,000 random numbers
0.10000905418967532

For the Go sort.Float64s test

Go
Starting
233.1957ms
5.4633992s // The time taken to sort the 10,000,000 random numbers
Number:  0.10002801819954663

And now after modifying the Go test with sort.Slice

package main

import (
    "fmt"
    "math/rand"
    "sort"
    "time"
)

func main() {
    fmt.Println("Starting")
    const x = 1000000
    const y = x * 10
    var s [y]float64
    s1 := rand.NewSource(time.Now().UnixNano())
    r1 := rand.New(s1)
    start1 := time.Now()
    for i := 0; i < y; i++ {
        s[i] = r1.Float64()
    }
    end1 := time.Since(start1)
    ss := s[:]
    start2 := time.Now()
    sort.Slice(ss, func(i, j int) bool { return ss[i] < ss[j] })
    end2 := time.Since(start2)
    fmt.Println(end1)
    fmt.Println(end2)
    fmt.Println("Number: ", ss[x])
}

The result is a big improvement over sort.Float64s, but still not as good as Java's array sort

Starting
281.4262ms
3.6745684s // The time taken to sort the 10,000,000 random numbers
Number:  0.10010604106864159

And I think someone complained that there is only 1 distribution for the tests (who later removed his comment), I tested for sorting normal distribution of random numbers too (albeit I'd say such a huge performance difference in sorting uniform distribution of random numbers is already quite a bad sign since the algorithms of sorting uniform distribution of random numbers should be quite mature already)

I just replace the random number generator from uniform distribution to normal distribution like this

Go:

s[i] = r1.NormFloat64()

Java:

s[i] = gen.nextGaussian();

And the result of Java's Arrays.sort method is

Starting
1.4126348
1.6118655
-1.2820310313627319

And Go's sort.Slice

Starting
434.9106ms
3.8936811s
Number:  -1.2818667132095363

So Go is sort.Slice is still about twice slower than Java's Arrays.sort, same as for uniform distribution of random numbers. The good thing is that in generating the normal distribution of random numbers, Go is three times faster than Java, compared to about 70% faster in generating uniform distribution of numbers.

hellopeach
  • 924
  • 8
  • 15
  • 5
    Please write valid benchmarks before you worry about performance. See [this post](https://stackoverflow.com/q/30202611/13860). Go provides a perfectly usable, and useful, benchmarking mechanism. You should use it when benchmarking Go code. – Jonathan Hall Mar 20 '19 at 14:32
  • Is `Float64` an object or primitive type in Go? – Aaron Digulla Mar 20 '19 at 14:33
  • 1
    @AaronDigulla: float64 in go is a primitive type, but the sort package works through an interface abstraction, so there is always some overhead for the benefit of generalization. – JimB Mar 20 '19 at 14:35
  • Take the benchmarks from package sort and port them to Java. This might be interesting. – Volker Mar 20 '19 at 14:44
  • 3
    Did you try sort.Slice? sort.Slice avoids some of the interface overhead of sort.Sort (which is used by sort.Float64s). If that doesn't help try profiling your code to see where the time is spent (https://blog.golang.org/profiling-go-programs) – nussjustin Mar 20 '19 at 14:50
  • 2
    The sort function is not necessarily "inferior", the overhead comes from the package providing a generic sort. If you encounter a problem where sorting huge numbers of float64s is your performance bottleneck, then you can implement or find a package for sorting float64s directly, rather than through this abstraction. – JimB Mar 20 '19 at 15:55
  • @JimB, what we are comparing to here is an even more generic Java Array.sort method which surely has more abstraction than the Go sort function? After all Java is almost regarded as the most prominent example of over-abstraction nowadays. I guess I can try write a quick sort directly and test it but I doubt the little abstraction Go does here is the source of such a huge performance drop when compared to the much more abstracted Java method. – hellopeach Mar 20 '19 at 16:18
  • 1
    @Flimzy, your comment is still invalid and irrelevant here since you still have not pointed out which part of the two blocks of testing code is so-called "invalid" for you. Or maybe you can take a couple minute to write your so-called "valid benchmark" to test the result for yourself and post some valid comments. Thanks. – hellopeach Mar 20 '19 at 16:23
  • @hellopeach: `Go` does not have generics, which means specialization needs to be done manually. Yes, go could implement specialized sorts for the helper functions like `sort.Float64`, but for most cases the "generic" (meaning using the `sort.Interface`, not "generics") implementation is adequate. Generating specialized sort from the reference implementation (maybe possible by adapting the generation used for `sort.Slice`) would be interesting exercise. – JimB Mar 20 '19 at 16:24
  • @JimB, and? what? still the Java's Arrays.sort method looks more generic (nothing about the "generics programming" paradigm of course, which was not what I meant in the first place) and more abstracted than the Go's sort.Float64s function, so if abstraction is the problem here, it should affect Java's sorting method even more. – hellopeach Mar 20 '19 at 16:29
  • @Volker, yup it'd be interesting, but right now I'm more concerned about the performance degradation after migrating from Java to Go so my hands are kinda full here now. I'll have to find the reason for the performance degradation ASAP and find a fix or optimization, or have to revert everything back to Java, or maybe off-load the computation heavy (ie. sorting heavy) part to numpy or something... – hellopeach Mar 20 '19 at 16:42
  • Sorting doubles (float64) would be faster using radix sort, but only if you can cast a double to a 64 bit integer without actually modifying the data. C / C++ can do this, and if not dealing with special cases like NAN, then doubles can be treated as 64 bit sign + magnitude integers for the purpose of sorting. – rcgldr Mar 20 '19 at 17:30
  • Go is just not yet as efficient as Java today is. – Coconut Apr 29 '19 at 15:09

2 Answers2

2

thanks for @JimB and @nussjustin's suggestions, I wrote a simple quicksort implementation myself, and it worked the magic!

package main

import (
    "fmt"
    "math/rand"
    "time"
)

func qsort(s []float64) []float64 {
    if len(s) < 2 {
        return s
    }

    left, right := 0, len(s)-1

    pivot := 0

    s[pivot], s[right] = s[right], s[pivot]

    for i := range s {
        if s[i] < s[right] {
            s[left], s[i] = s[i], s[left]
            left++
        }
    }

    s[left], s[right] = s[right], s[left]

    qsort(s[:left])
    qsort(s[left+1:])

    return s
}

func main() {
    fmt.Println("Starting")
    const x = 1000000
    const y = x * 10
    var s [y]float64
    s1 := rand.NewSource(time.Now().UnixNano())
    r1 := rand.New(s1)
    start1 := time.Now()
    for i := 0; i < y; i++ {
        s[i] = r1.NormFloat64()
    }
    end1 := time.Since(start1)
    ss := s[:]
    start2 := time.Now()
    ss = qsort(ss)
    end2 := time.Since(start2)
    fmt.Println(end1)
    fmt.Println(end2)
    fmt.Println("Number: ", ss[x])
}

with this super crude quicksort, now I'm able to achieve the following results

Starting
276.763ms
1.589941s
Number:  -1.281875446690731 

now it's consistently around 15% faster than Java's Arrays.sort method!

I also implemented a quicksort method specifically for array of double in Java to replace Arrays.sort method to see if I can get any performance gain, the performance ends up around the same as Arrays.sort, still around 10% to 15% slower than Go. It seems the Arrays.sort somehow already achieves the best performance possible in Java, and you don't gain anything by stripping away the abstractions.

So I guess the lesson is that if you want performance in Go's sorting, then implement a quicksort function yourself, don't use the built-in sort functions, even the sort.Slice is around twice slower than a self-written sort function, and the sort.Float64s is more than three times (sometimes four times) slower!

I guess these results can finally shut those commenters up about their so-called "invalid benchmark" stuff. Like I said, the performance degradation after migrating from Java to Go is quite real for my production system, and I'd be in a pinch if it cannot be fixed ASAP, now hopefully after replacing all those sort functions, we can finally see a decent performance improvement on our production system so I can have a peace sleep tonight :)

hellopeach
  • 924
  • 8
  • 15
  • Using go's unsafe.pointer, you could treat an array of float64's as an array of uint64's, and then use radix sort. Java doesn't have an equivalent, but C / C++ does. Other than special values like NAN, the float64's treated as uint64's would actually be treated as 64 bit sign + magnitude integers for the purpose of radix sort. – rcgldr Mar 20 '19 at 22:36
-2

Twenty years ago, Java 1.1 was slow. Since then, thousands of people have put their minds to the task to fix that. Today, Java code is usually on par or faster than C++. With Java 12 and GraalVM, we're all going to see another boost.

Bad code in Java can be slow but the same is true for C++. Java doesn't come with a brain, you have to use your own :-)

To answer the question, the code looks correct. My guess is that Java's sort implementation has in fact been optimized to the brink with data from thousands of use cases. Just look at the length: ~3000 lines with tons of corner cases compared to 500 in Go.

Aaron Digulla
  • 321,842
  • 108
  • 597
  • 820
  • 4
    The Java implementation is so long because it contains overloaded functions for all the primitive integer types, not because it is so much fancier. – Volker Mar 20 '19 at 15:42
  • @Aaron, I doubt it, since this is just a simple array sort on a uniform distribution of random numbers, which I think mathematically has little room for optimization already, maybe some optimization can yield some small improvements, but three times the performance? And coupled to the fact that Go compiled to native binary where Java still runs on a VM (and we are talking about the not so new Java 8 here, not bleeding edge Java 12 and stuff) That's why I think something must be wrong here, just not sure where yet... – hellopeach Mar 20 '19 at 16:35
  • @Volker The code of the individual methods is also much longer than the ~40 lines of the core sort algorithm in Go. – Aaron Digulla Mar 20 '19 at 16:52
  • @hellopeach It depends what optimizations the VM can do before running the code. Go is compiled to generic x86 code. Java is compiled to the code for your specific CPU. The only other option is that Go uses an indirect array. The Java double[] array should be a continuous range of memory. If Go uses pointer indirection for each element, that will cost a lot of time but I doubt it. – Aaron Digulla Mar 20 '19 at 16:54