I implemented a basic sorting algorithm in Java and compared its performance to that of the native methods (Arrays.sort() and Arrays.parallelSort()). The program is as follows.

 public static void main(String[] args) {
    // Randomly populate array
    int[] array = new int[999999];
    for (int i = 0; i < 999999; i++)
        array[i] = (int)Math.ceil(Math.random() * 100);

    long start, end;

    start = System.currentTimeMillis();
    Arrays.sort(array);
    end = System.currentTimeMillis();
    System.out.println("======= Arrays.sort: done in " + (end - start) + " ms ========");

    start = System.currentTimeMillis();
    Arrays.parallelSort(array);
    end = System.currentTimeMillis();
    System.out.println("======= Arrays.parallelSort: done in " + (end - start) + " ms ========");

    start = System.currentTimeMillis();
    orderArray(array);
    end = System.currentTimeMillis();
    System.out.println("======= My way: done in " + (end - start) + " ms ========");
}


private static int[] orderArray(int[] arrayToOrder) {
    for (int i = 1; i < arrayToOrder.length; i++) {
        int currentElementIndex = i;
        while (currentElementIndex > 0 && arrayToOrder[currentElementIndex] < arrayToOrder[currentElementIndex-1]) {
            int temp = arrayToOrder[currentElementIndex];
            arrayToOrder[currentElementIndex] = arrayToOrder[currentElementIndex-1];
            arrayToOrder[currentElementIndex-1] = temp;
            currentElementIndex--;
        }
    }
    return arrayToOrder;
}

When I run this program, my custom algorithm consistently outperforms the native methods by an order of magnitude on my machine. Here is a representative output:

======= Arrays.sort: done in 67 ms ========
======= Arrays.parallelSort: done in 26 ms ========
======= My way: done in 4 ms ========

This is independent of:

  • The number of elements in the array (999999 in my example)
  • The number of times the sort is performed (I tried inside a for loop and iterated a large number of times)
  • The data type (I tried with an array of double instead of int and saw no difference)
  • The order in which I call each ordering algorithm (does not affect the overall difference of performance)

Obviously, there's no way my algorithm is actually better than the ones provided with Java. I can only think of two possible explanations:

  • There is a flaw in the way I measure the performance
  • My algorithm is too simple and is missing some corner cases

I expect the latter is true, seeing as I used a fairly standard way of measuring performance in Java (System.currentTimeMillis()). However, I have extensively tested my algorithm and can find no flaws as of yet. An int has predefined boundaries (Integer.MIN_VALUE and Integer.MAX_VALUE) and cannot be null, so I can't think of any corner case I've not covered.

My algorithm's time complexity is O(n^2), whereas the native methods' is O(n log(n)), which could obviously have an impact. Again, however, I believe my complexity is sufficient...

Could I get an outsider's look on this, so I know how I can improve my algorithm?

Many thanks,

Chris.

Chris Neve
  • You're sorting an array in place, but you didn't re-scramble the array between each trial. – flakes Nov 09 '18 at 12:46
  • Possible duplicate https://stackoverflow.com/questions/11227809/why-is-it-faster-to-process-a-sorted-array-than-an-unsorted-array – Peter Lawrey Nov 09 '18 at 12:51
  • Means: You sort an already sorted array after the first sort call. – Seelenvirtuose Nov 09 '18 at 12:52
  • Note: you need to ensure the code has warmed up. I would run this repeatedly until the code is no longer warming up, e.g. run the whole set of tests multiple times and ignore the first 30 - 120 seconds. – Peter Lawrey Nov 09 '18 at 12:52
  • bubble sorting an array already sorted is O(n) – Peter Lawrey Nov 09 '18 at 12:53
  • Thanks for the answers guys. Seems so obvious with hindsight. Having scrambled the array, my algorithm is now way slower than the native ones. – Chris Neve Nov 09 '18 at 13:15
  • @PeterLawrey, what do you mean by the code warming up? Care to expand on this a little? – Chris Neve Nov 09 '18 at 13:16
  • When you first run code, it is interpreted. After it has run for a while, e.g. 10,000 loops, it is compiled in the background to native code in one or two stages. Only after the code is fully compiled to native code do you see how fast it can be. – Peter Lawrey Nov 09 '18 at 13:42
  • O(N^2) is significantly **more** than O(N log N) so it would be slower for large N except in specific cases like this one. – Peter Lawrey Nov 09 '18 at 13:44

1 Answer

You're sorting the array in place, but you didn't re-scramble it between trials. This means that after the first call you're sorting an already sorted array, which is the best-case scenario for your algorithm. Between calls to the sorting methods you can re-create the array:

for (int i = 0; i < TEST_SIZE; i++)
    array[i] = (int)Math.ceil(Math.random() * 100);

After doing this you will notice your algorithm is about 100 times slower.
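
To see why the already sorted input matters so much, here is a minimal sketch that counts the comparisons made by the same swap-based loop as in the question, once on sorted input and once on random input. The class name BestCaseDemo, the method sortAndCount, the array size and the seed are all made up for illustration; they are not from the original program.

```java
import java.util.Arrays;
import java.util.Random;

public class BestCaseDemo {

    // Same swap-based sort as orderArray in the question, but it also
    // returns how many element comparisons were performed.
    static long sortAndCount(int[] a) {
        long comparisons = 0;
        for (int i = 1; i < a.length; i++) {
            int j = i;
            while (j > 0) {
                comparisons++;
                if (a[j] >= a[j - 1]) {
                    break; // element is already in place
                }
                int tmp = a[j];
                a[j] = a[j - 1];
                a[j - 1] = tmp;
                j--;
            }
        }
        return comparisons;
    }

    public static void main(String[] args) {
        int n = 20_000; // illustrative size, smaller than in the question
        int[] random = new Random(42).ints(n, 1, 101).toArray();
        int[] sorted = random.clone();
        Arrays.sort(sorted);

        System.out.println("sorted input: " + sortAndCount(sorted) + " comparisons");
        System.out.println("random input: " + sortAndCount(random) + " comparisons");
    }
}
```

On sorted input the loop does exactly one comparison per element after the first (n - 1 in total), while on random input the count is on the order of n^2. That gap is what the original benchmark was hiding.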

That said, this is not the best way to compare the methods in the first place. At a minimum, you should sort the same original array with each algorithm. You should also run each algorithm several times and average the results. The result of a single trial is noisy and not a reliable basis for comparison.
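
A rough sketch of what that could look like is below. It assumes a few things that are not in the question: the class name SortBenchmark, the helper averageMillis, the warm-up and run counts, and a smaller array size so the O(n^2) sort finishes quickly. Each timed run sorts a fresh copy of the same original array, and a few untimed warm-up runs come first so the JIT has a chance to compile the code.

```java
import java.util.Arrays;
import java.util.Random;
import java.util.function.Consumer;

public class SortBenchmark {

    // The swap-based sort from the question, unchanged in behavior.
    static void insertionSort(int[] a) {
        for (int i = 1; i < a.length; i++) {
            int j = i;
            while (j > 0 && a[j] < a[j - 1]) {
                int tmp = a[j];
                a[j] = a[j - 1];
                a[j - 1] = tmp;
                j--;
            }
        }
    }

    // Average time in milliseconds over `runs` timed runs, after `warmup`
    // untimed runs. Every run sorts a fresh copy, so `original` is never modified.
    static double averageMillis(Consumer<int[]> sorter, int[] original, int warmup, int runs) {
        for (int i = 0; i < warmup; i++) {
            sorter.accept(original.clone());
        }
        long totalNanos = 0;
        for (int i = 0; i < runs; i++) {
            int[] copy = original.clone(); // copying happens outside the timed region
            long start = System.nanoTime();
            sorter.accept(copy);
            totalNanos += System.nanoTime() - start;
        }
        return totalNanos / (double) runs / 1_000_000.0;
    }

    public static void main(String[] args) {
        // Illustrative size and seed; values are in 1..100 as in the question.
        int[] original = new Random(1).ints(10_000, 1, 101).toArray();

        System.out.printf("Arrays.sort:         %.3f ms%n",
                averageMillis(Arrays::sort, original, 5, 10));
        System.out.printf("Arrays.parallelSort: %.3f ms%n",
                averageMillis(Arrays::parallelSort, original, 5, 10));
        System.out.printf("My way:              %.3f ms%n",
                averageMillis(SortBenchmark::insertionSort, original, 5, 10));
    }
}
```

This is still a hand-rolled benchmark, so treat the numbers as indicative only. For anything more serious, a dedicated harness such as JMH is the usual choice.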

flakes
  • Well, oddly enough, I was convinced that the order in which I called each algorithm had no impact on the benchmarks... but now that I have implemented scrambling of the array between sorts, the benchmarks do indeed match what you describe. It also makes sense, as you say, to sort the same original array with each algorithm; can't believe I didn't think of that... Thanks for your answer. – Chris Neve Nov 09 '18 at 13:14