I've noticed that when sorting an array in .NET using a custom IComparer<T>, requests are made for comparisons of an item against itself.

Why is this the case? Surely it's a trivial optimization to see if a comparison is about to be made of identical indexes, and assume the result must be zero?

Example code:

using System;
using System.Collections.Generic;

class Comparer : IComparer<string>
{
  public int Compare(string x, string y)
  {
    // Log every comparison the sort requests.
    Console.WriteLine("{0} vs {1}", x, y);

    return string.Compare(x, y);
  }
}

class Program
{
  static void Main(string[] args)
  {
    var values = new[] {"A", "D", "C", "B", "E"};

    Array.Sort(values, new Comparer());
  }
}

With output (strange comparisons marked):

A vs C
A vs E
C vs E
A vs C
D vs C
C vs E
C vs B
C vs C   ***
C vs C   ***
A vs B
A vs B
A vs A   ***
A vs B
A vs A   ***
D vs E
D vs E
D vs D   ***
D vs E
D vs D   ***
stusmith

1 Answer

People report different outcomes because the Array.Sort() algorithm has changed several times: at least in .NET 4.0 and again in .NET 4.5, and possibly earlier as well. The most recent version switched from Quicksort to Introsort.

You are seeing an element compared with itself because of a countermeasure against Quicksort's very poor worst-case behavior, O(n²). The Wikipedia article for Introsort explains it well:

In quicksort, one of the critical operations is choosing the pivot: the element around which the list is partitioned. The simplest pivot selection algorithm is to take the first or the last element of the list as the pivot, causing poor behavior for the case of sorted or nearly sorted input. Niklaus Wirth's variant uses the middle element to prevent these occurrences, degenerating to O(n²) for contrived sequences. The median-of-3 pivot selection algorithm takes the median of the first, middle, and last elements of the list; however, even though this performs well on many real-world inputs, it is still possible to contrive a median-of-3 killer list that will cause dramatic slowdown of a quicksort based on this pivot selection technique. Such inputs could potentially be exploited by an aggressor, for example by sending such a list to an Internet server for sorting as a denial of service attack.

You are seeing a side effect of the median-of-3 pivot selection algorithm.
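
To make this concrete, here is a minimal sketch of a quicksort that picks a median-of-3 pivot and then partitions around the pivot's value. It is not the framework's source code: the names MedianOfThreeSketch, LoggingComparer and SwapIfGreater are hypothetical, and ordinal string comparison is assumed for simplicity. In this sketch the pivot stays in the array while the scans run, so once a scan reaches the pivot's own slot, the comparer is asked to compare the pivot with itself.

```csharp
using System;
using System.Collections.Generic;

// Illustrative sketch only; not the actual Array.Sort implementation.
static class MedianOfThreeSketch
{
    // Records every comparison so self-comparisons can be inspected afterwards.
    public sealed class LoggingComparer : IComparer<string>
    {
        public readonly List<string> Log = new List<string>();

        public int Compare(string x, string y)
        {
            Log.Add(x + " vs " + y);
            return string.CompareOrdinal(x, y);
        }
    }

    // Orders two slots; skips the comparison when both indexes are the same.
    static void SwapIfGreater(string[] keys, IComparer<string> comparer, int a, int b)
    {
        if (a != b && comparer.Compare(keys[a], keys[b]) > 0)
        {
            string t = keys[a]; keys[a] = keys[b]; keys[b] = t;
        }
    }

    public static void Sort(string[] keys, IComparer<string> comparer, int left, int right)
    {
        if (left >= right) return;

        int i = left, j = right;
        int middle = left + ((right - left) >> 1);

        // Median-of-3: order first, middle and last so the median lands in the middle.
        SwapIfGreater(keys, comparer, left, middle);
        SwapIfGreater(keys, comparer, left, right);
        SwapIfGreater(keys, comparer, middle, right);

        string pivot = keys[middle];
        do
        {
            // The pivot is still in the array, so either scan can stop on the
            // pivot's own slot and compare the pivot with itself.
            while (comparer.Compare(keys[i], pivot) < 0) i++;
            while (comparer.Compare(pivot, keys[j]) < 0) j--;
            if (i > j) break;
            if (i < j)
            {
                string t = keys[i]; keys[i] = keys[j]; keys[j] = t;
            }
            i++;
            j--;
        } while (i <= j);

        Sort(keys, comparer, left, j);
        Sort(keys, comparer, i, right);
    }

    static void Main()
    {
        var values = new[] { "A", "D", "C", "B", "E" };
        var comparer = new LoggingComparer();
        Sort(values, comparer, 0, values.Length - 1);

        foreach (string line in comparer.Log)
            Console.WriteLine(line);
    }
}
```

For the five-element input from the question, this sketch logs 19 comparisons, 6 of which are an element against itself (C, A and D, twice each). Note that the index check in SwapIfGreater does not remove them: the partition scans compare array slots against a copy of the pivot value, so avoiding the self-comparison would need an extra index test inside the innermost loop.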

Hans Passant