
When selecting the kth smallest element (k is 0-indexed in the code below) from an unsorted array, the average number of compares comes out to be C_N = 2N + 2k ln(N/k) + 2(N - k) ln(N/(N - k)) (proposition taken from Algorithms 1 by Prof. Robert Sedgewick). The implementation is a randomized quickselect.

public static Comparable select(Comparable[] a, int k)
{
  StdRandom.shuffle(a);
  int lo = 0, hi = a.length - 1;
  while (hi > lo)
  {
    int j = partition(a, lo, hi);
    if (j < k) lo = j + 1;
    else if (j > k) hi = j - 1;
    else return a[k];
  }
  return a[k];
}

The partition subroutine is the quicksort partition implementation, where the lo element is used as the pivot. It is shown below:

private static int partition(Comparable[] a, int lo, int hi)
{
  int i = lo, j = hi+1;
  while (true)
  {
    while (less(a[++i], a[lo]))
      if (i == hi) break;
    while (less(a[lo], a[--j]))
      if (j == lo) break;
    if (i >= j) break;
    exch(a, i, j);
  }
  exch(a, lo, j);
  return j;
}
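
The two snippets above call less and exch, which are not shown, and StdRandom.shuffle, which comes from the course library. As a minimal sketch, assuming less is a compareTo-based comparison and exch is an in-place swap, stand-ins could look like this (the class name SelectHelpers is only for illustration):

```java
// Hypothetical stand-ins for the helpers used by select() and partition().
// Assumption: less() compares with compareTo, exch() swaps two array slots.
class SelectHelpers {

    // True when v is strictly smaller than w under the natural ordering.
    @SuppressWarnings({"rawtypes", "unchecked"})
    static boolean less(Comparable v, Comparable w) {
        return v.compareTo(w) < 0;
    }

    // Swap a[i] and a[j] in place.
    static void exch(Object[] a, int i, int j) {
        Object tmp = a[i];
        a[i] = a[j];
        a[j] = tmp;
    }

    public static void main(String[] args) {
        Integer[] a = {3, 1, 2};
        exch(a, 0, 1);
        System.out.println(java.util.Arrays.toString(a) + " " + less(a[0], a[1]));
    }
}
```

Without the course library, java.util.Collections.shuffle(java.util.Arrays.asList(a)) is one possible replacement for StdRandom.shuffle(a), since the list is a view over the array.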

I see the run time for partition would be (N+1). Assuming the array is shuffled and each partition cuts the array in half, the recurrence would come out to be (N + N/2 + N/4 + ...), which sums to less than 2N, so O(N) time can be assumed. This is a very optimistic scenario.

More likely, the partition splits the array into subproblems such as (C_0, C_{N-1}) or (C_1, C_{N-2}), and so on, each split occurring with probability 1/N. Depending on k, either the left side or the right side of the split is then selected (that is, either C_0 or C_{N-1} if the first case occurred, with probability 1/(2N)). C_N is then replaced by the cost for the new length, and the same steps repeat until j == k.
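
To make this concrete, here is a sketch of the recurrence I think this setup leads to. The assumptions are mine: k is 0-indexed, the keys are distinct, the pivot lands at each position j = 0, ..., N-1 with probability 1/N, and a partition of N items costs N+1 compares. C_{N,k} denotes the expected compares to select rank k from N items.

```latex
C_{N,k} = (N+1)
  + \frac{1}{N}\left[
      \sum_{j=0}^{k-1} C_{N-1-j,\;k-1-j}
      + \sum_{j=k+1}^{N-1} C_{j,\;k}
    \right],
\qquad C_{1,0} = 0 .
```

The first sum covers pivots that land left of k (continue in the right part, which has N-1-j items, with rank k-1-j); the second covers pivots that land right of k (continue in the left part, which has j items, with rank k unchanged). The case j = k contributes nothing further.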

Could someone help in deriving the recurrence relation and the limits it would run under, and ultimately arrive at the above equation for C_N?
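
As a sanity check on whatever recurrence comes out, here is a minimal sketch that tabulates one candidate numerically and prints it next to the quoted formula. The assumptions are mine, not from the course: k is 0-indexed, keys are distinct, the pivot position j is uniform over 0..n-1, each partition of n items costs n+1 compares, a pivot left of k continues in the right part (n-1-j items, rank k-1-j), and a pivot right of k continues in the left part (j items, rank k). The class and method names are only for illustration.

```java
// Sketch only: tabulates an assumed recurrence for the expected compares of quickselect.
class QuickselectRecurrence {

    // c[n][k] = expected compares to select rank k (0-indexed) from n shuffled distinct keys,
    // assuming a partition of n items costs n + 1 compares and the pivot position is uniform.
    static double[][] expectedCompares(int maxN) {
        double[][] c = new double[maxN + 1][];
        c[0] = new double[0];
        for (int n = 1; n <= maxN; n++) {
            c[n] = new double[n];
            if (n == 1) continue; // while (hi > lo) never runs: zero compares
            for (int k = 0; k < n; k++) {
                double sum = 0.0;
                // Pivot lands left of k: continue in the right part.
                for (int j = 0; j < k; j++) sum += c[n - 1 - j][k - 1 - j];
                // Pivot lands right of k: continue in the left part.
                for (int j = k + 1; j < n; j++) sum += c[j][k];
                c[n][k] = (n + 1) + sum / n;
            }
        }
        return c;
    }

    // The formula quoted in the question; only defined here for 0 < k < n.
    static double approximation(int n, int k) {
        return 2.0 * n
             + 2.0 * k * Math.log((double) n / k)
             + 2.0 * (n - k) * Math.log((double) n / (n - k));
    }

    public static void main(String[] args) {
        int n = 200;
        double[][] c = expectedCompares(n);
        for (int k : new int[] {1, 50, 100, 150, 199}) {
            System.out.printf("k=%d recurrence=%.1f formula=%.1f%n",
                              k, c[n][k], approximation(n, k));
        }
    }
}
```

The table takes O(N^3) time to fill, so it is only meant for small N. If the two columns stay close as N grows, that would suggest the quoted C_N is a large-N approximation of this recurrence rather than its exact solution.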

Note: I have seen other answers on SO, but they consider the rosy picture of splitting in half, which is very unlikely to happen, like this one: Average Runtime of Quickselect

Rahul
  • Why doesn't the other answer satisfy you? Splitting into half is not actually required for the answer; it is only average-case complexity, not worst-case complexity, so the reasoning is sound. – Nikos M. Jun 06 '20 at 06:10
  • The second-most upvoted answer to the linked question looks correct. The accepted answer is, as you noted, flat-out wrong in its analysis. I'm sure the proof is in Sedgewick though... – Paul Hankin Jun 06 '20 at 10:36
  • Does this answer your question? [Average Runtime of Quickselect](https://stackoverflow.com/questions/5945193/average-runtime-of-quickselect) – Paul Hankin Jun 06 '20 at 10:38

0 Answers