When selecting the kth largest element from an unsorted array, the running time (average number of compares) comes out to be C_N = 2N + 2k ln(N/k) + 2(N - k) ln(N/(N - k)) (proposition taken from Algorithms 1 by Prof. Robert Sedgewick). The implementation is a randomized quickselect:
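    public static Comparable select(Comparable[] a, int k)
    {
        StdRandom.shuffle(a);                  // random order guards against bad inputs
        int lo = 0, hi = a.length - 1;
        while (hi > lo)
        {
            int j = partition(a, lo, hi);      // a[j] is now in its final position
            if      (j < k) lo = j + 1;        // target lies in the right part
            else if (j > k) hi = j - 1;        // target lies in the left part
            else            return a[k];
        }
        return a[k];
    }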
The partition subroutine is the quicksort partition implementation, where the element at lo is used as the pivot. It is shown below:
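    private static int partition(Comparable[] a, int lo, int hi)
    {
        int i = lo, j = hi + 1;
        while (true)
        {
            while (less(a[++i], a[lo]))        // scan right for an item >= pivot
                if (i == hi) break;
            while (less(a[lo], a[--j]))        // scan left for an item <= pivot
                if (j == lo) break;
            if (i >= j) break;                 // pointers crossed: partition done
            exch(a, i, j);
        }
        exch(a, lo, j);                        // put the pivot into position j
        return j;
    }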
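To make the formula concrete, here is a minimal sketch that counts compares empirically and prints the average next to the value of C_N. It is an illustration under assumptions, not the course's code: it works on int[] instead of Comparable[], replaces StdRandom.shuffle with a hand-written Fisher-Yates shuffle on java.util.Random, and the class name QuickSelectCompares, the compares counter, and the helpers formula and averageCompares are all hypothetical names introduced here. The formula is only evaluated for 0 < k < N, since ln(N/k) is undefined at k = 0.

```java
import java.util.Random;

public class QuickSelectCompares {
    // Hypothetical counter (not in the original code): tallies calls to less().
    static long compares = 0;

    static boolean less(int x, int y) {
        compares++;
        return x < y;
    }

    static void exch(int[] a, int i, int j) {
        int t = a[i]; a[i] = a[j]; a[j] = t;
    }

    // Same partitioning scheme as in the question, on int[] for simplicity.
    static int partition(int[] a, int lo, int hi) {
        int i = lo, j = hi + 1;
        while (true) {
            while (less(a[++i], a[lo]))
                if (i == hi) break;
            while (less(a[lo], a[--j]))
                if (j == lo) break;
            if (i >= j) break;
            exch(a, i, j);
        }
        exch(a, lo, j);
        return j;
    }

    // Fisher-Yates shuffle, standing in for StdRandom.shuffle (an assumption).
    static void shuffle(int[] a, Random rnd) {
        for (int i = a.length - 1; i > 0; i--) {
            exch(a, i, rnd.nextInt(i + 1));
        }
    }

    // Returns the item that would sit at index k if the array were sorted.
    static int select(int[] a, int k, Random rnd) {
        shuffle(a, rnd);
        int lo = 0, hi = a.length - 1;
        while (hi > lo) {
            int j = partition(a, lo, hi);
            if      (j < k) lo = j + 1;
            else if (j > k) hi = j - 1;
            else            return a[k];
        }
        return a[k];
    }

    // C_N = 2N + 2k ln(N/k) + 2(N - k) ln(N/(N - k)), valid for 0 < k < N.
    static double formula(int n, int k) {
        return 2.0 * n
             + 2.0 * k * Math.log((double) n / k)
             + 2.0 * (n - k) * Math.log((double) n / (n - k));
    }

    // Average number of compares over several shuffled runs on distinct keys.
    static double averageCompares(int n, int k, int trials, long seed) {
        Random rnd = new Random(seed);
        int[] a = new int[n];
        for (int i = 0; i < n; i++) a[i] = i;
        compares = 0;
        for (int t = 0; t < trials; t++) select(a, k, rnd);
        return (double) compares / trials;
    }

    public static void main(String[] args) {
        int n = 10000, trials = 2000;
        for (int k : new int[] {n / 10, n / 4, n / 2}) {
            System.out.printf("k=%d measured=%.1f formula=%.1f%n",
                    k, averageCompares(n, k, trials, 42L), formula(n, k));
        }
    }
}
```

Running main prints the measured average and the formula value side by side for a few values of k. The two should only agree up to lower-order terms, because the formula is an asymptotic statement and the measurement is a finite sample.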
As far as I can see, partition costs about N + 1 compares on a subarray of length N. If I assume the array is shuffled and every partition cuts the subarray in half, the total cost would be N + N/2 + N/4 + ..., so O(N) time can be assumed. But this is a very optimistic scenario.
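Written out, the optimistic estimate is a geometric series. This is only a worked version of the idealized halving assumption stated above, ignoring the +1 per partition:

```latex
N + \frac{N}{2} + \frac{N}{4} + \cdots
  \;=\; N \sum_{i \ge 0} 2^{-i}
  \;\le\; 2N .
```

For comparison, substituting k = N/2 into the quoted formula gives C_N = 2N + 2N ln 2, which is roughly 3.39N. So the halving picture underestimates the quoted cost for the median, although both are linear in N.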
More likely, the pivot splits the array into two parts whose costs are (C_0, C_{N-1}) or (C_1, C_{N-2}) and so on, each split occurring with probability 1/N. Depending on k, either the left or the right part of the split is then selected (that is, either C_0 or C_{N-1} if the first case occurred, with probability 1/(2N)). C_N would then be replaced by the cost for the new length, and the same steps repeated until j == k.
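For concreteness, the split described above can be written as a candidate recurrence. This is a sketch under assumptions that are not taken from the course: keys are distinct, the pivot's final index j is uniform on 0..N-1, the subarray that is kept is still in random order, each partition costs N + 1 compares, and C_{N,k} is notation introduced here for the expected compares when selecting index k (0-based) from N items.

```latex
C_{N,k} \;=\; (N + 1)
  \;+\; \frac{1}{N} \left[
      \sum_{j=0}^{k-1} C_{N-1-j,\; k-1-j}
      \;+\; \sum_{j=k+1}^{N-1} C_{j,\; k}
  \right],
\qquad C_{N,k} = 0 \ \text{for } N \le 1 .
```

The first sum covers pivots landing left of k (continue in the right part, of size N - 1 - j), the second covers pivots landing right of k (continue in the left part, of size j), and the case j = k contributes no further cost. Note that under these assumptions the side that is kept is determined by j and k rather than chosen with its own probability.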
Could someone help me derive the recurrence relation and the limits it runs over, and ultimately arrive at the above expression for C_N?
Note: I have seen other answers on SO, but they assume the rosy picture of splitting the array in half every time, which is very unlikely to happen. For example, this one: Average Runtime of Quickselect