I am trying to implement the Hoare partition scheme as part of a Quickselect algorithm, but it gives me a different answer each time I run it.

This is the findKthBest function, which should find the Kth largest number in a list, given the list (data) and the index range to search (low = 0, high = 4 for a list of 5 elements):

def findKthBest(k, data, low, high):
    # choose random pivot
    pivotindex = random.randint(low, high)

    # move the pivot to the end
    data[pivotindex], data[high] = data[high], data[pivotindex]

    # partition
    pivotmid = partition(data, low, high, data[high])

    # move the pivot back
    data[pivotmid], data[high] = data[high], data[pivotmid]

    # continue with the relevant part of the list
    if pivotmid == k:
        return data[pivotmid]
    elif k < pivotmid:
        return findKthBest(k, data, low, pivotmid - 1)
    else:
        return findKthBest(k, data, pivotmid + 1, high)

The function partition() takes four arguments:

  • data (a list of, for example, 5 elements)
  • l (the start position of the relevant part of the list, for example 0)
  • r (the end position of the relevant part of the list, which is also where the pivot is placed, for example 4)
  • pivot (the value of the pivot)

def partition(data, l, r, pivot):
    while True:
        while data[l] < pivot:
            #statistik.nrComparisons += 1
            l = l + 1
        r = r - 1    # skip the pivot
        while r != 0 and data[r] > pivot:
            #statistik.nrComparisons += 1
            r = r - 1
        if r > l:
            data[r], data[l] = data[l], data[r]
        return r

Right now I get a different result on each run instead of the same result every time, and the recursion does not terminate reliably (sometimes it ends with a maximum recursion depth error). What am I doing wrong?

Konrad Rudolph
  • Hoare partition scheme normally uses the middle element as pivot. The first scans depend on element == pivot to stop them if there are no "out of place" elements detected. Otherwise bounds checks to prevent scanning beyond the ends of a sub-array are needed. – rcgldr Apr 10 '19 at 12:44
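
For reference, the scheme described in this comment can be sketched roughly as follows. This is only a minimal illustration of the textbook Hoare partition with the middle element as pivot, not the code from the question; the names hoare_partition, lo and hi are made up for the example. Because the pivot value is guaranteed to be somewhere in the range, both scans stop at an element equal to the pivot at the latest, so no explicit bounds checks are needed.

```python
def hoare_partition(data, lo, hi):
    """Partition data[lo..hi] in place (hypothetical helper for illustration).

    Returns an index j such that every element of data[lo..j] is <= every
    element of data[j+1..hi]. The pivot itself may end up on either side.
    """
    pivot = data[(lo + hi) // 2]   # middle element as the pivot value
    i, j = lo - 1, hi + 1
    while True:
        i += 1
        while data[i] < pivot:     # stops at an element >= pivot at the latest
            i += 1
        j -= 1
        while data[j] > pivot:     # stops at an element <= pivot at the latest
            j -= 1
        if i >= j:                 # the indices have met or crossed
            return j
        data[i], data[j] = data[j], data[i]
```

Note that this variant returns a split point rather than the final position of the pivot, so code that swaps the pivot back into place afterwards (as findKthBest does) would need to be adapted.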

1 Answer

First, there appears to be a mistake in the function partition().

If you compare your code carefully with the pseudocode on Wikipedia, you will find the difference. The function should be:

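def partition(data, l, r, pivot):
    r = r - 1    # skip the pivot, which sits at the end
    while True:
        # the pivot at the end stops this scan, so l cannot run off the list
        while data[l] < pivot:
            #statistik.nrComparisons += 1
            l = l + 1
        # stop at l so that r cannot run past the start of the range
        while r > l and data[r] >= pivot:
            #statistik.nrComparisons += 1
            r = r - 1
        if l >= r:
            return l    # final position of the pivot

        data[r], data[l] = data[l], data[r]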

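Second, k has to be adjusted when the search continues in the left part. For example:

  • You get an array data = [1, 0, 2, 4, 3] with pivotmid = 2 after partitioning, so the pivot 2 is the 3rd largest value
  • You want to find the 4th largest value (k=4), which is 1

The part of data passed to the next findKthBest() call is [1, 0], and the three largest values have already been ruled out.
Therefore, the next findKthBest() should find the largest value (k = 4 - 3 = 1) of [1, 0]:

def findKthBest(k, data, low, high):
    ......

    # continue with the relevant part of the list
    # rank of the pivot among the largest values in data[low..high] (1 = largest)
    rank = high - pivotmid + 1
    if rank == k:
        return data[pivotmid]
    elif k > rank:
        #Corrected: skip the rank values that are too large
        return findKthBest(k - rank, data, low, pivotmid - 1)
    else:
        return findKthBest(k, data, pivotmid + 1, high)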
  • Thanks! However, during the while loops l and r take values that lie outside the list, so I get an IndexError when checking data[l] < pivot or data[r] > pivot – RedSquare99 Apr 11 '19 at 11:02
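
To tie the points in this thread together, here is one possible self-contained sketch that avoids both the IndexError and the recursion limit. It is not the original poster's code: the name kth_largest, the convention that k = 1 means the maximum, and the choice to work on a copy of the list are all assumptions made for the example. The loop replaces the recursion, the pivot parked at the end of the range stops the left scan, and the r > l test stops the right scan.

```python
import random


def kth_largest(data, k):
    """Return the k-th largest element of data (k = 1 is the maximum).

    Hypothetical helper: works on a copy, so the caller's list is untouched.
    """
    if not 1 <= k <= len(data):
        raise ValueError("k must be between 1 and len(data)")
    data = list(data)
    low, high = 0, len(data) - 1
    while True:
        # Move a randomly chosen pivot to the end of the current range.
        p = random.randint(low, high)
        data[p], data[high] = data[high], data[p]
        pivot = data[high]

        # Two-pointer partition of data[low..high-1] around the pivot.
        l, r = low, high - 1
        while True:
            while data[l] < pivot:             # the pivot at data[high] stops this scan
                l += 1
            while r > l and data[r] >= pivot:  # r > l keeps r inside the range
                r -= 1
            if l >= r:
                break
            data[l], data[r] = data[r], data[l]

        # l is the pivot's final position; move the pivot there.
        data[l], data[high] = data[high], data[l]

        rank = high - l + 1   # 1-based rank of the pivot among the largest in data[low..high]
        if k == rank:
            return data[l]
        if k > rank:          # the answer is among the smaller elements on the left
            k -= rank
            high = l - 1
        else:                 # the answer is among the larger elements on the right
            low = l + 1


print(kth_largest([1, 0, 2, 4, 3], 4))  # 1
```

The pivot is still chosen at random, but the returned value no longer depends on that choice, so repeated runs give the same result.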