
I'm trying to code the optimal algorithm to select the i-th element of a list by size (the i-th order statistic, counting from the smallest). For example, if array = [4,3,5,7] and I search for the 2nd one, the function should return 4.

I'm assuming here that the list contains only distinct numbers.

Here is the problem:

The function sometimes returns None.

And here is my code (I think the first function works well).

from random import shuffle

def partition(array, leftend, rightend, pivot):
    """ I tested this one and it should work fine """
    i = leftend
    pivotindex = array.index(pivot)  # only works if all values in array unique
    array[pivotindex] = array[leftend]
    array[leftend] = pivot
    for j in range(leftend+1, rightend):
        if array[j] < pivot:
            temp = array[j]
            array[j] = array[i]
            array[i] = temp
            i += 1
    pivotindex = array.index(pivot)  # only works if all values in array unique
    leftendval = array[pivotindex]   # Take the value of the pivot
    array[pivotindex] = array[i]
    array[i] = leftendval
    return array

def RSelect(array, n, statistic_order):
    """ list * int * int
        statistic_order = the i th element i'm searching for """
    new_array = []                  # is used at the end of the function
    if n == 1:
        return array[0]
    array_temp = array              # Allows to have a shuffled list and
    shuffle(array_temp)
    pivot = array_temp[0]           # Allows to have a random pivot
    partition(array,0,n,pivot)
    j = array.index(pivot)


    if j == statistic_order:
        return pivot


    elif j > statistic_order:
        for k in range(0,j):
            new_array.append(array[k])
        RSelect(new_array,j,statistic_order)


    elif j < statistic_order:
        for k in range(j+1,n):
            new_array.append(array[k])
        RSelect(new_array,(n-j)-1,statistic_order-j)
  • The None values come from your recursive calls to RSelect: you must return their results. But there is another problem in your code, I'm investigating – Thibault D. May 23 '18 at 13:01
  • If you have `4, 3, 5, 7` and you look for the second biggest one, why is it `4` and not `5` ? – ChatterOne May 23 '18 at 13:06

2 Answers


Well, a few things were wrong:

  • A recursive function must return the result of its recursive calls, in every case!
  • When j < statistic_order, you recursively work on the right part of the array, so you discard j+1 numbers (the left part and the pivot), not j. Remember that indices begin at 0 in Python, not 1!
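
Both points can be seen in isolation in a minimal sketch. The names `last_broken` and `last_fixed` and the sample list are made up for illustration and are not part of the code above.

```python
def last_broken(items):
    """Recursive call without `return`: the result is lost."""
    if len(items) == 1:
        return items[0]
    last_broken(items[1:])          # value computed, then dropped -> None


def last_fixed(items):
    """Same recursion, but the result is propagated back up."""
    if len(items) == 1:
        return items[0]
    return last_fixed(items[1:])


# Index shift when recursing on the right part of a partitioned list:
# the pivot sits at index j, so array[j+1:] drops j+1 elements
# (indices 0..j), and a 0-based rank k becomes k - (j + 1) there.
array = [1, 0, 2, 4, 3]             # already partitioned around pivot 2
j = array.index(2)                  # j == 2
k = 3                               # 0-based rank we want (value 3)
right = array[j + 1:]               # [4, 3]
new_k = k - j - 1                   # 0: the smallest element of `right`
```

Here `last_broken([1, 2, 3])` gives `None` while `last_fixed([1, 2, 3])` gives `3`, and the element of rank `new_k` in `right` is the same value as the element of rank `k` in `array`.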

I also changed a few other things, such as removing useless parameters and replacing for loops with slices.

Here is the final code; check the changes to make sure you understand them.

RSelect:

def RSelect(array, statistic_order):
    """ list * int
    statistic_order = 0-based rank of the element I'm searching for """
    n = len(array)
    if n == 1:
        return array[0]
    array_temp = array              # Alias, not a copy: shuffling it shuffles array itself
    shuffle(array_temp)
    pivot = array_temp[0]           # First element after the shuffle, i.e. a random pivot
    array = partition(array,0,n,pivot)  # Changes here, does not impact the result, but for readability
    j = array.index(pivot)

    # print(array, j, statistic_order, end = '\t')
    if j == statistic_order:
        return pivot

    elif j > statistic_order:
        assert j > 0
        # print((array[0:j]), pivot)
        return RSelect(array[0:j],statistic_order)  # Changes here : return

    elif j < statistic_order:
        assert j+1 < n
        # print(pivot, (array[j+1:n]))
        return RSelect(array[j+1:n],statistic_order-j-1)  # Changes here : return, -j-1

main:

if __name__ == "__main__":
    from sys import argv
    # 0-based rank to look for; defaults to 0 (the smallest) when no argument is given
    n = int(argv[1]) if len(argv) > 1 else 0
    arr = [2, 1, 3, 5, 4]
    print(RSelect(arr[:], n))

There is another algorithm for this purpose, also in O(n): see this

EDIT: typos corrected & correction about complexity

Thibault D.
  • Thanks! I will check it, but it looks clear. But why an assert? I also don't understand the main. I am still a beginner in Python :-) –  May 23 '18 at 14:26
  • No, this algorithm has a worst-case complexity O(n²), and average O(n). Shuffling the whole array instead of picking the pivot in a random position is a waste. –  May 23 '18 at 15:19
  • @nolwenb Assert is just an instruction to stop the program if the statement is false. You can remove them, they were just for me to make sure there won't be any empty list passed to the function. – Thibault D. May 23 '18 at 20:42
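
The remark above about picking the pivot at a random position, instead of shuffling the whole list, could look roughly like the sketch below. It is not the code from the answer: the name `rselect_random_pivot` is hypothetical, the partition is a standard Lomuto scheme, and the rank `k` is 0-based. The complexity stays as stated in the comment (average O(n), worst case O(n²)).

```python
from random import randrange


def rselect_random_pivot(array, k):
    """Return the element of 0-based rank k (k-th smallest) of array.

    Works on a copy, so the caller's list is left untouched.
    """
    items = list(array)
    lo, hi = 0, len(items) - 1          # current search window, inclusive
    while True:
        if lo == hi:
            return items[lo]
        # Pick one random position instead of shuffling everything.
        p = randrange(lo, hi + 1)
        items[p], items[hi] = items[hi], items[p]
        pivot = items[hi]
        # Lomuto partition: items[lo:store] end up smaller than the pivot.
        store = lo
        for i in range(lo, hi):
            if items[i] < pivot:
                items[i], items[store] = items[store], items[i]
                store += 1
        items[store], items[hi] = items[hi], items[store]
        if k == store:
            return items[store]
        if k < store:
            hi = store - 1              # keep searching the left part
        else:
            lo = store + 1              # keep searching the right part
```

Because `k` is kept as an absolute index into `items`, no `-j-1` adjustment is needed when moving to the right part.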

The code works fine, but the result is still counted from 0. For example, if arr = [2,3,5,6] and I ask for RSelect(arr,4,2), the answer is 5 and not 3. I don't know why.

Here is the updated code:

from random import shuffle

def partition(array, leftend, rightend, pivot):
    i = leftend
    pivotindex = array.index(pivot)  # only works if all values in array unique
    array[pivotindex] = array[leftend]
    array[leftend] = pivot
    for j in range(leftend+1, rightend):
        if array[j] < pivot:
            temp = array[j]
            array[j] = array[i]
            array[i] = temp
            i += 1
    pivotindex = array.index(pivot)  # only works if all values in array unique
    leftendval = array[pivotindex]   # Take the value of the pivot
    array[pivotindex] = array[i]
    array[i] = leftendval


def RSelect(array, n, statistic_order):
    """ list * int * int
        statistic_order = the i th element i'm searching for """
    if n == 1:
        return array[0]

    array_temp = array              # Allows to have a shuffled list
    shuffle(array_temp)
    pivot = array_temp[0]           # Allows to have a random pivot
    partition(array,0,n,pivot)
    j = array.index(pivot)


    if j == statistic_order:
        return pivot


    elif j > statistic_order:
        return RSelect(array[0:j],j,statistic_order)


    elif j < statistic_order:
        return RSelect(array[j+1:n],(n-j)-1,statistic_order-j-1)
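
A likely explanation: `j` comes from `array.index(pivot)`, which is a 0-based list index, and it is compared directly with `statistic_order`, so `statistic_order` is treated as 0-based too. One way to get 1-based behaviour is to subtract 1 before calling the function, for example `RSelect(arr, 4, 2 - 1)`. The sketch below shows the idea with a stand-in selector; `select_zero_based` and `select_one_based` are hypothetical names, and the stand-in simply sorts the list rather than running RSelect.

```python
def select_zero_based(array, k):
    """Stand-in for RSelect: returns the element of 0-based rank k."""
    return sorted(array)[k]


def select_one_based(array, i):
    """Return the i-th smallest element, counting from 1."""
    if not 1 <= i <= len(array):
        raise ValueError("i must be between 1 and len(array)")
    return select_zero_based(array, i - 1)


arr = [2, 3, 5, 6]
print(select_zero_based(arr, 2))   # 5: rank 2 counted from 0
print(select_one_based(arr, 2))    # 3: second smallest, counted from 1
```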