0

I tried to implement quick select to find the m'th smallest number in the list. When I run the program it returns the correct values sometime and incorrect values other times on the same array. What am I doing wrong her is the code

def select_mth_smallest(A, m):
    pivot = np.random.choice(A)
    # split into 3 partitions
    A1 = [] 
    A2 = []
    A3 = []
    for item in A:
        if item < pivot:
            A1.append(item)
        if item > pivot:
            A3.append(item)
        else:
            A2.append(item)
    #find where m'th largest element is and recurse accordingly
    if len(A1) >= m:
        return select_mth_smallest(A1, m)
    elif (len(A1) + len(A2)) >= m:
        return pivot
    else:
        return select_mth_smallest(A3,m - (len(A1)+len(A2)))

Here is an input where the algorithm fails.

A = [1,2,3,4,5]

select_mth_smallest(A,5)

When I repeatedly execute this above statement I get, 5(correct) and 4(incorrect) alternatingly.

One thing that particularly baffles me (I am new to python) is that why I get different return values for the function when repeated with the same input. Looks rather sporadic.. BTW I am using Jupyter

vishmay
  • 386
  • 2
  • 4
  • 15
  • 5
    Look at your `if`s and `elif`s. One of those is wrong. – user2357112 May 02 '16 at 20:08
  • Sorry, I checked it seems to be correct. Can you be more specific. Thanks – vishmay May 02 '16 at 20:12
  • 1
    Is it the mth largest or smallest? Your title, function name, and comment seem to disagree. ~~Also what happens if pivot happens to be the smallest item in the current `A`? Then `A1` will be empty, as nothing is smaller than the pivot, and you'll be calling `select_mth_smallest([], m)`, in which case [`numpy.random.choice` will raise a `ValueError`](http://docs.scipy.org/doc/numpy-1.10.0/reference/generated/numpy.random.choice.html).~~ There are a few of edge cases that you need to account for. What if the pivot is the mth smallest/largest element? Edit: misread the conditionals – KevinOrr May 02 '16 at 20:12
  • but this will not be executed since it is under the if clause and 0 is always less than a positive integer. – vishmay May 02 '16 at 20:16
  • @Prag1 yes, for some reason strikethrough is not parsed in comments – KevinOrr May 02 '16 at 20:20
  • @user2357112 Do you agree with me? Is there a bug in my code. The if elif statements appear to be correct, please correct me if I am wrong – vishmay May 02 '16 at 20:24
  • @KevinOrr what I mean by m'th smallest element is : when you sort it in ascending order it is the m'th element. Also if pivot is the m'th smallest element then I don't see a problem, the function will return after just one recursion. – vishmay May 02 '16 at 20:29
  • [Please provide an input where the algorithm fails.](http://stackoverflow.com/help/mcve) – chepner May 02 '16 at 20:33
  • @Prag1 okay, but `A2` is going to contain all elements <= `pivot`. Often, you'll be returning `pivot` even if it isn't the mth *smallest* element. – KevinOrr May 02 '16 at 20:36
  • @chepner thanks! I have edited the question including the input.. – vishmay May 02 '16 at 20:36
  • @KevinOrr all elements in A2 are equal to the pivot and according to the conditionals' logic I don't think I will be returning pivot incorrectly – vishmay May 02 '16 at 20:39
  • 1
    @Prag1 no, the elements in A2 are those that are *not greater than* the pivot. Using a debugger or adding a couple of print statements would make this evident. As user2357112 said, check your `if` statements, especially `if item > pivot`. When will the else clause be called? – KevinOrr May 02 '16 at 20:45
  • @Prag1 as for why you get different results on each invocation, that is because your function is non-deterministic. What do you think the source of randomness is? – KevinOrr May 02 '16 at 20:48
  • Got it ! thanks. I should have written an elif item > pivot. Thanks a lot. – vishmay May 02 '16 at 20:52

1 Answers1

2

You are adding some items to multiple partitions.

    if item < pivot:
        A1.append(item)
    if item > pivot:
        A3.append(item)
    else:
        A2.append(item)

A1 is the set of items less than the pivot. A3 is the set of items greater than the pivot. A2, however, is the set of items less than or equal to the pivot, because the 2nd if statement executes for all items, and one or the other branch will execute.

You want one, single if statement with an elif and else clause here.

    if item < pivot:
        A1.append(item)
    elif item > pivot:
        A3.append(item)
    else:
        A2.append(item)
chepner
  • 497,756
  • 71
  • 530
  • 681