I have a numpy array:

foo = array([3, 1, 4, 0, 1, 0])

I want the indices of the top 3 items. Calling

foo.argsort()[::-1][:3]

returns

array([2, 0, 4])

Notice that foo[1] and foo[4] are equal. Because the result of argsort() is reversed with [::-1], the tie here goes to the item that appears later in the array, i.e. index 4.

For my application I can't have the tie-breaking always favor the end of the array, so how can I implement a random tie break? That is, half the time I would get array([2, 0, 4]), and the other half I would get array([2, 0, 1]).

ali_m
BoltzmannBrain
  • Use `lexsort` or add random values for each to hack it, see [how to make argsort result to be random between equal values?](http://stackoverflow.com/questions/20197990/how-to-make-argsort-result-to-be-random-between-equal-values) – Eric Tsui Jul 11 '15 at 01:52
  • Thanks, I ended up going with `numpy.lexsort((numpy.random.random(foo.size), foo))[::-1][:3]` – BoltzmannBrain Jul 11 '15 at 03:40
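
For reference, the `lexsort` one-liner from the comment above can be wrapped up roughly as follows. This is only a sketch: the function name `top_k_random_ties` and the `rng` parameter are made up for illustration, and it uses `numpy.random.default_rng` (NumPy 1.17+) rather than `numpy.random.random`.

```python
import numpy as np

def top_k_random_ties(values, k, rng=None):
    """Return indices of the k largest values, ordering tied values randomly.

    Hypothetical helper for illustration; not part of NumPy.
    """
    rng = np.random.default_rng() if rng is None else rng
    values = np.asarray(values)
    # lexsort treats the LAST key as the primary one, so `values` decides
    # the order and the random numbers only matter among equal values.
    tie_breaker = rng.random(values.size)
    order = np.lexsort((tie_breaker, values))
    # lexsort is ascending, so reverse it and keep the first k indices.
    return order[::-1][:k]

foo = np.array([3, 1, 4, 0, 1, 0])
print(top_k_random_ties(foo, 3))  # third index is either 1 or 4
```

The random key is regenerated on every call, so repeated calls should return index 1 and index 4 in the last slot about equally often.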

1 Answer

Here's one approach:

Use numpy.unique to both sort the array and remove duplicate items. Pass the return_inverse argument to get the indices into the sorted array that give the values of the original array. Then, you can get all of the indices of the tied items by finding the indices of the inverse array whose values are equal to the index into the unique array for that item.

For example:

import numpy as np

foo = np.array([3, 1, 4, 0, 1, 0])
foo_unique, foo_inverse = np.unique(foo, return_inverse=True)

# Put largest items first
foo_unique = foo_unique[::-1]
foo_inverse = len(foo_unique) - 1 - foo_inverse

foo_top3 = foo_unique[:3]

# Get the indices into foo of the top item.
# nonzero() returns a tuple of arrays, so take its first element.
first_indices = (foo_inverse == 0).nonzero()[0]

# Choose one at random
first_random_idx = np.random.choice(first_indices)

# Same for the second-largest item
second_indices = (foo_inverse == 1).nonzero()[0]
second_random_idx = np.random.choice(second_indices)

# And so on...

numpy.unique is implemented using argsort, so a glance at its implementation might suggest a simpler approach.
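
The "and so on" step in the example above could be written as a loop, roughly like this. It is a sketch under a few assumptions: `one_index_per_rank` and its `rng` parameter are hypothetical names, and `np.flatnonzero` is used in place of `.nonzero()[0]`.

```python
import numpy as np

def one_index_per_rank(values, k, rng=None):
    """Pick one random index for each of the k largest DISTINCT values.

    Hypothetical helper for illustration; not part of NumPy.
    """
    rng = np.random.default_rng() if rng is None else rng
    values = np.asarray(values)
    uniq, inverse = np.unique(values, return_inverse=True)
    # Flip the inverse so that rank 0 is the largest distinct value.
    ranks = len(uniq) - 1 - inverse
    picks = []
    for rank in range(min(k, len(uniq))):
        # All positions in `values` that hold the value with this rank.
        candidates = np.flatnonzero(ranks == rank)
        picks.append(rng.choice(candidates))
    return np.array(picks)

foo = np.array([3, 1, 4, 0, 1, 0])
print(one_index_per_rank(foo, 3))  # third index is either 1 or 4
```

Note that this returns one index per distinct value, so asking for the top 4 of `foo` would give an index of a 0 in the last slot rather than both indices of the tied 1s. If both tied items should be able to appear, the `lexsort` approach from the linked question is the better fit.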

codewarrior
  • In fact, why did I even bother mentioning `numpy.unique`? You can get all of the first-place ties with `(foo == foo[foo.argsort()[::-1][0]]).nonzero()`. – codewarrior Jul 11 '15 at 01:54
  • Yeah, my answer is really dumb compared to the one in [how to make argsort result to be random between equal values?](http://stackoverflow.com/questions/20197990/how-to-make-argsort-result-to-be-random-between-equal-values) – codewarrior Jul 11 '15 at 02:16
  • Actually, your answer is also a good attempt. The essence is almost the same: make use of randomness. – Eric Tsui Jul 11 '15 at 03:55