
I'd like to change the order of the column elements in

a = np.asarray(
[[0,1,1,2,2,2,2,3,3,3,4,4,4,4,4,4],
 [4,0,3,0,1,2,5,1,2,5,3,4,6,6,7,7],
 [0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0],
 [0,1,0,0,1,1,1,0,0,0,1,1,0,1,0,1]]
)

based on the values of rows 1-3 (0-based). My current solution looks like this:

a[:, a.transpose()[:, 1].argsort(axis=0)]

array([[1, 2, 2, 3, 2, 3, 1, 4, 0, 4, 2, 3, 4, 4, 4, 4],
       [0, 0, 1, 1, 2, 2, 3, 3, 4, 4, 5, 5, 6, 6, 7, 7],
       [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
       [1, 0, 1, 0, 1, 0, 0, 1, 0, 1, 1, 0, 0, 1, 0, 1]])

which is fine, except that I'd also like to include rows 2-3 in the sort (lexicographically). Ideally, the last row of the result would be [0, 1, 0, 1, ..., 0, 1] (row 2, which is all zeros, should also be taken into account, but in this example it makes no difference because all of its values are equal).

orange

1 Answer


You need numpy.lexsort, which works like argsort but sorts on multiple keys. Given several key arrays, it returns the indices that put them in lexicographic order. From the documentation:

Given multiple sorting keys, which can be interpreted as columns in a spreadsheet, lexsort returns an array of integer indices that describes the sort order by multiple columns. The last key in the sequence is used for the primary sort order, the second-to-last key for the secondary sort order, and so on. The keys argument must be a sequence of objects that can be converted to arrays of the same shape. If a 2D array is provided for the keys argument, its rows are interpreted as the sorting keys and sorting is according to the last row, second last row etc.

a[:, np.lexsort(a[:0:-1])]
#array([[2, 1, 3, 2, 3, 2, 1, 4, 0, 4, 3, 2, 4, 4, 4, 4],
#       [0, 0, 1, 1, 2, 2, 3, 3, 4, 4, 5, 5, 6, 6, 7, 7],
#       [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
#       [0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1]])
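
For reference, the same call can also be written with the keys listed explicitly, which makes the priority order easier to see. This is a minimal self-contained sketch using the array from the question; the names `order` and `result` are illustrative and not part of the original answer.

```python
import numpy as np

a = np.asarray(
    [[0,1,1,2,2,2,2,3,3,3,4,4,4,4,4,4],
     [4,0,3,0,1,2,5,1,2,5,3,4,6,6,7,7],
     [0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0],
     [0,1,0,0,1,1,1,0,0,0,1,1,0,1,0,1]]
)

# Keys go from lowest to highest priority:
# row 3 is the last tie-breaker, then row 2, and row 1 is the primary key.
order = np.lexsort((a[3], a[2], a[1]))

# Apply the same column permutation to every row, including row 0.
result = a[:, order]

# The slice a[:0:-1] yields rows 3, 2, 1 in that order, so both forms agree.
assert np.array_equal(order, np.lexsort(a[:0:-1]))
print(result)
```

Row 0 is not used as a key here; it is only carried along by the column permutation.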
Psidom
  • Awesome - works great. Thanks! What's `a[:0:-1]` for (i.e. why is it using a reversed array)? – orange Mar 15 '17 at 14:12
  • It's reversed because, conceptually, key importance (which key must end up strictly ordered) is the reverse of sorting precedence (which key is sorted first). The key sorted last is the primary key (the strictly ordered one). You want the second row to be your primary key, so of the three rows it has to be sorted last. – Psidom Mar 15 '17 at 14:16
  • Oh, it just seems to be the way `lexsort` expects the order of keys (last one highest priority), as your quote already states ("The last key in the sequence is used for the primary sort order, the second-to-last key for the secondary sort order, and so on..."). Thanks for clarifying. – orange Mar 16 '17 at 05:30
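
The key order discussed in these comments can be checked on a small example. This is an illustrative sketch with made-up arrays (`primary` and `secondary` are hypothetical names, not from the thread). It shows that the last key passed to `np.lexsort` decides the overall order and earlier keys only break ties.

```python
import numpy as np

primary = np.array([1, 0, 1, 0])
secondary = np.array([9, 8, 7, 6])

# The last key in the tuple is the primary sort key;
# the earlier key is only consulted where primary values are equal.
idx = np.lexsort((secondary, primary))
print(idx)             # [3 1 2 0]
print(primary[idx])    # [0 0 1 1]
print(secondary[idx])  # [6 8 7 9]

# Swapping the tuple order makes `secondary` the primary key instead.
swapped = np.lexsort((primary, secondary))
print(swapped)         # [3 2 1 0]
```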