Consider the following code. With axis=2, I expected np.unique to collapse the duplicate [1 1] into [1], but it does not. Why doesn't it perform the unique operation along the third axis?

import numpy as np

arr = np.array([[[1,1], [1,1], [1,1]],
                [[7,1], [10,1], [10,1]],
                [[1,1], [1,1], [1,1]]])

print(np.unique(arr, axis=0))
print("----------------")
print(np.unique(arr, axis=1))
print("----------------")
print(np.unique(arr, axis=2))

I tried many other examples, and it still does not work on the third axis.

Jiahao Li
  • 1
    Hi! Welcome to stackoverflow. Can you please post the output that you would want on this example and explain the logic? Right now you're basically saying "numpy.unique is not doing what I want" but we don't know what you want. – Stef Dec 22 '22 at 09:03
  • 2
    Perhaps this similar question can help: [Numpy row-wise unique elements](https://stackoverflow.com/questions/26958233/numpy-row-wise-unique-elements). Note that a numpy array cannot have rows of different lengths, so for instance you can't transform `[[1, 1], [2, 3]]` into `[[1], [2, 3]]` in numpy because `[[1], [2, 3]]` is not a valid numpy array. – Stef Dec 22 '22 at 09:08

1 Answer

1

Note this from the documentation (citing help(np.unique)):

The axis to operate on. If None, ar will be flattened. If an integer, the subarrays indexed by the given axis will be flattened and treated as the elements of a 1-D array with the dimension of the given axis […]

When an axis is specified the subarrays indexed by the axis are sorted. […] The result is that the flattened subarrays are sorted in lexicographic order starting with the first element.

So in your case it will try to sort and compare the sub-arrays arr[:, :, 0].flatten() which is [ 1, 1, 1, 7, 10, 10, 1, 1, 1] with arr[:, :, 1].flatten() which is [1, 1, 1, 1, 1, 1, 1, 1, 1].

These are not the same, so nothing is removed. The only change is that the second sub-array sorts before the first in a lexicographic comparison, so the two are swapped in the output.
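To make the comparison concrete, here is a small sketch using the array from the question. It prints the two flattened sub-arrays that `np.unique(arr, axis=2)` compares and checks that the result is just the input with its last axis reversed. The variable names `col0`, `col1`, and `result` are only illustrative.

```python
import numpy as np

arr = np.array([[[1, 1], [1, 1], [1, 1]],
                [[7, 1], [10, 1], [10, 1]],
                [[1, 1], [1, 1], [1, 1]]])

# The two "elements" that np.unique compares when axis=2
col0 = arr[:, :, 0].flatten()
col1 = arr[:, :, 1].flatten()
print(col0.tolist())  # [1, 1, 1, 7, 10, 10, 1, 1, 1]
print(col1.tolist())  # [1, 1, 1, 1, 1, 1, 1, 1, 1]

result = np.unique(arr, axis=2)

# Both sub-arrays are kept, so the shape is unchanged
print(result.shape)  # (3, 3, 2)

# col1 sorts before col0, so the last axis ends up reversed
print(np.array_equal(result, arr[:, :, ::-1]))  # True
```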

I assume what you wanted is to get rid of the duplicate [1, 1] entries. However, np.unique cannot work that way because NumPy arrays must be rectangular, unlike nested lists. That behavior would leave arr[0] with a different number of entries than arr[1], which cannot be represented as a regular array.
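If removing duplicate pairs within each block really is the goal (an assumption about the intended output, since the question does not state it), one possible sketch is to call `np.unique` on each block separately and keep the results in a plain Python list. The name `per_block` is only illustrative.

```python
import numpy as np

arr = np.array([[[1, 1], [1, 1], [1, 1]],
                [[7, 1], [10, 1], [10, 1]],
                [[1, 1], [1, 1], [1, 1]]])

# Deduplicate the rows of each 2-D block on its own.
# The results have different lengths, so they are kept in a list
# instead of being stacked back into one array.
per_block = [np.unique(block, axis=0) for block in arr]

for rows in per_block:
    print(rows.tolist())
# [[1, 1]]
# [[7, 1], [10, 1]]
# [[1, 1]]

print([rows.shape for rows in per_block])  # [(1, 2), (2, 2), (1, 2)]
```

The differing shapes show why a single call to np.unique cannot return this as one array.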

Homer512
  • This is the right answer and this is why it's so important to try and ask the question as well as you can. If you had forced yourself to explicitly write down what output array you were expecting, you could probably have seen yourself that it's not possible. – yagod Dec 22 '22 at 09:14
  • Thank you so much for the answer! I want to mark it as useful, but that requires 15 reputation. Will do it in the future! – Jiahao Li Dec 22 '22 at 16:05