I'm preparing data for an image segmentation model. My labels have 5 classes per pixel, but those classes do not cover the entire image, so I want to add a 'null' class as the 6th class. Right now I have a one-hot encoded ndarray and a solution that loops over every pixel in Python, which I am looking to optimize. My sketch code so far:

arrs.shape
(25, 25, 5)

null_class = np.zeros(arrs.shape[:-1])
for i in range(arrs.shape[0]):
    for j in range(arrs.shape[1]):
        if not np.any(arrs[i][j] == 1):
            null_class[i][j] = 1

Ideally, I'd like a few-line and much more performant way of computing the null class. My actual training data comes as 20K x 20K images, and I'd like to compute and store everything at once. Any advice?

Grr
Nikhil Shinday

1 Answer

I believe you can do this with a combination of numpy.where and numpy.all. Using all to check for all zeros along the last dimension will give you a boolean array that is True where the null_class should be 1. I will use a (2,2,5) array for the sake of display.

arr = np.random.randint(0, 2, size=(2,2,5))
null_class = np.zeros(arr.shape[:-1])
arr[0, 0] = [0, 0, 0, 0, 0]
arr
array([[[0, 0, 0, 0, 0],
        [1, 1, 1, 1, 1]],

       [[0, 0, 1, 0, 0],
        [0, 1, 1, 1, 0]]])
np.all(arr[:, :] == 0, axis=2)
array([[ True, False],
       [False, False]], dtype=bool)
np.where(np.all(arr[:, :] == 0, axis=2))
(array([0]), array([0]))
null_class[np.where(np.all(arr[:, :] == 0, axis=2))] = 1
null_class
array([[ 1.,  0.],
       [ 0.,  0.]])
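
The same idea can be written without np.where or a preallocated array. This is only a sketch, under the assumption that the input is strictly one-hot (values 0 or 1), so "no class set" is the same as "all zeros along the last axis". The helper name add_null_class is made up for illustration; it appends the null class as a 6th channel rather than keeping it in a separate array.

```python
import numpy as np

def add_null_class(onehot):
    """Append a 'null' channel that is 1 wherever no class is set.

    Hypothetical helper; assumes `onehot` holds only 0s and 1s
    and that the class axis is the last one.
    """
    # any(axis=-1) is True where at least one class channel is non-zero;
    # inverting it marks the pixels that belong to no class.
    null_mask = ~onehot.any(axis=-1)
    # Match the input dtype and add a trailing axis so the shapes line up.
    null_channel = null_mask.astype(onehot.dtype)[..., np.newaxis]
    return np.concatenate([onehot, null_channel], axis=-1)

# Small example: two labelled pixels, two unlabelled ones.
arr = np.zeros((2, 2, 5), dtype=np.uint8)
arr[0, 1, 2] = 1
arr[1, 0, 0] = 1

out = add_null_class(arr)
print(out.shape)     # (2, 2, 6)
print(out[..., -1])  # the null channel
```

For 20K x 20K images the dtype matters: at one byte per value (uint8 or bool), each channel takes about 400 MB, while the float64 default of np.zeros takes eight times that. Note that np.concatenate allocates a new array, so both the 5-channel input and the 6-channel result are in memory at the same time.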
Grr
  • This is great and I learned that np.all and np.where have axis parameters from your code, but I actually found a clearer, albeit less memory-efficient, solution. – Nikhil Shinday Jan 18 '18 at 18:41