
Hello, I have the following data:

import numpy as np

ids = np.concatenate([1.0 * np.ones(shape=(4, 9)),
                      2.0 * np.ones(shape=(4, 3))], axis=1)

logits = np.random.normal(size=(4, 9 + 3, 256))

Now I want to select only the entries of logits whose id is 1.0, and I want the result to be an array of shape (4, 9, 256).

I tried logits[ids == 1.0, :], but I get shape (36, 256). How can I slice without merging the first two dimensions?

The dimensions above are only an example; I am looking for a generic solution.

Night Walker
  • What you are trying to do is not possible as written. But you can fill the positions where the condition is not satisfied with some other value, chosen so that it doesn't affect your further processing. See https://stackoverflow.com/questions/29046162/numpy-array-loss-of-dimension-when-masking – ggaurav Jan 17 '21 at 16:14
  • In your example, you are only masking the 2nd dimension, so you could reduce the mask to a 1d array: `logits[:,(ids==1.0).all(axis=0),:]`. It's up to you to supply the added information about the distribution of `True`, whether that means a `reshape` afterwards or modifying the mask beforehand. – hpaulj Jan 17 '21 at 17:16
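The two suggestions in the comments above can be sketched roughly as follows. This is only an illustration using the arrays from the question; the names `col_mask`, `selected`, and `filled` are made up here, and NaN as the fill value is an assumption (any value that your later processing ignores would do).

```python
import numpy as np

ids = np.concatenate([1.0 * np.ones(shape=(4, 9)),
                      2.0 * np.ones(shape=(4, 3))], axis=1)
logits = np.random.normal(size=(4, 9 + 3, 256))

# Option A: the same columns are selected in every row, so the 2-D mask
# can be reduced to a 1-D mask over axis 1. Indexing a single axis with a
# 1-D boolean mask keeps the other axes intact.
col_mask = (ids == 1.0).all(axis=0)      # shape (12,)
selected = logits[:, col_mask, :]
print(selected.shape)                    # (4, 9, 256)

# Option B: keep the full shape and overwrite the unwanted positions.
# NaN is an assumed placeholder, not something the question prescribes.
filled = np.where((ids == 1.0)[..., None], logits, np.nan)
print(filled.shape)                      # (4, 12, 256)
```

Option A only applies when the mask is identical across rows; Option B works for any mask but does not shrink the array.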

1 Answer


Your question appears to assume that each row has the same number of matching entries; in that case, you can solve the problem generically like this:

mask = (ids == 1)
num_per_row = mask.sum(1)

# same number of entries per row is required
assert np.all(num_per_row == num_per_row[0])  

result = logits[mask].reshape(logits.shape[0], num_per_row[0], logits.shape[2])

print(result.shape)
# (4, 9, 256)
jakevdp
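For reuse, the mask-then-reshape idea from the answer could be wrapped in a small helper. This is a sketch under assumptions, not a definitive implementation: the function name `select_by_id` and its parameters are hypothetical, the input is assumed to have at least one row, and it raises an error instead of asserting when rows have different numbers of matches.

```python
import numpy as np

def select_by_id(values, ids, target):
    """Hypothetical helper: keep values[i, j, ...] where ids[i, j] == target.

    Assumes every row of `ids` contains the same number of matches,
    so the result can be reshaped to (rows, matches_per_row, ...).
    """
    mask = (ids == target)
    counts = mask.sum(axis=1)
    if not np.all(counts == counts[0]):
        raise ValueError("rows have different numbers of matching entries")
    n = int(counts[0])
    # Boolean indexing flattens the first two axes in row-major order,
    # so reshaping restores the (row, match) structure.
    return values[mask].reshape(values.shape[0], n, *values.shape[2:])

# The matching positions may differ from row to row, as long as the count is equal.
ids = np.array([[1, 2, 1],
                [2, 1, 1]])
values = np.arange(2 * 3 * 4).reshape(2, 3, 4)
result = select_by_id(values, ids, 1)
print(result.shape)  # (2, 2, 4)
```

Unlike a 1d column mask, this does not require the matches to sit in the same columns in every row, only that each row has equally many of them.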