I am trying to delete every row that has one or fewer non-zero elements from each of several 2D arrays stored in the list 'a'.

This method works when I run it outside the 'i' loop, but not when I run the code as a whole. I know that I cannot delete rows from something I am iterating over, but I believe I am not doing so here, because I am only deleting rows from the arrays contained in a, not the arrays themselves.

for i in range(len(a)):
    del_idx = []
    for j in range(len(a[i])):
        nonzero = np.nonzero(a[i][j])
        nonzero_len = len(nonzero[0])  # because np.nonzero outputs a tuple
        if nonzero_len <= 1:
            del_idx.append(j)
    np.delete(a[i], del_idx, axis=0)

Does anyone know what's going on here? If this really does not work, how can I delete these rows without using a loop? I am using Python 2.7.

Thank you!

Alexis BL

2 Answers

You should aim to avoid for loops with NumPy when vectorised operations are available. Here, for example, you can use Boolean indexing:

import numpy as np

np.random.seed(0)

A = np.random.randint(0, 2, (10, 3))

res = A[(A != 0).sum(1) > 1]

array([[0, 1, 1],
       [0, 1, 1],
       [1, 1, 1],
       [1, 1, 0],
       [1, 1, 0],
       [0, 1, 1],
       [1, 1, 0]])

The same logic can be applied for each array within your list of arrays.
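
As a minimal sketch of that, assuming `a` is a Python list of 2D NumPy arrays as in the question (the sample values below are made up for illustration), a list comprehension applies the mask to each array. It is also worth noting that `np.delete` returns a new array rather than modifying its input, so the loop in the question discards the result unless it is stored somewhere; the second half of the sketch shows that variant.

```python
import numpy as np

# Hypothetical sample data: a list of 2D arrays with different shapes.
a = [
    np.array([[0, 0, 3],
              [1, 2, 0],
              [0, 0, 0]]),
    np.array([[5, 0],
              [4, 7],
              [0, 0],
              [1, 1]]),
]

# Keep only the rows that have more than one non-zero element.
filtered = [arr[np.count_nonzero(arr, axis=1) > 1] for arr in a]

# Loop-based variant: np.delete returns a copy, so its result must be kept.
looped = []
for arr in a:
    del_idx = [j for j, row in enumerate(arr) if np.count_nonzero(row) <= 1]
    looped.append(np.delete(arr, del_idx, axis=0))

for kept in filtered:
    print(kept)
```

Both variants should give the same arrays; the masked version avoids the Python-level loop over rows.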

jpp
  • Works great, thanks. How would you do the same thing across the '0' axis? Changing .sum(1) to .sum(0) raises "boolean index did not match indexed array along dimension 0; dimension is 10 but corresponding boolean dimension is 3" – Alexis BL Nov 02 '18 at 02:37
  • @AlexisBL, `A[:, (A != 0).sum(0) > 1]` – jpp Nov 02 '18 at 02:38
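
A small sketch of the column case from the comments above, using a made-up 3x4 array: summing over axis 0 gives one flag per column, so the mask has to index the second axis. Using it as `A[mask]` would try to select rows, which is where the dimension-mismatch error comes from.

```python
import numpy as np

# Hypothetical example; columns 1 and 3 each have a single non-zero entry.
A = np.array([[1, 0, 2, 0],
              [3, 0, 0, 4],
              [5, 6, 7, 0]])

col_mask = (A != 0).sum(0) > 1   # shape (4,): one flag per column
res_cols = A[:, col_mask]        # apply the mask along axis 1

print(col_mask)
print(res_cols)
```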

You can use np.where() for indexing:

import numpy as np

a = np.random.randint(0, 2, size=(10, 10))
# array([[1, 1, 0, 0, 0, 0, 0, 1, 1, 1],
#        [1, 0, 0, 0, 1, 1, 1, 1, 0, 1],
#        [1, 0, 1, 0, 0, 1, 0, 0, 0, 1],
#        [1, 0, 0, 1, 0, 1, 0, 1, 1, 0],
#        [1, 0, 0, 0, 1, 0, 1, 1, 0, 1],
#        [0, 0, 1, 1, 1, 0, 1, 0, 0, 0],
#        [1, 0, 0, 1, 1, 0, 0, 1, 1, 0],
#        [0, 0, 0, 1, 0, 1, 0, 1, 1, 1],
#        [0, 0, 1, 1, 0, 0, 1, 0, 1, 0],
#        [1, 1, 0, 0, 0, 1, 0, 0, 1, 1]])

np.where(np.count_nonzero(a, axis=1) < 5)     # < 5 is for illustration; in your case, use > 1
# (array([2, 5, 8]),)

a[np.where(np.count_nonzero(a, axis=1) < 5)]  # Returns the rows that satisfy the condition
# array([[1, 0, 1, 0, 0, 1, 0, 0, 0, 1],
#        [0, 0, 1, 1, 1, 0, 1, 0, 0, 0],
#        [0, 0, 1, 1, 0, 0, 1, 0, 1, 0]])
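
For the condition in the question (keep rows with more than one non-zero element), the same pattern might look like the sketch below. The sample array is hypothetical. It also shows that indexing with the boolean mask directly gives the same rows, so the `np.where` call is optional here.

```python
import numpy as np

# Hypothetical sample array.
a = np.array([[0, 0, 0],
              [0, 2, 0],
              [1, 0, 3],
              [4, 5, 6]])

keep = np.count_nonzero(a, axis=1) > 1   # boolean mask, one entry per row
idx = np.where(keep)                     # tuple holding the kept row indices

via_where = a[idx]    # index with the row indices
via_mask = a[keep]    # index with the boolean mask directly

print(idx)
print(via_where)
```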
Kevin Fang