Numpy array exclude some elements

Question

training_images = np.array([i for i in images if i not in validation_images])

The above is wrong (as noted in a comment below). What's a correct and faster way of doing this?

My validation_images is just

 validation_images = images[::6]

and the shape of images is (60000, 784). This is a numpy array.

The current method is not acceptable because it is too slow.

Note: [`in` for arrays makes no sense](http://stackoverflow.com/questions/18320624/how-does-contains-work-for-ndarrays), so your current code probably isn't doing what you want anyway. Also, how should this be affected by duplicates? Do you just want to drop every row whose index is a multiple of 6? — user2357112, Feb 09 '16 at 23:43
I didn't know that. But anyway, I want to do what I intended to do there. Yeah I don't care about duplicates, I just want to drop one row every 6 rows for my validation set. — ajfbiw.s, Feb 09 '16 at 23:46

MSeifert · Accepted Answer · 2016-02-10T00:15:18.480

5

I always use boolean masks for such things. You could consider:

# Mask every sixth row
mask = (np.arange(images.shape[0]) % 6) != 0

# Keep only the rows that are not masked
training_images = images[mask]

The validation set would then be every masked element:

validation_images = images[~mask]

Mathematical operations on NumPy arrays work element-wise, so the modulo (%) is applied to each element and returns another array of the same shape. The != 0 comparison is also element-wise and checks whether each remainder is nonzero. So the mask is just a boolean array containing False where the row index is a multiple of 6 and True everywhere else.
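
To make the element-wise steps concrete, here is a minimal sketch on a small array. The 12-by-3 `images` array is a hypothetical stand-in for the real (60000, 784) data, and the intermediate names `row_index` and `remainder` are introduced only for illustration:

```python
import numpy as np

# Hypothetical stand-in for the real data: 12 rows, 3 features each
images = np.arange(36).reshape(12, 3)

row_index = np.arange(images.shape[0])  # one integer per row: 0, 1, ..., 11
remainder = row_index % 6               # element-wise modulo, same shape as row_index
mask = remainder != 0                   # element-wise comparison, gives a boolean array

training_images = images[mask]          # rows whose index is not a multiple of 6
validation_images = images[~mask]       # rows whose index is a multiple of 6

print(mask)
print(training_images.shape, validation_images.shape)
```

With this setup, `images[~mask]` should select the same rows as `images[::6]` from the question, so the two ways of building the validation set agree.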

edited Feb 10 '16 at 00:15

answered Feb 09 '16 at 23:54

MSeifert


np.arange(images.shape[0]) gives you an array, why are you able to do (THIS %6) != 0? What are you doing there and why does it work? Can you explain it? – ajfbiw.s Feb 10 '16 at 00:10
I've added a short explanation at the end of the answer describing how these operations work. – MSeifert Feb 10 '16 at 00:18

score 0 · Answer 2 · edited Jun 10 '20 at 16:33


Z = np.linspace(0,1,12)[1:-1] 

#Create a vector of size 10 with values ranging from 0 to 1, both excluded

print(Z)

edited Jun 10 '20 at 16:33

Rafael Barros


answered Jun 10 '20 at 16:05

Pranjali Khandelwal

