6
training_images = np.array([i for i in images if i not in validation_images])

The above is wrong (as noted in a comment below). What's a correct and faster way of doing this?

My validation_images is just

 validation_images = images[::6]

and the shape of images is (60000, 784). This is a numpy array.

The current method is not acceptable because it is too slow.

ajfbiw.s
  • Note: [`in` for arrays makes no sense](http://stackoverflow.com/questions/18320624/how-does-contains-work-for-ndarrays), so your current code probably isn't doing what you want anyway. Also, how should this be affected by duplicates? Do you just want to drop every row whose index is a multiple of 6? – user2357112 Feb 09 '16 at 23:43
  • I didn't know that. But anyway, I want to do what I intended to do there. Yeah I don't care about duplicates, I just want to drop one row every 6 rows for my validation set. – ajfbiw.s Feb 09 '16 at 23:46

2 Answers

5

I always use boolean masks for such things. You could consider:

import numpy as np

# Boolean mask: False for every sixth row (index 0, 6, 12, ...), True otherwise
mask = (np.arange(images.shape[0]) % 6) != 0

# Keep only the rows where the mask is True
training_images = images[mask]

The validation set is then every row where the mask is False:

validation_images = images[~mask]

Arithmetic operations on numpy arrays work element-wise, so the modulo (%) is applied to each element and returns another array of the same shape. The != 0 comparison also works element-wise and checks whether each remainder is nonzero. So the mask is just a boolean array containing False where the row index is a multiple of 6 and True everywhere else.
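To see the element-wise behaviour concretely, here is a minimal sketch on a small stand-in array. The 12x3 `images` array and the `row_idx` name are assumptions for illustration only; the real array in the question has shape (60000, 784).

```python
import numpy as np

# Hypothetical stand-in for the real (60000, 784) array: 12 rows, 3 columns.
images = np.arange(36).reshape(12, 3)

# Row indices 0..11; % and != are both applied element-wise.
row_idx = np.arange(images.shape[0])
mask = (row_idx % 6) != 0  # False at rows 0 and 6, True elsewhere

training_images = images[mask]     # rows whose index is not a multiple of 6
validation_images = images[~mask]  # rows 0 and 6

print(mask)
print(training_images.shape, validation_images.shape)
```

On this toy input, `validation_images` should match `images[::6]` from the question, and the two subsets together cover every row exactly once.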

MSeifert
  • np.arange(images.shape[0]) gives you an array, why are you able to do (THIS %6) != 0? What are you doing there and why does it work? Can you explain it? – ajfbiw.s Feb 10 '16 at 00:10
  • I've edited a small text at the end explaining the context of the operations. – MSeifert Feb 10 '16 at 00:18
0
import numpy as np

# Create a vector of size 10 with values ranging from 0 to 1, both excluded
Z = np.linspace(0, 1, 12)[1:-1]
print(Z)
Rafael Barros