I have more than a million images that I would like to use as training data. How do I make this data freely available without compromising security?
I want users to be able to use it quickly for training, without giving hackers a chance to rebuild the images from the open-source data. At the same time, I do not want training quality to be affected in any way.
In other words, how do I safely open-source images?
For example, this code generates a NumPy array. I want to make it very difficult to reconstruct the original image from the ndarray "x".
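from keras.preprocessing.image import ImageDataGenerator, array_to_img, img_to_array, load_img
# Load the image file as a PIL image
i = load_img('some_image.jpg')
# Convert the PIL image to a float NumPy array of pixel values
x = img_to_array(i)
# Add a leading batch axis, so the shape becomes (1,) + original shape
x = x.reshape((1,) + x.shape)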
I can share the array x once I know that hackers cannot use the data to recreate the same image.
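To illustrate the concern, here is a minimal sketch of why the array in its current form is not safe to share. It uses only NumPy, and the small random array `original` is a hypothetical stand-in for a loaded image (it is not my real data). The float conversion and the reshape are both lossless, so anyone holding x gets the exact pixels back.

```python
import numpy as np

# Hypothetical stand-in for a loaded image: 4x4 RGB with uint8 pixel values.
rng = np.random.default_rng(0)
original = rng.integers(0, 256, size=(4, 4, 3), dtype=np.uint8)

# Same kind of transformation as in my snippet: float array plus a batch axis.
x = original.astype("float32")
x = x.reshape((1,) + x.shape)

# Undoing it is trivial: drop the batch axis and cast back to uint8.
recovered = x[0].astype(np.uint8)

# True means every pixel was recovered exactly.
print(np.array_equal(recovered, original))
```

So whatever I apply to x before publishing has to be something that cannot be inverted this easily, while still leaving the data useful for training.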