
Before training with a resnet50 model, I preprocessed my input using:

import os
import numpy as np
from keras.preprocessing import image
from keras.applications.resnet50 import preprocess_input

img = image.load_img(os.path.join(TRAIN, img), target_size=[224, 224])  # PIL image, resized
img = image.img_to_array(img)      # NumPy array, shape (224, 224, 3)
img = np.expand_dims(img, axis=0)  # add batch dimension -> (1, 224, 224, 3)
img = preprocess_input(img)        # RGB->BGR flip + mean-pixel subtraction

and saved a NumPy array of images. I found that without preprocess_input the array is 1.5 GB; with preprocess_input it is 7 GB. Is that normal behavior, or am I missing something? Why does zero-centering by the mean pixel drastically increase the input size?

This is how zero-centering by the mean pixel is defined in Keras:

x = x[..., ::-1]
x[..., 0] -= 103.939
x[..., 1] -= 116.779
x[..., 2] -= 123.68

ajp55
  • How is `preprocess_input` defined? What do you mean by `Zero-center by mean pixel`? – rvinas May 07 '18 at 18:13
  • If you refer to keras implementation of ```preprocess_input``` you can see these lines of code ```x[..., 0] -= 103.939 x[..., 1] -= 116.779 x[..., 2] -= 123.68```. – ajp55 May 10 '18 at 13:17
  • I finally decided to use a generator. so i don't need to worry again about the size of the data. only a chunk of data was preprocessed – ajp55 May 17 '18 at 06:26
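The generator approach mentioned in the last comment can be sketched in plain NumPy (a minimal sketch; `preprocess_chunk` and `batch_generator` are hypothetical helpers, not Keras API — the same effect is available in Keras via `ImageDataGenerator(preprocessing_function=preprocess_input)`):

```python
import numpy as np

def preprocess_chunk(chunk):
    # Equivalent of Keras's 'caffe'-mode preprocessing, applied to one chunk:
    # cast to float32, flip RGB -> BGR, subtract the ImageNet mean pixel.
    x = chunk.astype(np.float32)[..., ::-1]
    x -= np.array([103.939, 116.779, 123.68], dtype=np.float32)
    return x

def batch_generator(images, batch_size=32):
    # Yield preprocessed batches lazily: only batch_size images
    # exist as a float array at any one time.
    for i in range(0, len(images), batch_size):
        yield preprocess_chunk(images[i:i + batch_size])

# The full dataset stays as compact uint8; floats are created per batch.
data = np.zeros((100, 224, 224, 3), dtype=np.uint8)
first = next(batch_generator(data))  # shape (32, 224, 224, 3), dtype float32
```

This sidesteps the size question entirely, since the large float array is never materialized all at once.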

3 Answers


It is because the pixel values were of type uint8 (1 byte per value), and after preprocessing they are floats (4 or 8 bytes per value). So you now have an image stored as a float array, which is several times larger than the same image stored as a uint8 array.
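The factor can be checked with plain NumPy (a sketch with a made-up batch size; float32 gives a 4x increase, which would turn a 1.5 GB uint8 array into about 6 GB, roughly matching the 7 GB observed):

```python
import numpy as np

# A batch of fake images, stored the way load_img delivers pixels: uint8.
batch = np.random.randint(0, 256, size=(100, 224, 224, 3), dtype=np.uint8)
print(batch.nbytes)     # 100*224*224*3 = 15,052,800 bytes (1 byte/value)

# Subtracting a float mean forces a floating-point copy:
# 4x the size as float32, 8x as float64.
centered = batch.astype(np.float32) - np.float32(103.939)
print(centered.nbytes)  # 60,211,200 bytes -- exactly 4x larger
```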

Rani

Reading the Keras implementation of preprocess_input: the images are normalized by subtracting the dataset's mean pixel, using constants obtained from ImageNet. Here is the code:

def _preprocess_numpy_input(x, data_format, mode):
    if mode == 'tf':
        x /= 127.5
        x -= 1.
        return x

    if data_format == 'channels_first':
        if x.ndim == 3:
            # 'RGB'->'BGR'
            x = x[::-1, ...]
            # Zero-center by mean pixel
            x[0, :, :] -= 103.939
            x[1, :, :] -= 116.779
            x[2, :, :] -= 123.68
        else:
            x = x[:, ::-1, ...]
            x[:, 0, :, :] -= 103.939
            x[:, 1, :, :] -= 116.779
            x[:, 2, :, :] -= 123.68
    else:
        # 'RGB'->'BGR'
        x = x[..., ::-1]
        # Zero-center by mean pixel
        x[..., 0] -= 103.939
        x[..., 1] -= 116.779
        x[..., 2] -= 123.68
    return x
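To see what the channels_last branch of that code does, here is a small check in plain NumPy (a sketch with arbitrary values; the constants are the ImageNet per-channel means in BGR order, so an image whose channels equal those means comes out as all zeros):

```python
import numpy as np

# One 2x2 RGB "image" in channels_last layout, already a float array.
x = np.zeros((1, 2, 2, 3), dtype=np.float32)
x[..., 0] = 123.68    # R channel set to the ImageNet R mean
x[..., 1] = 116.779   # G channel set to the G mean
x[..., 2] = 103.939   # B channel set to the B mean

# Same operations as the 'else' branch above:
# 'RGB'->'BGR', then zero-center by mean pixel.
x = x[..., ::-1]
x[..., 0] -= 103.939
x[..., 1] -= 116.779
x[..., 2] -= 123.68

print(np.allclose(x, 0))  # True: the mean pixel maps to zero
```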

I still don't understand why using this piece of code increased the size of my dataset.

ajp55

According to the TensorFlow documentation, the argument is "A floating point numpy.array or a tf.Tensor, 3D or 4D with 3 color channels, with values in the range [0, 255]", and the function "Returns: Preprocessed numpy.array or a tf.Tensor with type float32."

I have a feeling that integers use a different amount of memory than floats.
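That feeling is right, and it is easy to confirm with NumPy's dtype sizes:

```python
import numpy as np

print(np.dtype(np.uint8).itemsize)    # 1 byte per pixel value
print(np.dtype(np.float32).itemsize)  # 4 bytes per value
print(np.dtype(np.float64).itemsize)  # 8 bytes per value
```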