
I am using Keras (version 2.0.0) and I'd like to make use of pretrained models such as VGG16. To get started, I ran the example from the [Keras documentation site](https://keras.io/applications/) for extracting features with VGG16:

from keras.applications.vgg16 import VGG16
from keras.preprocessing import image
from keras.applications.vgg16 import preprocess_input
import numpy as np

model = VGG16(weights='imagenet', include_top=False)

img_path = 'elephant.jpg'
img = image.load_img(img_path, target_size=(224, 224))
x = image.img_to_array(img)
x = np.expand_dims(x, axis=0)
x = preprocess_input(x)

features = model.predict(x)

The preprocess_input() function bothers me (looking at the source code shows that it zero-centers the input by the mean pixel).
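For reference, here is a minimal NumPy sketch of what that zero-centering step amounts to for VGG16 (this mirrors the "caffe"-style preprocessing; the mean values are the ImageNet per-channel means used by the original VGG setup, and the helper name `vgg_preprocess` is mine, not Keras'):

```python
import numpy as np

# Per-channel ImageNet mean pixel, in BGR order (original VGG convention)
IMAGENET_MEAN_BGR = np.array([103.939, 116.779, 123.68])

def vgg_preprocess(x):
    """x: float array of shape (batch, height, width, 3), RGB, values in [0, 255]."""
    x = x[..., ::-1]               # RGB -> BGR channel flip
    return x - IMAGENET_MEAN_BGR   # zero-center by the mean pixel

batch = np.random.uniform(0, 255, size=(1, 224, 224, 3))
out = vgg_preprocess(batch)
print(out.shape)  # (1, 224, 224, 3)
```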

Do I really have to preprocess input data (validation/test data) before using a trained model?

a) If yes, can one conclude that you always have to know which preprocessing steps were performed during the training phase?

b) If no: Does preprocessing of validation/test data cause a bias?

I appreciate your help.

D.Laupheimer

1 Answer


Yes, you should use the preprocessing step. You could retrain the model without it, but then the first layers would have to learn to center your data themselves, which is a waste of parameters.

If you do not recenter your input, performance will suffer.
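As a toy illustration (a made-up linear layer, not VGG itself) of why the distributions must match: feeding raw [0, 255] pixels into weights fitted on zero-centered data offsets every pre-activation by a constant that the learned biases never compensated for.

```python
import numpy as np

rng = np.random.default_rng(0)
w = rng.normal(0.0, 0.01, size=(3, 1))        # weights fitted on centered inputs

centered = rng.uniform(-128.0, 127.0, size=(100, 3))
raw = centered + 128.0                        # the same data, not zero-centered

# Every output is shifted by the same constant, 128 * sum(w),
# which the trained biases never saw during training.
shift = (raw @ w) - (centered @ w)
print(shift[0], 128.0 * w.sum())
```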

Great thread on Reddit: https://www.reddit.com/r/MachineLearning/comments/3q7pjc/why_is_removing_the_mean_pixel_value_from_each/

Dref360
  • Thanks for your answer. For clarity: I have to use THE SAME preprocessing step(s) for validation/test data that I used for training data (for example: mean subtraction on the training data --> use the same mean value for mean subtraction on validation/test data)? And if the pretrained model was trained on images with values in [0,255], then I have to scale my images to that range as well, right? – D.Laupheimer Mar 17 '17 at 07:34
  • I'm not sure if you got me right. I don't want to retrain the model. I want to fine-tune it. Meaning: The first layers will be non-trainable, only the last ones are trainable. So, what do you mean by "the first layers will learn to center your datas", as the first layers are non-trainable anyways? – D.Laupheimer Mar 17 '17 at 07:37
  • 1
    Yes. If you want to use an existing network you should use the exact same preprocessing that was used during training. Otherwise the performance will likely suffer since the distribution of your data will then differ from the distribution of the data used for training. – sietschie Mar 17 '17 at 13:55
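The point in this last comment can be sketched as follows (random arrays stand in for real image batches): the normalization statistics are computed on the training data only and then reused, unchanged, on validation/test data.

```python
import numpy as np

rng = np.random.default_rng(0)
train = rng.uniform(0, 255, size=(100, 8, 8, 3))   # stand-in training images
test = rng.uniform(0, 255, size=(10, 8, 8, 3))     # stand-in test images

# Per-channel mean computed from the TRAINING data only
train_mean = train.mean(axis=(0, 1, 2))

train_centered = train - train_mean
test_centered = test - train_mean   # reuse the training mean, NOT test.mean()

print(train_centered.mean(axis=(0, 1, 2)))  # ~0 by construction
```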