I would like to correctly preprocess images before feeding them into the VGG16 model.

In their original paper the authors write:

During training, the input to our ConvNets is a fixed-size 224 × 224 RGB image. The only preprocessing we do is subtracting the mean RGB value, computed on the training set, from each pixel.

The resizing part is easily done:

import cv2
import numpy as np


# Read the image in color mode (flag 1 is cv2.IMREAD_COLOR).
# Note: OpenCV returns the channels in BGR order, not RGB.
image = cv2.imread(PATH_TO_IMAGE, 1)

# Resize Image to original VGG16 input size
# from the paper: "During training, the input to our ConvNets 
# is a fixed-size 224 × 224 RGB image"

width = 224
height = 224
dim = (width, height)

# resize image
resized_image = cv2.resize(image, dim, interpolation = cv2.INTER_AREA)

... but I am not sure about subtracting the mean RGB value:

meanRBB_substract_image = resized_image - np.mean(resized_image)

Is that the correct way to do it?

[Images: the resized image before and after mean RGB subtraction]

More about the VGG16 model: https://neurohive.io/en/popular-networks/vgg16/#:~:text=The%20architecture%20depicted%20below%20is%20VGG16.&text=The%20input%20to%20cov1%20layer,stack%20of%20convolutional%20(conv.)

EDIT: I just realized that they write "computed on the training set". Does this mean that I need to (1) find the mean RGB value over all pictures in my training set, and then (2) subtract this mean from every training set image?

henry

1 Answer

Try:

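from keras.applications.vgg16 import preprocess_input
...
resized_image = cv2.resize(image, dim, interpolation = cv2.INTER_AREA)
# cv2.imread returns BGR, but preprocess_input expects RGB input
rgb_image = cv2.cvtColor(resized_image, cv2.COLOR_BGR2RGB)
# Converts RGB to BGR and subtracts the per-channel ImageNet mean
processedimage = preprocess_input(rgb_image.astype("float32"))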

From: https://www.pyimagesearch.com/2016/08/10/imagenet-classification-with-python-and-keras/
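Regarding the EDIT in the question: as far as I can tell, "computed on the training set" means one mean value per color channel, taken over every pixel of every training image, and that same triple is then subtracted from each image. Note that `np.mean(resized_image)` without an `axis` argument returns a single scalar over all channels of one image, which is not the same thing. Below is a minimal NumPy sketch of the per-channel version. The helper names and the random stand-in data are hypothetical; the constants are the ImageNet means (in BGR order) that Keras' VGG16 `preprocess_input` subtracts, which are likely the values to use with the pretrained ImageNet weights rather than means from your own data.

```python
import numpy as np

# Per-channel ImageNet means used by Keras' VGG16 preprocess_input,
# listed in BGR order. Use these with the pretrained ImageNet weights.
IMAGENET_MEAN_BGR = np.array([103.939, 116.779, 123.68], dtype=np.float32)


def training_set_channel_mean(images):
    """Return one mean per color channel over a whole training set.

    Assumes `images` has shape (N, H, W, 3). (Hypothetical helper.)
    """
    images = np.asarray(images, dtype=np.float64)
    # Average over the image, height and width axes; keep the channel axis.
    return images.mean(axis=(0, 1, 2))


def subtract_channel_mean(image, channel_mean):
    """Subtract one mean per channel from every pixel via broadcasting."""
    return image.astype(np.float32) - np.asarray(channel_mean, dtype=np.float32)


# Hypothetical stand-in for a training set: 4 random 224x224 color images.
rng = np.random.default_rng(0)
train_images = rng.integers(0, 256, size=(4, 224, 224, 3), dtype=np.uint8)

# Step 1: compute the mean once, over the whole training set.
mean_per_channel = training_set_channel_mean(train_images)

# Step 2: subtract that same mean from each image (training and test alike).
centered = subtract_channel_mean(train_images[0], mean_per_channel)
```

If you train from scratch on your own data, `mean_per_channel` is the quantity to subtract; the channel order of the mean has to match the channel order of the images it is subtracted from.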

Stackaccount1