I would like to correctly pre-process images to input them into the VGG16 model
In their original paper the authors write:
During training, the input to our ConvNets is a fixed-size 224 × 224 RGB image. The only preprocessing we do is subtracting the mean RGB value, computed on the training set, from each pixel.
The resizing part is easily done:
import cv2
import numpy as np
# Note: cv2.imread loads images in BGR order, so convert to RGB
image = cv2.cvtColor(cv2.imread(PATH_TO_IMAGE, 1), cv2.COLOR_BGR2RGB)
# Resize Image to original VGG16 input size
# from the paper: "During training, the input to our ConvNets
# is a fixed-size 224 × 224 RGB image"
width = 224
height = 224
dim = (width, height)
# resize image
resized_image = cv2.resize(image, dim, interpolation=cv2.INTER_AREA)
... but I am not so sure about subtracting the mean RGB value:
meanRGB_subtract_image = resized_image - np.mean(resized_image)
Is that the correct way to do it?
Before mean RGB subtraction:
After mean RGB subtraction:
More about the VGG16 model: https://neurohive.io/en/popular-networks/vgg16/
EDIT: I just realized that they write "computed on the training set". Does this mean that I need to 1. compute the mean RGB value (one value per channel) over all images in my training set, and then 2. subtract that same mean from every training-set image?
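If that reading is right, a minimal sketch of the two-step procedure might look like the following (the `train_images` list is a stand-in for your actual resized training data, not part of the paper):

```python
import numpy as np

# Stand-in for the real training set: a list of images already
# resized to (224, 224, 3) and in RGB channel order.
train_images = [
    np.random.randint(0, 256, (224, 224, 3)).astype(np.float64)
    for _ in range(10)
]

# Step 1: per-channel mean over the whole training set -- average over
# every pixel of every image, separately for R, G and B.
mean_rgb = np.mean(np.stack(train_images), axis=(0, 1, 2))  # shape (3,)

# Step 2: subtract that same mean from each image; the (3,) vector
# broadcasts across the height and width dimensions.
centered = [img - mean_rgb for img in train_images]
```

Note that `mean_rgb` is three numbers (one per channel), not a single scalar as in `np.mean(resized_image)`, and it is computed once over the training set, then reused for every image (including validation/test images).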