Regarding the image scaling operations for running the VGG model

Question

While reading a TensorFlow implementation of the VGG model, I noticed that the author performs some scaling operations on the input RGB images, as shown below. I have two questions. First, what does VGG_MEAN mean, and how were those values obtained? Second, why do we need to subtract these mean values to get BGR?

VGG_MEAN = [103.939, 116.779, 123.68]

def build(self, rgb):
    """
    load variable from npy to build the VGG
    :param rgb: rgb image [batch, height, width, 3] values scaled [0, 1]
    """

    start_time = time.time()
    print("build model started")
    rgb_scaled = rgb * 255.0

    # Convert RGB to BGR
    red, green, blue = tf.split(3, 3, rgb_scaled)
    assert red.get_shape().as_list()[1:] == [224, 224, 1]
    assert green.get_shape().as_list()[1:] == [224, 224, 1]
    assert blue.get_shape().as_list()[1:] == [224, 224, 1]
    bgr = tf.concat(3, [
        blue - VGG_MEAN[0],
        green - VGG_MEAN[1],
        red - VGG_MEAN[2],
    ])
    assert bgr.get_shape().as_list()[1:] == [224, 224, 3]

score 2 · Answer 1 · edited May 23 '17 at 12:08

First off: the OpenCV code you'd use to convert RGB to BGR is:

from cv2 import cvtColor, COLOR_RGB2BGR
img = cvtColor(img, COLOR_RGB2BGR)

In your code, the code that does this is:

bgr = tf.concat(3, [
    blue - VGG_MEAN[0],
    green - VGG_MEAN[1],
    red - VGG_MEAN[2],
])

Images aren't [Height x Width] matrices, they're [H x W x C] cubes, where C is the color channel. In RGB to BGR, you're swapping the first and third channels.
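To make the channel swap concrete, here is a minimal NumPy sketch. The tiny 2x2 image and its pixel values are made up for illustration and are not from the original code; reversing the last axis is just one common way to swap the first and third channels.

```python
import numpy as np

# Hypothetical 2x2 RGB image with shape [H, W, C]; values are illustrative only.
rgb = np.array([[[255, 0, 0], [0, 255, 0]],
                [[0, 0, 255], [10, 20, 30]]], dtype=np.uint8)

# Reversing the channel axis swaps channels 0 and 2 (R <-> B); G stays in place.
bgr = rgb[..., ::-1]

print(rgb[1, 1], "->", bgr[1, 1])
```

The height and width axes are untouched, so the result keeps the same [H x W x C] shape.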

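Second: you don't subtract the mean to get BGR. The subtraction centers each color channel around zero by removing that channel's average value; with the VGG_MEAN values in your code, inputs end up roughly in [-124, 151] rather than [0, 255]. The RGB-to-BGR swap is a separate step that just happens inside the same tf.concat call.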
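As a rough illustration of the two steps kept separate, here is a NumPy sketch of the same preprocessing. The function name `preprocess` is hypothetical and this is not the repository's code; it assumes, as the docstring in the question states, that the input is RGB scaled to [0, 1], and that VGG_MEAN is stored in B, G, R order.

```python
import numpy as np

VGG_MEAN = [103.939, 116.779, 123.68]  # taken from the question; B, G, R order

def preprocess(rgb):
    """Hypothetical helper: rgb is a float array [H, W, 3] with values in [0, 1]."""
    scaled = rgb * 255.0                 # back to the [0, 255] range
    bgr = scaled[..., ::-1]              # step 1: swap R and B channels
    return bgr - np.array(VGG_MEAN)      # step 2: subtract the per-channel means

# A single pure-red pixel: R=1, G=0, B=0.
red_pixel = np.array([[[1.0, 0.0, 0.0]]])
out = preprocess(red_pixel)
print(out[0, 0])
```

For an all-black input every channel comes out as the negated mean, and for an all-white input the blue channel reaches 255 - 103.939, which is where the approximate bounds of the centered range come from.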

See: Subtract mean from image

I wrote a python script to get the BGR channel means over all images in a directory, which might be useful to you: https://github.com/ebigelow/save-deep/blob/master/get_mean.py
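For readers who just want the idea, a minimal sketch of computing per-channel means is below. It is not the linked script; the function name `channel_means` and the synthetic arrays are assumptions for illustration, and loading images from a directory is left out. It assumes every image is an [H, W, 3] array with the same channel order.

```python
import numpy as np

def channel_means(images):
    """Hypothetical helper: mean of each channel over all pixels of all images."""
    total = np.zeros(3)
    count = 0
    for img in images:
        # Flatten to a list of pixels so images of different sizes can be mixed.
        pixels = np.asarray(img, dtype=np.float64).reshape(-1, 3)
        total += pixels.sum(axis=0)
        count += pixels.shape[0]
    return total / count

# Two synthetic constant-color "images" of different sizes.
small = np.zeros((2, 2, 3)) + [10, 20, 30]   # 4 pixels
large = np.zeros((2, 6, 3)) + [30, 40, 50]   # 12 pixels
means = channel_means([small, large])
print(means)
```

Accumulating sums and pixel counts weights every pixel equally, so larger images contribute more than smaller ones; averaging per-image means instead would be a different design choice.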

score 0 · Answer 2 · answered Aug 02 '16 at 16:25

The mean values come from computing the average of each color channel over the training data.
The RGB -> BGR conversion is there because of OpenCV's channel ordering.

Jonny

Hi Jonny, thanks for the reply. But the original code, included in the original post, does not import opencv. – user288609 Aug 02 '16 at 16:29

score 0 · Answer 3 · answered Aug 17 '16 at 18:03

The model is ported from Caffe, which I believe relies on OpenCV functionalities and uses the OpenCV convention of BGR channels.

HSU
