2

While reading a TensorFlow implementation of the VGG model, I noticed that the author performs some scaling operations on the input RGB images, as shown below. I have two questions. First, what does VGG_MEAN mean, and how are those values obtained? Second, why do we need to subtract these mean values to get BGR?

VGG_MEAN = [103.939, 116.779, 123.68]

def build(self, rgb):
    """
    load variable from npy to build the VGG
    :param rgb: rgb image [batch, height, width, 3] values scaled [0, 1]
    """

    start_time = time.time()
    print("build model started")
    rgb_scaled = rgb * 255.0

    # Convert RGB to BGR
    red, green, blue = tf.split(3, 3, rgb_scaled)
    assert red.get_shape().as_list()[1:] == [224, 224, 1]
    assert green.get_shape().as_list()[1:] == [224, 224, 1]
    assert blue.get_shape().as_list()[1:] == [224, 224, 1]
    bgr = tf.concat(3, [
        blue - VGG_MEAN[0],
        green - VGG_MEAN[1],
        red - VGG_MEAN[2],
    ])
    assert bgr.get_shape().as_list()[1:] == [224, 224, 3]
stop-cran
user288609

3 Answers

2

First off, the OpenCV code you'd use to convert RGB to BGR is:

from cv2 import cvtColor, COLOR_RGB2BGR
img = cvtColor(img, COLOR_RGB2BGR)

In your code, the part that does this (together with the mean subtraction) is:

bgr = tf.concat(3, [
    blue - VGG_MEAN[0],
    green - VGG_MEAN[1],
    red - VGG_MEAN[2],
])

Images aren't [Height x Width] matrices, they're [H x W x C] cubes, where C is the color channel. In RGB to BGR, you're swapping the first and third channels.
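To make the channel swap concrete, here is a minimal NumPy sketch. The tiny 2x2 image is a made-up example, not data from the question, and reversing the last axis is just one common way to do the swap; it is not what the TensorFlow code in the question does internally.

```python
import numpy as np

# Hypothetical 2x2 image in RGB order, shape [H, W, C] with C = 3.
rgb = np.array([[[255, 0, 0], [0, 255, 0]],
                [[0, 0, 255], [10, 20, 30]]], dtype=np.uint8)

# Reversing the last (channel) axis swaps channel 0 (R) with channel 2 (B);
# the middle channel (G) stays where it is.
bgr = rgb[..., ::-1]

print(bgr.shape)
print(bgr[1, 1])
```

The height and width axes are untouched; only the order along the channel axis changes.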

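Second: you don't subtract the mean to get BGR. You subtract it to center each color channel around zero, so values will be in the range of, say, [-125, 130], rather than the range of [0, 255]. The channel reordering and the mean subtraction are two separate steps that just happen to be written in the same expression.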
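As a rough illustration of the two separate steps, here is a NumPy sketch of the same preprocessing. The function name `preprocess` and the all-white test image are assumptions made for the example; only `VGG_MEAN` and the scaling by 255 come from the question.

```python
import numpy as np

# Per-channel means in B, G, R order, copied from the question.
VGG_MEAN = [103.939, 116.779, 123.68]

def preprocess(rgb01):
    """Hypothetical helper: rgb01 is a float array [H, W, 3] in RGB order, values in [0, 1]."""
    rgb_scaled = rgb01 * 255.0           # step 0: rescale to [0, 255]
    bgr = rgb_scaled[..., ::-1]          # step 1: reorder channels RGB -> BGR
    return bgr - np.array(VGG_MEAN)      # step 2: subtract the per-channel mean

# An all-white image maps to 255 minus the mean in every channel.
white = np.ones((2, 2, 3))
out = preprocess(white)

# An all-black image maps to the negated means.
black_out = preprocess(np.zeros((2, 2, 3)))
```

After this step an "average" pixel sits near zero in every channel, which is what the network saw during training.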

See: Subtract mean from image

I wrote a python script to get the BGR channel means over all images in a directory, which might be useful to you: https://github.com/ebigelow/save-deep/blob/master/get_mean.py

Community
ejb
0
  1. The mean values come from computing the average of each color channel over the training data.
  2. The RGB -> BGR conversion is there because of the OpenCV channel-order convention.
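A minimal sketch of how such per-channel means could be computed, assuming the images are already loaded as BGR arrays. The function name `channel_means` and the two toy images are made up for illustration; this is not the procedure used to produce the original VGG values.

```python
import numpy as np

def channel_means(images):
    """Hypothetical helper: mean of each channel over all pixels of all images.

    images: iterable of [H, W, 3] arrays in BGR order; sizes may differ.
    """
    total = np.zeros(3)
    count = 0
    for img in images:
        pixels = np.asarray(img, dtype=np.float64).reshape(-1, 3)
        total += pixels.sum(axis=0)   # running per-channel sum
        count += pixels.shape[0]      # running pixel count
    return total / count

# Two toy "images" with constant color, of different sizes.
imgs = [np.zeros((2, 2, 3)) + [10, 20, 30],
        np.zeros((1, 2, 3)) + [40, 50, 60]]
means = channel_means(imgs)
```

Summing pixels and dividing once at the end weights every pixel equally, so larger images contribute more than smaller ones.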
Jonny
  • Hi Jonny, thanks for the reply. But the original code, included in the original post, does not import opencv. – user288609 Aug 02 '16 at 16:29
0

The model is ported from Caffe, which I believe relies on OpenCV functionality and therefore uses the OpenCV convention of BGR channel order.

HSU