I'm quite confused about the normalization process when using the Object Detection API.

I'm using the SSD MobileNet v2 320x320 from the model zoo. In the pipeline config used for training I don't specify any additional preprocessing steps beyond what is already defined by default.

Inference from the checkpoint files works fine. However, TFLite inference only seems to work if I normalize the image before feeding it to the net. I use the following line for this:

image = (image-127.5)/127.5
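
For context, here is roughly how I run the TFLite inference (a minimal sketch; the model path and the dummy input are illustrative, and the output tensor order depends on the converted model):

import numpy as np
import tensorflow as tf

# Load the converted model (path is illustrative).
interpreter = tf.lite.Interpreter(model_path="model.tflite")
interpreter.allocate_tensors()
input_details = interpreter.get_input_details()
output_details = interpreter.get_output_details()

# Dummy 320x320 RGB image; in practice this comes from a real image.
image = np.random.randint(0, 256, size=(1, 320, 320, 3)).astype(np.float32)

# Without this normalization the detections are garbage.
image = (image - 127.5) / 127.5

interpreter.set_tensor(input_details[0]["index"], image)
interpreter.invoke()
boxes = interpreter.get_tensor(output_details[0]["index"])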

But I don't understand why this normalization helps when I didn't use it during training. The documentation also says that normalization only needs to be added at inference time if it was used during training.

What am I missing? Is there any preprocessing done during training by default that is not defined in the pipeline? If so, I couldn't find it.

1 Answer

I found it!

In the pipeline config a feature extractor is defined:

feature_extractor {
  type: "ssd_mobilenet_v2_keras"
  depth_multiplier: 1.0
  ...
}

This type selects the corresponding feature extractor class in the Object Detection API, which defines the following preprocessing function:

def preprocess(self, resized_inputs):
  """SSD preprocessing.
  Maps pixel values to the range [-1, 1].
  Args:
    resized_inputs: a [batch, height, width, channels] float tensor
      representing a batch of images.
  Returns:
    preprocessed_inputs: a [batch, height, width, channels] float tensor
    representing a batch of images.
  """
  return (2.0 / 255.0) * resized_inputs - 1.0

If you do the math you'll see that this is exactly the same as

image = (image-127.5)/127.5

just written differently: (2.0/255.0)*x - 1.0 = (2*x - 255)/255 = (x - 127.5)/127.5. So the normalization is applied during training after all; it just happens inside the feature extractor rather than being listed in the pipeline config.
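
A quick sanity check (a minimal NumPy sketch) confirms that both formulas agree on every possible 8-bit pixel value:

import numpy as np

x = np.arange(0, 256, dtype=np.float32)   # all possible 8-bit pixel values
a = (2.0 / 255.0) * x - 1.0               # the feature extractor's preprocess
b = (x - 127.5) / 127.5                   # the manual normalization
print(np.allclose(a, b))                  # prints True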
