
Here are three kinds of pre-processing methods for converting an image from uint8 to float32. Which one is recommended for a conv2d -> batch norm -> ReLU structure (e.g., for robustness and for avoiding the dying ReLU problem), or is there another suggestion?

  1. As mentioned in cs231n, zero-center and normalize the image using the mean and std computed on the training set. I think this method costs a lot when the training set is huge.

  2. Like the code in the TensorFlow models repository,

    image /= 255
    image -= 0.5
    image *= 2.0

  3. Simply divide the image by 255 (see the combined sketch after this list).
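For reference, a minimal sketch of all three options in NumPy; the image and the mean/std values are placeholders, not from the original post:

    import numpy as np

    # Placeholder uint8 image (H, W, C)
    image = np.random.randint(0, 256, size=(32, 32, 3), dtype=np.uint8)

    # 1. Zero-center and normalize with per-channel statistics computed
    #    once over the training set (illustrative values here)
    train_mean = np.array([125.3, 123.0, 113.9], dtype=np.float32)
    train_std = np.array([63.0, 62.1, 66.7], dtype=np.float32)
    x1 = (image.astype(np.float32) - train_mean) / train_std

    # 2. Scale to [-1, 1], as in the TensorFlow models code
    x2 = image.astype(np.float32)
    x2 /= 255.0
    x2 -= 0.5
    x2 *= 2.0

    # 3. Scale to [0, 1]
    x3 = image.astype(np.float32) / 255.0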

1 Answer


Preprocessing comes in different flavors, and it usually depends on the framework you use. For example, PyTorch normalizes to [0, 1], TensorFlow normalizes to [-1, 1], and Keras leaves the range as [0, 255]. I referenced this from the Keras preprocessing code. In my experience, normalizing does not make any difference for images, so just stick with the one used in your framework. However, if you have other data, like time series of measurements, normalization can make the difference for successful training.
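To illustrate those defaults, here is a rough sketch, assuming torchvision and Pillow are installed; ToTensor is a real torchvision transform, while the TensorFlow and Keras lines simply mirror the ranges described above rather than a specific API:

    import numpy as np
    from PIL import Image
    from torchvision import transforms

    pil_img = Image.fromarray(
        np.random.randint(0, 256, size=(64, 64, 3), dtype=np.uint8))

    # PyTorch: ToTensor converts a uint8 PIL image to a float tensor in [0, 1]
    x_pt = transforms.ToTensor()(pil_img)                    # shape (3, 64, 64)

    # TensorFlow-style scaling to [-1, 1]
    x_tf = np.asarray(pil_img, dtype=np.float32) / 127.5 - 1.0

    # Keras default: leave the raw [0, 255] values unchanged
    x_keras = np.asarray(pil_img, dtype=np.float32)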

Subtracting the mean and dividing by the std is fairly common, and thanks to broadcasting it does not need to be computationally expensive. This has been shown to make a difference in terms of accuracy. However, I usually only use it for datasets with large image sizes, like ImageNet.
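As a sketch of why broadcasting keeps this cheap, assuming a float32 batch of shape (N, H, W, C) with illustrative names:

    import numpy as np

    # Placeholder training batch, shape (N, H, W, C)
    train_images = np.random.rand(1000, 224, 224, 3).astype(np.float32)

    # Per-channel statistics, computed once over the training set
    mean = train_images.mean(axis=(0, 1, 2))   # shape (3,)
    std = train_images.std(axis=(0, 1, 2))     # shape (3,)

    # Broadcasting applies the (3,) statistics across the whole
    # (N, H, W, 3) batch without any explicit loops or tiling
    normalized = (train_images - mean) / std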

chron0x
  • Thank you for your answer. Can you explain why you only subtract the mean and divide by the std for larger datasets? – Euphoria Yang Jul 31 '19 at 06:40
  • I did not see significant changes in the classification accuracy. To be more precise, by small datasets I was referring to small image sizes, as in MNIST or CIFAR. So I should have said small image sizes instead. I will edit my answer. – chron0x Jul 31 '19 at 08:06