My question is what exactly is being normalized by BatchNormalization (BN). Does BN normalize the channels of each pixel separately, or all the pixels together? And does it do this on a per-image basis, or over all the channels of the entire batch?
Specifically, BN operates on `X`. Say `X.shape = [m, h, w, c]`. So with `axis=3`, it is operating on the `c` dimension, which is the number of channels (e.g. 3 for RGB) or the number of feature maps.

So let's say `X` is an RGB image and thus has 3 channels. Does BN do the following? (This is a simplified version of BN to discuss the dimensional aspects. I understand that gamma and beta are learned, but I am not concerned with that here.)
For each image `X` in `m`:

- For each pixel (h, w), take the mean of the associated r, g, and b values.
- For each pixel (h, w), take the variance of the associated r, g, and b values.
- Do `r = (r - mean)/var`, `g = (g - mean)/var`, and `b = (b - mean)/var`, where r, g, and b are the red, green, and blue channels of `X` respectively.
- Then repeat this process for the next image in `m`.
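To make the dimensional question concrete, here is a plain-Python sketch of the procedure described above: for each image, each pixel's r, g, b values are normalized by the mean and variance taken across that pixel's 3 channels. This is the hypothesized behaviour being asked about, not necessarily what Keras actually does, and it follows the post's simplified formula of dividing by the variance (real BN divides by the square root of the variance plus a small epsilon).

```python
def normalize_pixelwise(batch, eps=1e-5):
    """batch: nested lists with shape [m][h][w][c].

    Normalizes each pixel's channel values by the mean and variance
    computed across that single pixel's channels (the hypothesized
    per-pixel interpretation, not necessarily real BatchNormalization).
    """
    out = []
    for image in batch:                  # loop over the m images
        new_image = []
        for row in image:                # loop over h
            new_row = []
            for pixel in row:            # pixel is [r, g, b]
                mean = sum(pixel) / len(pixel)
                var = sum((v - mean) ** 2 for v in pixel) / len(pixel)
                # divide by the variance, as in the simplified formula above;
                # eps avoids division by zero for constant pixels
                new_row.append([(v - mean) / (var + eps) for v in pixel])
            new_image.append(new_row)
        out.append(new_image)
    return out
```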
In Keras, the docs for `BatchNormalization` say:

> axis: Integer, the axis that should be normalized (typically the features axis). For instance, after a `Conv2D` layer with `data_format="channels_first"`, set `axis=1` in `BatchNormalization`.
But what exactly is it doing along each dimension?
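For contrast, the other reading I can see of "the axis that should be normalized" is that there is one mean and one variance per channel, pooled over the batch and both spatial dimensions. A plain-Python sketch of that interpretation (again not framework code, and this time dividing by the standard deviation with an epsilon, as full BN formulas do):

```python
def normalize_per_channel(batch, eps=1e-5):
    """batch: nested lists with shape [m][h][w][c].

    Computes one mean and one variance per channel c, pooled over the
    batch and both spatial dimensions (the per-channel interpretation).
    """
    c = len(batch[0][0][0])
    # gather every value of each channel across m, h, and w
    vals = [[] for _ in range(c)]
    for image in batch:
        for row in image:
            for pixel in row:
                for k in range(c):
                    vals[k].append(pixel[k])
    means = [sum(v) / len(v) for v in vals]
    variances = [sum((x - mu) ** 2 for x in v) / len(v)
                 for v, mu in zip(vals, means)]
    # normalize each value using its channel's pooled statistics
    return [[[[(pixel[k] - means[k]) / (variances[k] + eps) ** 0.5
               for k in range(c)]
              for pixel in row]
             for row in image]
            for image in batch]
```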