Pytorch Batchnorm layer different from Keras Batchnorm

Question

I'm trying to copy pre-trained BN weights from a PyTorch model to its equivalent Keras model, but I keep getting different outputs.

I read the Keras and PyTorch BN documentation, and I think the difference lies in the way they calculate the "mean" and "var".

Pytorch:

The mean and standard-deviation are calculated per-dimension over the mini-batches

source: Pytorch BatchNorm

Thus, they average over samples.

Keras:

axis: Integer, the axis that should be normalized (typically the features axis). For instance, after a Conv2D layer with data_format="channels_first", set axis=1 in BatchNormalization.

source: Keras BatchNorm

So here they seem to average over the features (channels).

Which is the right way? How do I transfer BN weights between the models?

Well, you can average all PyTorch means to get an equivalent to the Keras value. For standard deviation it's not that easy (average of standard deviations is not necessarily equal to standard deviation of all channels), but the naive approach may be good enough. Does that work for you? — Jatentaki, Feb 12 '19 at 14:24
I reshaped Keras activations to (BCHW) from (BHWC) and it worked — Jenia Golbstein, Feb 14 '19 at 13:26
Could you post how you solved with a small code snippet? You did not transpose the tensor but swapped axis right? — JVGD, Apr 13 '20 at 15:03

score 0 · Answer 1 · answered May 07 '20 at 12:08

You can fill Keras's moving_mean and moving_variance from the running_mean and running_var attributes of the PyTorch module.

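# PyTorch weight, bias, running_mean, running_var correspond to
# Keras gamma, beta, moving_mean, moving_variance (in that order).

# weight and bias are trainable parameters, so detach() them before numpy();
# cpu() is a no-op unless the module lives on a GPU.
weights = torch_module.weight.detach().cpu().numpy()
bias = torch_module.bias.detach().cpu().numpy()
running_mean = torch_module.running_mean.cpu().numpy()
running_var = torch_module.running_var.cpu().numpy()

# Assumes the Keras layer was created with center=True and scale=True (the defaults),
# so that it expects all four arrays.
keras_module.set_weights([weights, bias, running_mean, running_var])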
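
As a sanity check on why the four vectors can be copied as-is, here is a minimal NumPy sketch. It is not either library's implementation. It assumes 2D batch norm where the statistics are kept per channel and reduced over every other axis, which is how I read both docs: PyTorch's "per-dimension" and Keras's `axis` both name the channel axis. The helper `batchnorm_inference` and all array names are made up for illustration.

```python
import numpy as np

def batchnorm_inference(x, gamma, beta, mean, var, eps, axis):
    """Inference-mode batch norm with per-channel parameters living on `axis`."""
    # Reshape the (C,) vectors so they broadcast along the channel axis only.
    shape = [1] * x.ndim
    shape[axis] = x.shape[axis]
    gamma, beta, mean, var = (p.reshape(shape) for p in (gamma, beta, mean, var))
    return gamma * (x - mean) / np.sqrt(var + eps) + beta

rng = np.random.default_rng(0)
x_nchw = rng.normal(size=(4, 3, 5, 5))   # PyTorch-style layout (B, C, H, W)
x_nhwc = x_nchw.transpose(0, 2, 3, 1)    # Keras channels_last layout (B, H, W, C)

# Batch statistics: one value per channel, reduced over batch and spatial axes.
mean_nchw = x_nchw.mean(axis=(0, 2, 3))
var_nchw = x_nchw.var(axis=(0, 2, 3))
mean_nhwc = x_nhwc.mean(axis=(0, 1, 2))
var_nhwc = x_nhwc.var(axis=(0, 1, 2))

gamma = rng.normal(size=3)
beta = rng.normal(size=3)

# Same four vectors, different channel axis: the outputs match after a transpose.
out_nchw = batchnorm_inference(x_nchw, gamma, beta, mean_nchw, var_nchw, eps=1e-5, axis=1)
out_nhwc = batchnorm_inference(x_nhwc, gamma, beta, mean_nhwc, var_nhwc, eps=1e-5, axis=-1)
same_layout_result = np.allclose(out_nchw, out_nhwc.transpose(0, 3, 1, 2))

# A different epsilon alone is enough to shift the outputs slightly.
out_eps = batchnorm_inference(x_nchw, gamma, beta, mean_nchw, var_nchw, eps=1e-3, axis=1)
max_eps_gap = np.abs(out_nchw - out_eps).max()
```

If the outputs still differ after copying the weights, two things are worth checking. First, the Keras layer's `axis` has to point at the channel axis of the tensor it actually receives (1 for channels_first, -1 for channels_last). Second, the default epsilon differs: 1e-5 in PyTorch versus 1e-3 in Keras, so pass `epsilon=1e-5` (or whatever `torch_module.eps` is) when creating the Keras layer.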
