I have two PyTorch models that I believe are equivalent; the only difference between them is the padding:

import torch
import torch.nn as nn

i = torch.arange(9, dtype=torch.float).reshape(1, 1, 3, 3)

# First model: padding built into the convolution
model1 = nn.Conv2d(1, 1, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), padding_mode='reflection')
print(model1(i))
# tensor([[[[-0.6095, -0.0321,  2.2022],
#           [ 0.1018,  1.7650,  5.5392],
#           [ 1.7988,  3.9165,  5.6506]]]], grad_fn=<MkldnnConvolutionBackward>)

# Second model: explicit padding layer, then an unpadded convolution
model2 = nn.Sequential(nn.ReflectionPad2d((1, 1, 1, 1)),
                       nn.Conv2d(1, 1, kernel_size=3))
print(model2(i))
# tensor([[[[1.4751, 1.5513, 2.6566],
#           [4.0281, 4.1043, 5.2096],
#           [2.6149, 2.6911, 3.7964]]]], grad_fn=<MkldnnConvolutionBackward>)
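For reference, this is the reflection-padded input I would expect both models to convolve over (checked directly with nn.ReflectionPad2d, continuing from the snippet above):

print(nn.ReflectionPad2d((1, 1, 1, 1))(i))
# The border row/column is mirrored without repeating the edge itself:
# tensor([[[[4., 3., 4., 5., 4.],
#           [1., 0., 1., 2., 1.],
#           [4., 3., 4., 5., 4.],
#           [7., 6., 7., 8., 7.],
#           [4., 3., 4., 5., 4.]]]])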

I was wondering why and when you would use each approach. The outputs differ, but as I see it they should be the same, because in both cases the padding is reflection padding.

Would appreciate some help in understanding it.

EDIT

After what @Ash said, I wanted to check whether or not the weights had an influence, so I pinned all of them to the same value. There is still a difference between the two methods:

import torch
import torch.nn as nn

i = torch.arange(9, dtype=torch.float).reshape(1,1,3,3)
# First model:
model1 = nn.Conv2d(1, 1, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), padding_mode='reflection')
model1.weight.data = torch.full(model1.weight.data.shape, 0.4)
print(model1(i))
print(model1.weight)
# tensor([[[[ 3.4411,  6.2411,  5.0412],
#           [ 8.6411, 14.6411, 11.0412],
#           [ 8.2411, 13.4411,  9.8412]]]], grad_fn=<MkldnnConvolutionBackward>)
# Parameter containing:
# tensor([[[[0.4000, 0.4000, 0.4000],
#           [0.4000, 0.4000, 0.4000],
#           [0.4000, 0.4000, 0.4000]]]], requires_grad=True)

# Second model:
model2 = [nn.ReflectionPad2d((1, 1, 1, 1)),
          nn.Conv2d(1, 1, kernel_size=3)]
model2[1].weight.data = torch.full(model2[1].weight.data.shape, 0.4)
model2 = nn.Sequential(*model2)
print(model2(i))
print(model2[1].weight)
# tensor([[[[ 9.8926, 11.0926, 12.2926],
#           [13.4926, 14.6926, 15.8926],
#           [17.0926, 18.2926, 19.4926]]]], grad_fn=<MkldnnConvolutionBackward>)
# Parameter containing:
# tensor([[[[0.4000, 0.4000, 0.4000],
#           [0.4000, 0.4000, 0.4000],
#           [0.4000, 0.4000, 0.4000]]]], requires_grad=True)
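Note that only the weights are pinned here; the biases are still random, so each output carries a different constant offset on top of any padding difference. For an even cleaner comparison the biases could be zeroed as well, e.g. (continuing from the snippet above):

with torch.no_grad():
    model1.bias.zero_()     # remove model1's random bias
    model2[1].bias.zero_()  # remove model2's random bias

Even with both weights and biases matched, the edge and corner values still differ between the two methods.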

1 Answer


the output of both is different but as I see it they should be the same

I don't think the different outputs you get are only related to how the reflective padding is implemented. In the code snippet you provide, the weights and biases of the convolutions in model1 and model2 differ, since they are initialized randomly and you don't fix their values in the code.
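To rule that out, you can give both convolutions identical parameters before comparing, along these lines (a minimal sketch; the variable names are illustrative):

import torch
import torch.nn as nn

# Same setup as in the question
conv_padded = nn.Conv2d(1, 1, kernel_size=3, padding=1, padding_mode='reflection')
explicit = nn.Sequential(nn.ReflectionPad2d((1, 1, 1, 1)),
                         nn.Conv2d(1, 1, kernel_size=3))

# Copy the weight and bias so the only remaining difference is the padding.
with torch.no_grad():
    explicit[1].weight.copy_(conv_padded.weight)
    explicit[1].bias.copy_(conv_padded.bias)

i = torch.arange(9, dtype=torch.float).reshape(1, 1, 3, 3)
print(torch.allclose(conv_padded(i), explicit(i)))

If this still prints False, the padding itself is the culprit.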

EDIT:

Following your new edit: for versions prior to 1.5, looking at the implementation of the forward pass in <your_torch_install>/nn/modules/conv.py shows that "reflection" is not a supported padding_mode. It won't complain about arbitrary strings in place of "reflection" either, but will silently default to zero-padding.
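For what it's worth, from 1.5 on the supported name is, as far as I can tell, 'reflect' (not 'reflection'), and an unrecognized padding_mode raises a ValueError instead of being silently ignored. Under that assumption, the two approaches match once the parameters are shared:

import torch
import torch.nn as nn

# PyTorch >= 1.5 (assumed): 'reflect' is a valid padding_mode for Conv2d.
conv_reflect = nn.Conv2d(1, 1, kernel_size=3, padding=1, padding_mode='reflect')
explicit = nn.Sequential(nn.ReflectionPad2d((1, 1, 1, 1)),
                         nn.Conv2d(1, 1, kernel_size=3))

# Share weight and bias between the two convolutions.
with torch.no_grad():
    explicit[1].weight.copy_(conv_reflect.weight)
    explicit[1].bias.copy_(conv_reflect.bias)

i = torch.arange(9, dtype=torch.float).reshape(1, 1, 3, 3)
print(torch.allclose(conv_reflect(i), explicit(i)))  # True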

  • So basically you are saying that if I initialize the models' weights to be the same, the output will be the same? – David May 31 '20 at 09:36
  • @DavidS There could still be differences in how the two reflective paddings are implemented, so I won't guarantee that this will solve the problem. However, different model parameters will necessarily mean different results, so I think we need to eliminate that source of error first. – Ash May 31 '20 at 09:38
  • You are correct. I tried it, but it's still different; see the edit. – David May 31 '20 at 09:49
  • I see, so you think it's a bug and worth reporting. +1 by the way, very impressive research. – David May 31 '20 at 10:31
  • @DavidS It seems that it was just a missing feature; even the documentation of older versions (I'm using 1.4 right now) doesn't mention symmetric padding. – Ash May 31 '20 at 10:38