
I am adding the following code to my custom ResNet-18 code:

self.layer1 = self._make_layer(block, 64, layers[0]) ## code existed before
self.layer2 = self._make_layer(block, 128, layers[1], stride=2) ## code existed before
self.layer_attend1 =  nn.Sequential(nn.Conv2d(layers[0], layers[0], stride=2, padding=1, kernel_size=3),
                                     nn.AdaptiveAvgPool2d(1),
                                     nn.Softmax(1)) ## code added by me

and also the following in its forward pass (def forward(self, x)) of the same custom ResNet-18:

x = self.layer1(x) ## the code existed before
x = self.layer_attend1(x)*x ## I added this code
x = self.layer2(x) ## the code existed before

and I get the following error. I had no errors before adding this attention layer. Any idea how I could fix it?

=> loading checkpoint 'runs/nondisjoint_l2norm/model_best.pth.tar'
=> loaded checkpoint 'runs/nondisjoint_l2norm/model_best.pth.tar' (epoch 5)
/scratch3/venv/fashcomp/lib/python3.8/site-packages/torch/nn/functional.py:718: UserWarning: Named tensors and all their associated APIs are an experimental feature and subject to change. Please do not use them for anything important until they are released as stable. (Triggered internally at  /pytorch/c10/core/TensorImpl.h:1156.)
  return torch.max_pool2d(input, kernel_size, stride, padding, dilation, ceil_mode)
Traceback (most recent call last):
  File "main.py", line 352, in <module>
    main()    
  File "main.py", line 153, in main
    test_acc = test(test_loader, tnet)
  File "main.py", line 248, in test
    embeddings.append(tnet.embeddingnet(images).data)
  File "/scratch3/venv/fashcomp/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1051, in _call_impl
    return forward_call(*input, **kwargs)
  File "/scratch3/research/code/fashion/fashion-compatibility/type_specific_network.py", line 101, in forward
    embedded_x = self.embeddingnet(x)
  File "/scratch3/venv/fashcomp/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1051, in _call_impl
    return forward_call(*input, **kwargs)
  File "/scratch3/research/code/fashion/fashion-compatibility/Resnet_18.py", line 110, in forward
    x = self.layer_attend1(x)*x #so we don;t use name x1
  File "/scratch3/venv/fashcomp/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1051, in _call_impl
    return forward_call(*input, **kwargs)
  File "/scratch3/venv/fashcomp/lib/python3.8/site-packages/torch/nn/modules/container.py", line 139, in forward
    input = module(input)
  File "/scratch3/venv/fashcomp/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1051, in _call_impl
    return forward_call(*input, **kwargs)
  File "/scratch3/venv/fashcomp/lib/python3.8/site-packages/torch/nn/modules/conv.py", line 443, in forward
    return self._conv_forward(input, self.weight, self.bias)
  File "/scratch3/venv/fashcomp/lib/python3.8/site-packages/torch/nn/modules/conv.py", line 439, in _conv_forward
    return F.conv2d(input, weight, bias, self.stride,
RuntimeError: Given groups=1, weight of size [2, 2, 3, 3], expected input[256, 64, 28, 28] to have 2 channels, but got 64 channels instead

In VSCode, even though I added a breakpoint before the problematic layer, execution never even reached the breakpoint.

Mona Jalal
1 Answer


The problem comes from a confusion: layers[0] is not the number of output channels, as you probably expected, but the number of blocks in that layer (for ResNet-18 that is 2, which matches the weight of size [2, 2, 3, 3] in the error). What you actually need is 64, the number of output channels of layer1, which precedes your custom code:

self.layer_attend1 =  nn.Sequential(nn.Conv2d(64, 64, stride=2, padding=1, kernel_size=3),
                                    nn.AdaptiveAvgPool2d(1),
                                    nn.Softmax(1)) ## code added by me
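
As a quick sanity check (a standalone sketch, not your full model), you can run the corrected attention branch on a dummy tensor with the shape reported in the error, [256, 64, 28, 28], and confirm that the attention weights broadcast back over the feature map:

```python
import torch
import torch.nn as nn

# Corrected attention branch: Conv2d now takes the 64 channels produced by layer1.
layer_attend1 = nn.Sequential(
    nn.Conv2d(64, 64, stride=2, padding=1, kernel_size=3),
    nn.AdaptiveAvgPool2d(1),
    nn.Softmax(1),
)

x = torch.randn(256, 64, 28, 28)  # shape from the error message
attn = layer_attend1(x)           # per-channel weights of shape (256, 64, 1, 1)
out = attn * x                    # broadcasts back to (256, 64, 28, 28)
print(attn.shape, out.shape)
```

This reproduces exactly what x = self.layer_attend1(x)*x does in your forward pass, so once it runs without a channel mismatch your model should too.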
Berriel