Given a tensor of size [8, 64, 128, 128] (B, CH, H, W), I would like to apply a 2D max pooling operation across all channels, over a 2x2x64 region (H, W, CH) with a stride of 1, so as to obtain another tensor of size [8, 1, 128, 128]. Is the code below headed in the right direction?

import torch
import torch.nn as nn

torch.manual_seed(0)

B, CH, H, W = 8, 64, 128, 128
x_batch = torch.randn((B, CH, H, W))
max3d = nn.MaxPool3d((64,2,2), stride=1)
x_max = max3d(x_batch)
x_max.shape

In addition, the code above produces a tensor of size [8, 1, 127, 127], but I need an output of exactly [8, 1, 128, 128]. I have not been able to find the right padding yet; for example, with padding=(0,1,1) I obtain an output of [8, 1, 129, 129].

Tin

1 Answer

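Your kernel size of 2 is even, so with a stride of 1 you need exactly one extra row and one extra column in total to keep the output at 128, i.e. asymmetric padding. The MaxPoolXd modules only support symmetric padding (the same amount on both sides of a dimension), which is why padding=(0,1,1) adds two and overshoots to 129. Note that zero padding differs from the implicit negative-infinity padding that MaxPoolXd uses, so an edge window containing only negative values will return 0. With that caveat, you can pad the input beforehand with ZeroPad2d, which supports per-side padding: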

import torch
a = torch.rand([8,64,128,128])
b = torch.nn.MaxPool3d((64,2,2),stride=1,padding=0)
c = torch.nn.ZeroPad2d((0,1,0,1)) # Left,Right,Top,Bottom padding

In [44]: b(c(a)).shape
Out[44]: torch.Size([8, 1, 128, 128])
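As a rough sketch of the size arithmetic and of the zero-padding caveat, the snippet below pads with negative infinity via `torch.nn.functional.pad` instead of `ZeroPad2d`. The helper `pool_out_size` and the shift of the input by -10 are my own additions for illustration (the shift just makes every value negative so the edge effect is visible); they are not part of the original question.

```python
import torch
import torch.nn.functional as F

def pool_out_size(n, kernel, stride=1, pad_total=0):
    # Hypothetical helper: standard pooling size formula (dilation 1, floor mode).
    # pad_total is the combined padding on both sides of the dimension.
    return (n + pad_total - kernel) // stride + 1

sizes = [pool_out_size(128, 2, pad_total=p) for p in (0, 2, 1)]
# no padding, padding=1 per side, one-sided padding of 1

torch.manual_seed(0)
# Assumption for the demo: shift the data so that every value is negative.
x = torch.randn(8, 64, 128, 128) - 10.0

# Pad only the right and bottom edges: (left, right, top, bottom).
x_zero = F.pad(x, (0, 1, 0, 1), value=0.0)            # same as ZeroPad2d((0, 1, 0, 1))
x_ninf = F.pad(x, (0, 1, 0, 1), value=float("-inf"))  # mimics MaxPool's implicit padding

def pool_all_channels(t):
    # Add an explicit channel axis so max_pool3d sees (N, C=1, D=64, H, W),
    # then drop the depth axis that was pooled down to size 1.
    out = F.max_pool3d(t.unsqueeze(1), kernel_size=(64, 2, 2), stride=1)
    return out.squeeze(2)

out_zero = pool_all_channels(x_zero)
out_ninf = pool_all_channels(x_ninf)

# Equivalent formulation: max over channels first, then a 2D pool.
ref = F.max_pool2d(x_ninf.amax(dim=1, keepdim=True), kernel_size=2, stride=1)

print(sizes)
print(out_ninf.shape)
print(torch.equal(out_ninf, ref))
print(out_zero[..., -1].max().item(), out_ninf[..., -1].max().item())
```

With all-negative inputs, the last row and column of `out_zero` come from the padded zeros rather than from the data, while `out_ninf` only ever returns real input values. If your activations are non-negative (for example after a ReLU), the two variants should agree and `ZeroPad2d` is fine.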
jhso