
Recently I trained a neural network using PyTorch, and it contains an average pooling layer with padding. I'm confused about the behavior of this layer, as well as the definition of average pooling with padding.

For example, suppose we have an input tensor:

[[1, 2, 3],
 [4, 5, 6],
 [7, 8, 9]]

When padding is 1 and the kernel size is 3, the input to the first kernel window should be:

[[0, 0, 0],
 [0, 1, 2],
 [0, 4, 5]]

The output from PyTorch is 12/4 = 3 (ignoring the padded zeros), but I think it should be 12/9 ≈ 1.333.

Can anyone explain this to me?

Much appreciated.

Shai
Wenbin Xu

1 Answer


It's basically up to you to decide how you want your padded pooling layer to behave.
This is why PyTorch's avg pool (e.g., nn.AvgPool2d) has an optional parameter count_include_pad=True:
By default (True), avg pool first pads the input and then treats all elements, padded or not, the same. In that case the output of your example would indeed be 12/9 ≈ 1.33.
On the other hand, if you set count_include_pad=False, the pooling layer ignores the padded elements in the denominator, and the result in your example would be 12/4 = 3.
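A minimal sketch showing both behaviors on your exact input (the corner window sums 1 + 2 + 4 + 5 = 12 either way; only the divisor changes):

```python
import torch
import torch.nn as nn

# Your 3x3 input, shaped (batch, channels, H, W) as AvgPool2d expects.
x = torch.tensor([[[[1., 2., 3.],
                    [4., 5., 6.],
                    [7., 8., 9.]]]])

# Default: padded zeros count toward the divisor (12 / 9).
pool_incl = nn.AvgPool2d(kernel_size=3, stride=1, padding=1,
                         count_include_pad=True)

# Padded zeros are excluded from the divisor (12 / 4).
pool_excl = nn.AvgPool2d(kernel_size=3, stride=1, padding=1,
                         count_include_pad=False)

print(pool_incl(x)[0, 0, 0, 0].item())  # 1.333... = 12/9
print(pool_excl(x)[0, 0, 0, 0].item())  # 3.0      = 12/4
```

Since you observed 3, the layer in your network was evidently running with count_include_pad=False.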

Shai
  • Thank you so much. So both 3 and 1.33 can be the right outcome, depending on what we want? And regarding other platforms like Caffe: judging from the source code, it seems Caffe only supports count_include_pad=False? – Wenbin Xu Apr 18 '19 at 05:11