What does global pooling do?

Question

I recently found the "global_pooling" flag in the Pooling layer in Caffe, but I was unable to find anything about it in the documentation, neither here (Layer Catalogue) nor here (Pooling doxygen doc).

Is there a straightforward explanation, ideally with an example, of how this differs from the normal pooling layer behaviour?

Thomas Pinetz · Accepted Answer · 2017-02-06T16:22:12.210

score 23

Global pooling reduces the dimensionality from 3D to 1D: it outputs one response for every feature map, so an input of shape channels × height × width becomes one value per channel. That value can be the maximum, the average, or the result of whatever other pooling operation you use.

It is often used at the end of the convolutional part of a neural network to get a shape that works with dense layers, so no flatten has to be applied.
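As a rough illustration of the reduction described above, here is a minimal pure-Python sketch. The helper names `global_max_pool` and `global_avg_pool` are made up for this example and are not Caffe functions; the input is assumed to be a nested list indexed as [channel][row][column].

```python
# Hypothetical helpers, not part of Caffe. Input layout: [channel][row][column].

def global_max_pool(feature_maps):
    """Return one value per channel: the maximum over all H*W positions."""
    return [max(max(row) for row in fmap) for fmap in feature_maps]


def global_avg_pool(feature_maps):
    """Return one value per channel: the mean over all H*W positions."""
    pooled = []
    for fmap in feature_maps:
        values = [v for row in fmap for v in row]
        pooled.append(sum(values) / len(values))
    return pooled


# Two channels, each a 2x3 feature map (3D input: 2 x 2 x 3).
x = [
    [[1.0, 2.0, 3.0],
     [4.0, 5.0, 6.0]],
    [[0.0, -1.0, 2.0],
     [8.0, 1.0, 2.0]],
]

print(global_max_pool(x))  # [6.0, 8.0]
print(global_avg_pool(x))  # [3.5, 2.0]
```

The output is already a flat list with one entry per channel, which is why no separate flatten step is needed before a dense layer.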

edited Feb 06 '17 at 16:22

answered Feb 06 '17 at 15:25

Thomas Pinetz


Thanks for your answer, but shouldn't it be 3D to 1D, since there is only 1 output per feature map? You only have 3D output between the layers (at least in Caffe). – Kev1n91 Feb 06 '17 at 15:44
The 4th / 2nd dimension is the number of samples. – Thomas Pinetz Feb 06 '17 at 15:46
Let's say I have a convolution which has the following output dimensions: DxWxH : (250, 15, 15). Then I am not quite sure what the fourth dimension should be, sorry. – Kev1n91 Feb 06 '17 at 16:11
In TensorFlow it's organized like this: (nrSamples, channel, width, height). Therefore width and height get reduced. In Caffe it seems that there is no samples dimension, so it will be reduced from 3D to 1D. – Thomas Pinetz Feb 06 '17 at 16:16
Ah, the question is directly related to Caffe, so maybe you can change your answer so it fits the question. – Kev1n91 Feb 06 '17 at 16:19
OK, sorry, changed the answer. – Thomas Pinetz Feb 06 '17 at 16:22

Martin Thoma · Answer 2 · 2017-02-07T12:57:07.963

Convolutions can work on any image input size (which is big enough). However, if you have a fully connected layer at the end, this layer needs a fixed input size. Hence the complete network needs a fixed image input size.

However, you can remove the fully connected layer and just work with convolutional layers. You can make a convolutional layer at the end which has the same number of filters as you have classes. But you want one value for each class which indicates the probability of that class. Hence you apply a pooling filter over the complete remaining feature map. This pooling is hence "global" as it always is as big as necessary. In contrast, usual pooling layers have a fixed size (e.g. of 2x2 or 3x3).
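To make the "as big as necessary" point concrete, here is a small pure-Python sketch. Everything in it is an assumption for illustration: `make_class_maps` is a hypothetical stand-in for the output of the last convolutional layer (one map per class), and the softmax at the end is just one common way to turn the pooled scores into probabilities.

```python
import math


def global_avg_pool(feature_maps):
    """Average each H x W map down to a single number (one per map)."""
    return [
        sum(v for row in fmap for v in row) / (len(fmap) * len(fmap[0]))
        for fmap in feature_maps
    ]


def softmax(scores):
    """Numerically stable softmax over a flat list of scores."""
    shift = max(scores)
    exps = [math.exp(s - shift) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def make_class_maps(num_classes, height, width):
    """Hypothetical stand-in for the last conv layer: one map per class."""
    return [
        [[float(c + 1)] * width for _ in range(height)]
        for c in range(num_classes)
    ]


# The spatial size changes with the input image, the output length does not.
for height, width in [(4, 4), (7, 5)]:
    maps = make_class_maps(3, height, width)
    probs = softmax(global_avg_pool(maps))
    print((height, width), len(probs))
```

Both spatial sizes yield exactly three values, one per class, which is what a fixed-size 2x2 or 3x3 pooling window would not guarantee.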

This is a general concept. You can also find global pooling in other libraries, e.g. Lasagne. If you want a good reference in literature, I recommend reading Network In Network.

score 2 · Answer 3 · answered Jul 11 '18 at 03:52

When we apply a global pooling (GP) layer, we get only one value from each entire feature map, because the kernel size is the full h×w of the feature map. Like ordinary pooling layers, GP layers reduce the spatial dimensions of a three-dimensional feature map. However, they perform a more extreme type of dimensionality reduction: a feature map with dimensions h×w×d is reduced to dimensions 1×1×d. In the case of global average pooling, each h×w feature map is reduced to a single number by simply taking the average of all hw values.
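Written out for the average case, with x and y as assumed names for the input and output (they do not appear in the answer itself), the reduction from h×w×d to 1×1×d is:

```latex
y_{k} = \frac{1}{h\,w} \sum_{i=1}^{h} \sum_{j=1}^{w} x_{i,j,k},
\qquad k = 1, \dots, d
```

For global max pooling the double sum and the factor 1/(hw) would be replaced by a maximum over the same i and j.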

score 1 · Answer 4 · answered Feb 06 '17 at 15:36

If you are looking for information regarding flags/parameters of Caffe, it is best to look them up in the comments of '$CAFFE_ROOT/src/caffe/proto/caffe.proto'.
For 'global_pooling' parameter the comment says:

// If global_pooling then it will pool over the size of the bottom by doing
// kernel_h = bottom->height and kernel_w = bottom->width
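What that comment describes can be sketched in pure Python as follows. This is not Caffe's implementation: the function `max_pool` and its signature are made up for illustration, it handles a single H x W map, and it ignores padding and Caffe's output-size rounding.

```python
def max_pool(bottom, kernel_h=2, kernel_w=2, stride=2, global_pooling=False):
    """Max-pool one H x W map given as a nested list (illustrative only)."""
    height, width = len(bottom), len(bottom[0])
    if global_pooling:
        # Mirrors the quoted comment: the kernel covers the whole bottom.
        kernel_h, kernel_w = height, width
        stride = 1
    top = []
    for r in range(0, height - kernel_h + 1, stride):
        out_row = []
        for c in range(0, width - kernel_w + 1, stride):
            window = [
                bottom[i][j]
                for i in range(r, r + kernel_h)
                for j in range(c, c + kernel_w)
            ]
            out_row.append(max(window))
        top.append(out_row)
    return top


bottom = [
    [1, 3, 2, 0],
    [4, 6, 5, 1],
    [7, 2, 9, 3],
    [0, 8, 4, 5],
]

print(max_pool(bottom))                       # [[6, 5], [8, 9]]
print(max_pool(bottom, global_pooling=True))  # [[9]]
```

With the flag set, the single window is the entire map, so the output is 1x1 regardless of the kernel size passed in.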

For more information about Caffe layers, see these help pages.
