
In a CNN, if the output is a one-dimensional vector (say, a pre-logit layer), how would one reduce the dimensionality down to a specified size using only convolutions?

How does one derive the filter dimensions/receptive field to accomplish such a task?

I am aware that this can be achieved by stacking a fully connected layer on the end of the network, but this does not seem so elegant.

Jack H

3 Answers


Do you have the possibility to add a pooling layer after your convolution? If so, that is one of the main purposes of this layer: downsampling a vector to a lower-dimensional one.

Otherwise, keep in mind that the number of filters you apply determines the dimensionality of your output space.
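
For example, a minimal TensorFlow sketch of pooling-based downsampling (illustrative shapes; assumes a channels-last `[batch, length, channels]` tensor `x`, here `[?, 64, 1]`):

import tensorflow as tf

# `x` shape: [batch, length, channels] = [?, 64, 1]
pooled = tf.layers.average_pooling1d(x, pool_size=2, strides=2)
# new shape: [?, 32, 1] -- each pair of adjacent positions averaged into one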

ML_TN

What about a 1-dimensional convolution? You can use strides, as in the following:

n, w = x.shape
c = 1
x = x.reshape(n, w, c)  # 1-d vector with one channel: (n, w, 1)
x = conv1d(x, 1, 3, stride=2, pad=1)  # 1 filter, stride 2, so output is (n, w/2, 1)
x = x.reshape(n, w//2)
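
For concreteness, here's roughly what that pseudocode would look like with `tf.layers` (shapes assume `w = 64`; the same idea works in any framework):

import tensorflow as tf

x = tf.expand_dims(x, 2)  # [?, 64] -> [?, 64, 1], one channel
x = tf.layers.conv1d(x, filters=1, kernel_size=3, strides=2,
                     padding='same')  # 1 filter, stride 2 -> [?, 32, 1]
x = tf.squeeze(x, 2)      # back to [?, 32]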

That'll give you integer divisions of your current dimensionality. Or you could have a channel for each dimension of your output and then pool over the entire 1-D extent:

x = x.reshape(n, w, c)
x = conv1d(x, d, 3, pad=1)  # d filters, so output is (n, w, d)
x = x.mean(1)  # mean over the 1-d spatial axis, so now (n, d)
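
Again roughly, continuing the `tf.layers` sketch above (with `d` being your target dimensionality):

x = tf.expand_dims(x, 2)              # [?, 64, 1]
x = tf.layers.conv1d(x, filters=d, kernel_size=3,
                     padding='same')  # d filters -> [?, 64, d]
x = tf.reduce_mean(x, axis=1)         # mean over the length axis -> [?, d]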

No guarantees on whether any of these will actually work well, but this being a neural network, they probably won't break anything too badly.

Finally, the cheat answer:

x = x.reshape(n, c, w)  # treat the w features as channels: (n, 1, w)
x = conv1d(x, d, 1)     # d filters of kernel size 1: (n, 1, d)
x = x.reshape(n, d)
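
This is a cheat because d size-1 filters applied at a single position with w input channels amount to a (w, d) weight matrix, i.e. a fully connected layer. A quick numpy sketch of what that conv computes (random weights, illustrative sizes):

import numpy as np

n, w, d = 8, 64, 32
x = np.random.randn(n, w)
W = np.random.randn(w, d)  # the d size-1 filters stacked as columns
y = x @ W                  # the (n, d) output the size-1 conv would produce (ignoring bias)
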
gngdb

Use the idea originally proposed in the All Convolutional Net paper and later used extensively in the Inception network: apply convolution itself for dimensionality reduction.

The trick is to perform convolution with a unit filter (1x1 for a 2-D convolution, 1x1x1 for 3-D, and so on) and a smaller number of filters. Nowadays, this trick is applied all the time to save computation in very deep convolutional networks, so you can use it before convolutional layers as well. In your question, the output tensor is one-dimensional (excluding the batch dimension), so use a 1-D convolution with kernel size 1.

Here's the code in TensorFlow, which reduces the tensor length from 64 to 32:

                              # `x` shape:  [batch, length] = [?, 64]
layer = tf.expand_dims(x, 2)  # reshape to: [batch, channels, 1] = [?, 64, 1]

output = tf.layers.conv1d(layer, filters=32, kernel_size=1,
                          strides=1, padding='valid',
                          data_format='channels_first')

                              # new shape:  [batch, filters, 1] = [?, 32, 1]
output = tf.squeeze(output)   # reshape to: [batch, length] = [?, 32]
Maxim
    It may be worth noting that this 1x1 convolution over a spatial field of size 1 is exactly the same as using a fully connected layer; this being the basis of [Network in Network](https://arxiv.org/abs/1312.4400). – gngdb Nov 06 '17 at 14:45