These are the dimensions of the weights from the trained CaffeNet model:

conv1: (96, 3, 11, 11), conv2: (256, 48, 5, 5), conv3: (384, 256, 3, 3), conv4: (384, 192, 3, 3), conv5: (256, 192, 3, 3)

I am confused: conv1 produces 96 output channels, so why does conv2 only take 48 input channels in its convolution? Am I missing something?

sunjeet95

1 Answer

Yes, you missed the parameter 'group'. The convolution_param defined in the conv2 layer is given below. Note that group is set to 2: splitting the convolution layer into groups can save GPU memory.

 convolution_param {
   num_output: 256
   pad: 2
   kernel_size: 5
   group: 2
   weight_filler {
     type: "gaussian"
     std: 0.01
   }
   bias_filler {
     type: "constant"
     value: 1
   }
 }
Qinghao.Hu
  • Could you please help me understand what the parameter group means? Does it mean that the weights used for the first 48 channels and the second 48 channels are the same? I actually want to use these weights to re-implement the network in TensorFlow. – sunjeet95 Mar 02 '18 at 06:58
  • The parameter group partitions the convolution filters by input channel. Without grouping, each 5*5 filter connects to all 96 input channels. With group=2, a 5*5 filter in the first group connects only to the first 48 input channels, and a 5*5 filter in the second group connects only to the second 48 input channels. Thus, in the conv2 layer, 128 filters connect to the first 48 input channels and the other 128 filters connect to the remaining 48 input channels. Please see the paper: https://papers.nips.cc/paper/4824-imagenet-classification-with-deep-convolutional-neural-networks.pdf – Qinghao.Hu Mar 02 '18 at 07:12
  • The weights used for the first 48 channels and the second 48 channels are NOT the same. The filters in the two groups are different but have the same shape (128,48,5,5) per group, so the total shape for conv2 is (256,48,5,5). – Qinghao.Hu Mar 02 '18 at 07:20
  • So how do I use them to rebuild the network in TensorFlow? When I extracted the weights from the official CaffeNet model, the weight shape for conv2 was (256,48,5,5). I want a single-pipeline architecture. – sunjeet95 Mar 02 '18 at 09:14
  • If you want to convert a caffemodel to TensorFlow, you can refer to https://github.com/ethereon/caffe-tensorflow or use the MMdnn tool https://github.com/Microsoft/MMdnn . If you want to write the TensorFlow code yourself using the extracted weights, you can refer to https://github.com/hjptriplebee/AlexNet_with_tensorflow/blob/master/alexnet.py and then fill the layers with the extracted weights. – Qinghao.Hu Mar 02 '18 at 11:43
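
To make the grouping described in the comments above concrete, here is a minimal NumPy sketch, not taken from Caffe or from any of the linked repositories. It assumes stride 1, the Caffe weight layout (out_channels, in_channels_per_group, kh, kw), and random stand-in values in place of the real extracted weights. The helper names conv2d_nchw, grouped_conv2d and expand_grouped_weights are hypothetical. The sketch shows that the (256,48,5,5) tensor can be applied as two independent convolutions, or, for a single-pipeline network, expanded into a zero-padded (256,96,5,5) tensor that gives the same result with an ordinary convolution.

```python
import numpy as np


def conv2d_nchw(x, w, pad):
    """Naive stride-1 cross-correlation. x: (C, H, W), w: (O, C, kh, kw)."""
    c, h, wd = x.shape
    o, c_w, kh, kw = w.shape
    assert c == c_w, "filter depth must match the number of input channels"
    xp = np.pad(x, ((0, 0), (pad, pad), (pad, pad)))  # zero padding
    out_h = h + 2 * pad - kh + 1
    out_w = wd + 2 * pad - kw + 1
    out = np.zeros((o, out_h, out_w), dtype=x.dtype)
    for i in range(out_h):
        for j in range(out_w):
            patch = xp[:, i:i + kh, j:j + kw]
            # Contract channel and both kernel axes, leaving one value per filter.
            out[:, i, j] = np.tensordot(w, patch, axes=([1, 2, 3], [0, 1, 2]))
    return out


def grouped_conv2d(x, w, pad, group):
    """Apply each block of filters to its own slice of the input channels."""
    in_per_group = x.shape[0] // group    # 96 / 2 = 48 for conv2
    out_per_group = w.shape[0] // group   # 256 / 2 = 128 for conv2
    assert w.shape[1] == in_per_group
    outs = []
    for g in range(group):
        x_g = x[g * in_per_group:(g + 1) * in_per_group]
        w_g = w[g * out_per_group:(g + 1) * out_per_group]
        outs.append(conv2d_nchw(x_g, w_g, pad))
    return np.concatenate(outs, axis=0)


def expand_grouped_weights(w, group):
    """Build an equivalent ungrouped weight tensor, zero outside each group."""
    o, c_g, kh, kw = w.shape
    out_per_group = o // group
    dense = np.zeros((o, c_g * group, kh, kw), dtype=w.dtype)
    for g in range(group):
        rows = slice(g * out_per_group, (g + 1) * out_per_group)
        cols = slice(g * c_g, (g + 1) * c_g)
        dense[rows, cols] = w[rows]
    return dense


rng = np.random.default_rng(0)
x = rng.standard_normal((96, 6, 6))        # stand-in for the conv1 output (small spatial size)
w = rng.standard_normal((256, 48, 5, 5))   # stand-in for the extracted conv2 weights

y_grouped = grouped_conv2d(x, w, pad=2, group=2)
w_dense = expand_grouped_weights(w, group=2)
y_dense = conv2d_nchw(x, w_dense, pad=2)
print(y_grouped.shape, np.allclose(y_grouped, y_dense))

# TensorFlow's conv2d expects filters as (kh, kw, in_channels, out_channels).
w_tf = np.transpose(w_dense, (2, 3, 1, 0))
print(w_tf.shape)
```

The expanded tensor costs twice the parameters and multiply-adds of the grouped version, since half of its entries are zeros. The alternative is to keep the weights as they are, split the input into two 48-channel halves, convolve each half with its own 128 filters, and concatenate the results along the channel axis, which is what grouped_conv2d does here.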