I have been following a Caffe example here to plot the convolution kernels from my ConvNet. I have attached an image of my kernels below, but it looks nothing like the kernels in the example. I followed the example exactly; does anyone know what the issue might be?

The net is trained on a set of simulated images with two classes, and its performance is pretty good: around 80% test accuracy.

[Image: grid of my plotted conv1 kernels]
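
The plotting code follows the example's steps. A minimal sketch of what I run (the file names are placeholders for my actual deploy prototxt and snapshot, and I tile the kernels with plain matplotlib rather than the notebook's vis_square helper):

import caffe
import numpy as np
import matplotlib.pyplot as plt

# Placeholder paths -- substitute the real deploy prototxt and trained snapshot.
net = caffe.Net('deploy.prototxt', 'snapshot_iter_10000.caffemodel', caffe.TEST)

# conv1 weights have shape (num_output, channels, height, width) = (40, C, 5, 5).
filters = net.params['conv1'][0].data

# Tile all 40 kernels (first input channel) into a grid of small images.
cols = 8
rows = int(np.ceil(filters.shape[0] / float(cols)))
for i in range(filters.shape[0]):
    plt.subplot(rows, cols, i + 1)
    plt.imshow(filters[i, 0], cmap='gray')
    plt.axis('off')
plt.show()

My net definition: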

layer {
  name: "input"
  type: "Data"
  top: "data"
  top: "label"
  include {
    phase: TRAIN
  }
  transform_param {
    mean_file: "/tmp/stage5/mean/mean.binaryproto"
  }
  data_param {
    source: "/tmp/stage5/train/train-lmdb"
    batch_size: 100
    backend: LMDB
  }
}
layer {
  name: "input"
  type: "Data"
  top: "data"
  top: "label"
  include {
    phase: TEST
  }
  transform_param {
    mean_file: "/tmp/stage5/mean/mean.binaryproto"
  }
  data_param {
    source: "/tmp/stage5/validation/validation-lmdb"
    batch_size: 10
    backend: LMDB
  }
}
layer {
  name: "conv1"
  type: "Convolution"
  bottom: "data"
  top: "conv1"
  param {
    lr_mult: 1.0
  }
  param {
    lr_mult: 2.0
  }
  convolution_param {
    num_output: 40
    kernel_size: 5
    stride: 1
    weight_filler {
      type: "xavier"
    }
    bias_filler {
      type: "constant"
    }
  }
}
layer {
  name: "pool1"
  type: "Pooling"
  bottom: "conv1"
  top: "pool1"
  pooling_param {
    pool: MAX
    kernel_size: 3
    stride: 2
  }
}
layer {
  name: "ip1"
  type: "InnerProduct"
  bottom: "pool1"
  top: "ip1"
  param {
    lr_mult: 1.0
  }
  param {
    lr_mult: 2.0
  }
  inner_product_param {
    num_output: 500
    weight_filler {
      type: "xavier"
    }
    bias_filler {
      type: "constant"
    }
  }
}
layer {
  name: "ip2"
  type: "InnerProduct"
  bottom: "ip1"
  top: "ip2"
  param {
    lr_mult: 1.0
  }
  param {
    lr_mult: 2.0
  }
  inner_product_param {
    num_output: 2
    weight_filler {
      type: "xavier"
    }
    bias_filler {
      type: "constant"
    }
  }
}
layer {
  name: "loss"
  type: "SoftmaxWithLoss"
  bottom: "ip2"
  bottom: "label"
  top: "loss"
}
  • What weights were you using? Was this net trained on natural images? What is the performance of this net? You need to provide more details. – Shai Oct 14 '15 at 12:02
  • I've updated the question with some more information, including the net itself. – mjacuse Oct 14 '15 at 12:31
  • What caffemodel file do you load before drawing the filters? – Shai Oct 14 '15 at 13:26
  • I train a model, which is saved after the final iteration. I then load it using `caffe.Classifier()` and follow the code in the Caffe example linked to visualise the filters. – mjacuse Oct 14 '15 at 15:13
  • Can you show some of your training examples? – Shai Oct 14 '15 at 15:21
  • First of all, it's run on simulated images. Second, there are only two classes. With just two classes and training from scratch, the model may have separated the images easily along a vague boundary between the classes, so the weights needn't be as clearly shaped as the ones in the reference document; that is my suspicion. @Shai seems to be an expert in deep learning, so he might have a better picture of what is happening. – Anoop K. Prabhu Oct 14 '15 at 19:19
  • @AnoopK.Prabhu thank you for the compliment – Shai Oct 14 '15 at 19:21
  • @mjacuse please update the question with (i) one or two examples of the input images you are using, and (ii) a **short** snippet showing how you load the net parameters and display the filters (it should be very small: one line `net = caffe.Net(...)`, then `filters = net.params['conv1'][0].data` and `vis_square(filters.transpose(0, 2, 3, 1))`). – Shai Oct 15 '15 at 06:30

2 Answers

You might need to set the interpolation parameter to 'none' when you call imshow; otherwise matplotlib smooths each small kernel image into a blur. Is that what you are referring to?
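
For example, with a toy 5x5 array standing in for a kernel (only the interpolation argument differs between the two panels):

import numpy as np
import matplotlib.pyplot as plt

kernel = np.random.rand(5, 5)  # stand-in for a small conv kernel

plt.subplot(1, 2, 1)
plt.imshow(kernel, cmap='gray')  # default interpolation smooths the pixels
plt.title('default')

plt.subplot(1, 2, 2)
plt.imshow(kernel, cmap='gray', interpolation='none')  # crisp pixel blocks
plt.title("interpolation='none'")

plt.show()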

To get "smoother" filters, you could try adding a small amount of L2 weight decay (via decay_mult) to the conv1 layer.

See also http://caffe.berkeleyvision.org/tutorial/layers.html

layer {
  name: "conv1"
  type: "Convolution"
  bottom: "data"
  top: "conv1"
  # learning rate and decay multipliers for the filters
  param { lr_mult: 1 decay_mult: 1 }
  # learning rate and decay multipliers for the biases
  param { lr_mult: 2 decay_mult: 0 }
  convolution_param {
    num_output: 96     # learn 96 filters
    kernel_size: 11    # each filter is 11x11
    stride: 4          # step 4 pixels between each filter application
    weight_filler {
      type: "gaussian" # initialize the filters from a Gaussian
      std: 0.01        # distribution with stdev 0.01 (default mean: 0)
    }
    bias_filler {
      type: "constant" # initialize the biases to zero (0)
      value: 0
    }
  }
}
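
Note that decay_mult is only a multiplier: the effective regularization strength for a parameter blob is its decay_mult times the global weight_decay set in the solver prototxt, so make sure weight_decay there is non-zero (the Caffe examples commonly use 0.0005).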