
I am new to Caffe, and I need to modify the threshold values of the ReLU layers in a convolutional neural network. The way I am doing this now is to edit the C++ source code in caffe/src/caffe/layers/relu_layer.cpp and recompile it. However, this changes the threshold to the same fixed value every time ReLU is called. Is there a way to use a different threshold value in each ReLU layer of a network? By the way, I am using the pycaffe interface and I cannot find a way to do this there.

Finally, sorry for my poor English; if anything is unclear, just let me know and I'll try to describe it in more detail.

zbqv
  • Dale's answer is fine, but Shai's answer should be chosen as the correct one. You should avoid modifying Caffe when it is not needed. – Jonathan Nov 14 '16 at 15:55

2 Answers


Yes, you can. In src/caffe/proto/caffe.proto, add a field to the ReLUParameter message:

message ReLUParameter {
  ...
  optional float threshold = 3 [default = 0]; // add this line; 3 is just the next unused field number in ReLUParameter
  ... 
}

and in src/caffe/layers/relu_layer.cpp, make small modifications like this:

template <typename Dtype>
void ReLULayer<Dtype>::Forward_cpu(const vector<Blob<Dtype>*>& bottom,
    const vector<Blob<Dtype>*>& top) {
  ...
  Dtype threshold = this->layer_param_.relu_param().threshold(); //add this line
  for (int i = 0; i < count; ++i) {
    top_data[i] = (bottom_data[i] > threshold) ? (bottom_data[i] - threshold) : 
                  (negative_slope * (bottom_data[i] - threshold));
  }
}

template <typename Dtype>
void ReLULayer<Dtype>::Backward_cpu(const vector<Blob<Dtype>*>& top,
    const vector<bool>& propagate_down,
    const vector<Blob<Dtype>*>& bottom) {
  if (propagate_down[0]) {
    ...
    Dtype threshold = this->layer_param_.relu_param().threshold(); // add this line
    for (int i = 0; i < count; ++i) {
      bottom_diff[i] = top_diff[i] * ((bottom_data[i] > threshold)
          + negative_slope * (bottom_data[i] <= threshold));
    }
  }
}

Make the analogous modifications to the GPU code in src/caffe/layers/relu_layer.cu; a sketch of the forward part is shown below.
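
Here is a sketch of how the forward part of relu_layer.cu could look after the same change. It is based on the stock ReLUForward kernel; Backward_gpu and the ReLUBackward kernel need the analogous comparison against threshold:

template <typename Dtype>
__global__ void ReLUForward(const int n, const Dtype* in, Dtype* out,
    Dtype negative_slope, Dtype threshold) {
  CUDA_KERNEL_LOOP(index, n) {
    // shift the input by threshold, then apply the usual (leaky) ReLU
    out[index] = in[index] > threshold ?
        (in[index] - threshold) :
        negative_slope * (in[index] - threshold);
  }
}

template <typename Dtype>
void ReLULayer<Dtype>::Forward_gpu(const vector<Blob<Dtype>*>& bottom,
    const vector<Blob<Dtype>*>& top) {
  ...
  Dtype threshold = this->layer_param_.relu_param().threshold(); // add this line
  ReLUForward<Dtype><<<CAFFE_GET_BLOCKS(count), CAFFE_CUDA_NUM_THREADS>>>(
      count, bottom_data, top_data, negative_slope, threshold); // pass it to the kernel
  ...
}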

After recompiling Caffe and pycaffe, you can write a ReLU layer in your net.prototxt like this:

layer {
  name: "threshold_relu"
  type: "ReLU"
  relu_param { threshold: 1 }  # e.g. you want this ReLU layer to have a threshold of 1
  bottom: "input"
  top: "output"
}
Dale
  • What does `threshold = 3` mean? Why __3__? – zbqv Nov 12 '16 at 17:22
  • I've figured out what `threshold = 3` means. And my `net.prototxt` works when I add `relu_param { threshold: 1 }` instead of `threshold: 1`. If I use `threshold: 1`, I get an error like `Message type "caffe.LayerParameter" has no field named "threshold".` – zbqv Nov 12 '16 at 18:40
  • @zbqv Sorry, my carelessness. And I've corrected my answer. – Dale Nov 13 '16 at 00:11
  • In your forward pass the second argument of the minimum should be threshold and not zero. What is the meaning of threshold here? I think you need to shift the function... – Shai Nov 14 '16 at 09:56
  • @Shai Sorry, my bad and the code was wrong, I've corrected it. Thank you for correcting me. – Dale Nov 14 '16 at 11:52
  • You still have a discontinuity at `x=threshold`. What is the meaning of ReLU with a threshold other than zero? Is it `f(x) = x-threshold if x>threshold, 0 otherwise` or `f(x) = x if x>threshold, threshold otherwise`? – Shai Nov 14 '16 at 11:58
  • 1
    @Shai My bad. As what I read from the question, it should be the first case you mentioned. I've corrected it, thanks again! – Dale Nov 14 '16 at 12:13

If I understand correctly, your "ReLU with threshold" is basically

f(x) = x-threshold if x>threshold, 0 otherwise

You can easily implement it by adding a "Bias" layer that subtracts threshold from the input just before a regular "ReLU" layer, as sketched below.
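
For example, for a threshold of 1, a prototxt along these lines should work (the layer and blob names are just placeholders; the bias is a single scalar initialized to -threshold by a constant filler and frozen with lr_mult: 0 so it is never learned):

layer {
  name: "shift"
  type: "Bias"
  bottom: "input"
  top: "shifted"
  param { lr_mult: 0 decay_mult: 0 }       # keep the bias fixed during training
  bias_param {
    num_axes: 0                            # a single scalar bias shared by all elements
    filler { type: "constant" value: -1 }  # subtract the threshold (here 1)
  }
}
layer {
  name: "relu"
  type: "ReLU"
  bottom: "shifted"
  top: "output"
}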

Shai