
I am trying different activation functions in my simple neural network.

Whether I use tf.nn.relu, tf.nn.sigmoid, etc., the network does what it should do.

But if I use tf.nn.crelu, I get a dimension error.

It returns something like [max, min], and the width dimension is twice as large. What do I have to do? Fit the following weights and biases to the output of crelu?
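
For illustration, here is a minimal sketch of what I am seeing (toy values, not my actual network):

import tensorflow as tf

x = tf.constant([[1.0, -2.0, 3.0]])   # shape (1, 3)
y = tf.nn.crelu(x)                    # concatenates relu(x) and relu(-x) along the last axis

with tf.Session() as sess:
    print(sess.run(y))                # [[1. 0. 3. 0. 2. 0.]] -> shape (1, 6), twice as wide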

  • CReLU concatenates a ReLU that selects only the positive part of the activation with a ReLU that selects only the negative part of the activation. Note that as a result this non-linearity doubles the depth of the activations. You can read more here: https://www.tensorflow.org/api_docs/python/tf/nn/crelu – codeslord Dec 09 '17 at 06:16
  • Yes, that is what it does, but how do I use it? – j35t3r Dec 09 '17 at 11:17
  • Can you check the link below? It contains an example implementation (it uses crelu from Chainer, but it is very similar to, if not the same as, TensorFlow's): https://programtalk.com/vs2/python/10099/chainer/tests/chainer_tests/functions_tests/activation_tests/test_crelu.py/ – codeslord Dec 10 '17 at 14:11

1 Answer


You're right: if you're building the network manually, you need to adjust the dimensions of the following layer to match the tf.nn.crelu output. In this sense, tf.nn.crelu is not interchangeable with tf.nn.relu, tf.nn.elu, etc.
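
For instance, when the layers are built by hand, the weight matrix that follows a crelu layer has to accept twice as many inputs. A minimal sketch with hypothetical layer sizes:

import tensorflow as tf

x = tf.placeholder(tf.float32, [None, 784])

W1 = tf.Variable(tf.truncated_normal([784, 128], stddev=0.1))
b1 = tf.Variable(tf.zeros([128]))
h1 = tf.nn.crelu(tf.matmul(x, W1) + b1)   # shape [None, 256], not [None, 128]

# The next layer must expect 2 * 128 = 256 inputs, otherwise the matmul fails with a dimension error.
W2 = tf.Variable(tf.truncated_normal([256, 10], stddev=0.1))
b2 = tf.Variable(tf.zeros([10]))
logits = tf.matmul(h1, W2) + b2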

The situation is simpler if you use a high-level API, e.g. TensorFlow Slim. In this case, the layer functions take care of matching the dimensions, so you can simply replace tf.nn.relu with tf.nn.crelu in the code. However, keep in mind that the network silently becomes twice as big.

Here's an example:

import tensorflow as tf
slim = tf.contrib.slim

# x_image, is_training and keep_prob are defined earlier (input batch, training flag, dropout keep probability).
with slim.arg_scope([slim.conv2d, slim.fully_connected],
                    activation_fn=tf.nn.crelu,
                    normalizer_fn=slim.batch_norm,
                    normalizer_params={'is_training': is_training, 'decay': 0.95}):
    conv1 = slim.conv2d(x_image, 16, [5, 5], scope='conv1')    # crelu doubles the depth: 32 channels out
    pool1 = slim.max_pool2d(conv1, [2, 2], scope='pool1')
    conv2 = slim.conv2d(pool1, 32, [5, 5], scope='conv2')      # 64 channels out
    pool2 = slim.max_pool2d(conv2, [2, 2], scope='pool2')
    flatten = slim.flatten(pool2)
    fc = slim.fully_connected(flatten, 1024, scope='fc1')      # 2048 units out
    drop = slim.dropout(fc, keep_prob=keep_prob)
    logits = slim.fully_connected(drop, 10, activation_fn=None, scope='logits')

slim.arg_scope simply applies all provided arguments to the underlying layers, in particular activation_fn. Also note activation_fn=None in the last layer, so that the output dimension stays at 10. Complete code can be found here.
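
If in doubt, you can confirm the doubling from the tensor shapes of the snippet above:

print(conv1.get_shape())   # channel dimension is 32, although num_outputs=16 was requested
print(fc.get_shape())      # (?, 2048), although 1024 units were requested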

Maxim