sigmoid() or tanh() activation function in linear system with neural network

Question

I am trying to build a neural network to study a problem with a continuous output variable. A schematic representation of the network is shown below.

Schematic representation of neural network: input layer size = 1; hidden layer size = 8; output layer size = 1.

Is there any reason why I should use the tanh() activation function instead of the sigmoid() activation function in this case? In the past I have used the sigmoid() activation function to solve logistic regression problems with neural networks, and it is not clear to me whether I should use tanh() instead when the output variable is continuous.

Does it depend on the values of the continuous output variable? For example: (i) use sigmoid() when the output variable is normalized to the range 0 to 1; (ii) use tanh() when the output variable can take negative values.

Thanks in advance

I think they are equivalent. If you use (1 + tanh())/2.0 it looks a lot like sigmoid. — duffymo, Nov 16 '16 at 12:46
A good explanation of the standard logistic function vs the hyperbolic tangent can be found here http://yann.lecun.com/exdb/publis/pdf/lecun-98b.pdf — Bogdan B, Nov 16 '16 at 12:52
Thank you for the very useful and interesting reference bb01234. I think I found a good answer to my question in section 4.4 — chayanquete, Nov 16 '16 at 20:01

score 0 · Accepted Answer · answered Nov 16 '16 at 19:33

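Apart from a shift and a rescaling, the two are functionally almost identical: tanh(x) = 2*sigmoid(2x) - 1, so sigmoid is centered on 0.5 with outputs in (0, 1), while tanh is centered on 0 with outputs in (-1, 1). The important parts are

a usable, non-zero gradient in the "range of training interest" near 0 (exactly 1 for tanh and 0.25 for sigmoid at x = 0);
gradient of roughly 0 for extreme values.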
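
These two properties, and how closely the functions are related, can be checked numerically. The following is a minimal sketch using only the Python standard library; the helper names (sigmoid, sigmoid_grad, tanh_grad) are illustrative and not taken from any particular framework.

```python
import math

def sigmoid(x):
    """Standard logistic function, output in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))

def sigmoid_grad(x):
    """Derivative of the logistic function: s * (1 - s)."""
    s = sigmoid(x)
    return s * (1.0 - s)

def tanh_grad(x):
    """Derivative of tanh: 1 - tanh(x)^2."""
    return 1.0 - math.tanh(x) ** 2

# Compare tanh with the shifted and rescaled sigmoid, and show both gradients.
for x in (-6.0, -1.0, 0.0, 1.0, 6.0):
    rescaled = 2.0 * sigmoid(2.0 * x) - 1.0
    print(f"x={x:+.1f}  tanh={math.tanh(x):+.4f}  "
          f"2*sigmoid(2x)-1={rescaled:+.4f}  "
          f"tanh'={tanh_grad(x):.4f}  sigmoid'={sigmoid_grad(x):.4f}")
```

Both gradients peak at x = 0 and shrink toward 0 for large |x|. The peak is larger for tanh than for sigmoid.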

Once you've seen to those, I suspect that what you'll worry about more is computational efficiency. tanh is expensive to compute on most architectures. If that's your worry, consider writing your own function, for example a look-up table with 2^10 pre-computed values over the range [-4, 4] and "rail" values (-1 and 1) outside that range.
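
A look-up table along those lines might be laid out as follows. This is only a sketch of the idea: the names (TANH_TABLE, fast_tanh) and the nearest-entry rounding are assumptions, not part of the suggestion above. In pure Python the built-in math.tanh will most likely be faster, so the speed benefit, if any, would only show up in a lower-level implementation.

```python
import math

TABLE_SIZE = 2 ** 10          # 1024 pre-computed entries
X_MIN, X_MAX = -4.0, 4.0      # range covered by the table
STEP = (X_MAX - X_MIN) / (TABLE_SIZE - 1)

# Pre-compute tanh at evenly spaced points across [X_MIN, X_MAX].
TANH_TABLE = [math.tanh(X_MIN + i * STEP) for i in range(TABLE_SIZE)]

def fast_tanh(x):
    """Approximate tanh(x) by nearest-entry table look-up."""
    # Outside the table range, return the "rail" values.
    if x <= X_MIN:
        return -1.0
    if x >= X_MAX:
        return 1.0
    # Map x to the index of the nearest pre-computed point.
    index = int(round((x - X_MIN) / STEP))
    return TANH_TABLE[index]

# Example: compare the approximation with the exact value.
for x in (-5.0, -0.5, 0.3, 2.0, 7.0):
    print(f"x={x:+.1f}  table={fast_tanh(x):+.4f}  exact={math.tanh(x):+.4f}")
```

With this spacing the nearest table point is at most STEP/2 (about 0.004) away from x, and because the slope of tanh never exceeds 1 the approximation error stays below that bound. Interpolating linearly between neighbouring entries would reduce the error further at the cost of a few extra operations.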

Thank you for your answer Prune. Your response and section 4.4 of the document suggested by bb01234 helped me a lot. Thanks — chayanquete, Nov 16 '16 at 20:03
