
I'm looking at the following code from this blog

It gives the option to use either the sigmoid or the tanh activation function.

The XOR test seems to work fine with the tanh function, yielding approximately (0, 1, 1, 0).

But when I switch to sigmoid I get the wrong output, approximately (0.5, 0.5, 0.5, 0.5).

I've tried this with another piece of code I found online and the exact same problem occurs.

It seems the only thing changing is the activation function (and its derivative). Does changing this require other changes, say in backpropagation?
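For reference, the two activation functions and their derivatives I'm switching between look roughly like this (my own paraphrase, not a copy of the blog's code):

```python
import numpy as np

# Both derivatives are written in terms of the activation's output y,
# which is how this kind of training code usually applies them.
def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def sigmoid_prime(y):        # y = sigmoid(x)
    return y * (1.0 - y)

def tanh(x):
    return np.tanh(x)

def tanh_prime(y):           # y = tanh(x)
    return 1.0 - y ** 2
```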

Thanks a lot for any help!

Greg Peckory
  • Have you tried to increase the size of the sigmoid-only network to verify that it is able to learn XOR? Try with something overkill like 2-10-1. The tanh output interval `[-1,1]` tends to fit XOR quicker in combination with a sigmoid output layer. Using sigmoid won't change the underlying backpropagation calculations. – jorgenkg Sep 07 '16 at 06:14
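A quick way to sanity-check that suggestion is with scikit-learn's `MLPClassifier` rather than the blog's code (so the snippet below is only illustrative):

```python
import numpy as np
from sklearn.neural_network import MLPClassifier

X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]])
y = np.array([0, 1, 1, 0])

# 2-10-1 network with logistic (sigmoid) hidden units, as suggested above.
clf = MLPClassifier(hidden_layer_sizes=(10,), activation='logistic',
                    solver='lbfgs', max_iter=5000, random_state=0)
clf.fit(X, y)
print(clf.predict(X))   # expected: [0 1 1 0]
```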

1 Answer


Looks like the model you're using doesn't train biases. The only difference between tanh and sigmoid is scaling and offset: `tanh(x) = 2*sigmoid(2*x) - 1`. The new scaling can be absorbed by the weights, but the network also needs to compensate for the new offset, and that is exactly what the biases do, so they have to be learned as well.
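For example, here is a minimal sketch of a sigmoid-only network that does train its biases and learns XOR (not the blog's code; all names and hyperparameters here are my own):

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

rng = np.random.default_rng(0)
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
T = np.array([[0], [1], [1], [0]], dtype=float)

# 2-4-1 network; the bias vectors b1, b2 are trained along with the weights.
W1 = rng.normal(size=(2, 4)); b1 = np.zeros(4)
W2 = rng.normal(size=(4, 1)); b2 = np.zeros(1)
lr = 0.5

for _ in range(20000):
    # forward pass
    h = sigmoid(X @ W1 + b1)
    y = sigmoid(h @ W2 + b2)

    # backward pass (squared-error loss); note sigmoid'(x) = y * (1 - y)
    delta_out = (y - T) * y * (1.0 - y)
    delta_hid = (delta_out @ W2.T) * h * (1.0 - h)

    # gradient-descent updates, biases included
    W2 -= lr * (h.T @ delta_out); b2 -= lr * delta_out.sum(axis=0)
    W1 -= lr * (X.T @ delta_hid); b1 -= lr * delta_hid.sum(axis=0)

print(y.round(2).ravel())   # should approach [0, 1, 1, 0]
```

Dropping the `b1`/`b2` updates from this sketch is a quick way to check whether the missing biases are what pins your outputs near 0.5.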

Julien
  • Thanks for explaining that. Is there an example of a program (ideally in python for readability) that you can refer me to which trains biases? – Greg Peckory Sep 07 '16 at 16:36
  • By the way, what if I use True=0.9 and False=0.1 for XOR? Do I still need biases? – Greg Peckory Sep 07 '16 at 16:39
  • Biases can be treated as weights for an extra component always equal to `1`: if `x = (x1, x2, ..., xn)` is the input vector to a neuron, and `w = (w1, w2, ..., wn)` are the weights that you learn, you can learn a bias with the exact same framework by augmenting your inputs as `x = (x0=1, x1, x2, ..., xn)` and learning the augmented weights `w = (w0, w1, w2, ..., wn)`; `w0` will be your bias. – Julien Sep 08 '16 at 03:05
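In code, that augmentation trick amounts to something like this (a minimal sketch; the variable names are mine):

```python
import numpy as np

X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)

# Prepend a constant x0 = 1 to every input; the weight learned for that
# component plays the role of the bias.
X_aug = np.hstack([np.ones((X.shape[0], 1)), X])      # shape (4, 3)

w_aug = np.array([0.5, -1.0, 2.0])    # w_aug[0] is the bias w0
w, b = w_aug[1:], w_aug[0]

# The augmented dot product reproduces the usual weights-plus-bias pre-activation.
print(np.allclose(X_aug @ w_aug, X @ w + b))          # True
```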