
Typically, a simple neural network to solve XOR has 2 inputs, 2 neurons in the hidden layer, and 1 neuron in the output layer.

However, the following example implementation has 2 output neurons, which I don't understand:

https://github.com/deeplearning4j/dl4j-examples/blob/master/dl4j-examples/src/main/java/org/deeplearning4j/examples/feedforward/xor/XorExample.java

Why did the author put 2 output neurons in there?

Edit: The author of the example noted that he is using 4 neurons in the hidden layer and 2 neurons in the output layer. But I still don't understand why: why a shape of {4,2} instead of {2,1}?

Dee
    He explained it in the comments at the top. (it's another question how good this explanation is in regards to formal math) – sascha May 11 '17 at 23:24
    For all future questions, JFYI, there's an active dev community on the Gitter channel: https://gitter.im/deeplearning4j/deeplearning4j – racknuf May 12 '17 at 23:55
  • Yeah, that chat room is interesting; someone there helped me figure out how to match the activation function with the loss function – Dee May 13 '17 at 00:10

3 Answers


This is called one-hot encoding. The idea is that you have one neuron per class; each neuron outputs the probability of that class.
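A minimal sketch of the idea (hypothetical, plain Java, not the dl4j-examples code): each XOR target (0 or 1) becomes a 2-element one-hot vector, with a 1 at the index of its class.

```java
// Hypothetical sketch: one-hot encoding the XOR truth table.
// Class 0 = false, class 1 = true, so target 1 becomes [0, 1].
public class OneHotXor {
    static int[] oneHot(int classIndex, int numClasses) {
        int[] v = new int[numClasses];
        v[classIndex] = 1;
        return v;
    }

    public static void main(String[] args) {
        int[][] inputs = {{0, 0}, {0, 1}, {1, 0}, {1, 1}};
        for (int[] in : inputs) {
            int xor = in[0] ^ in[1];       // the XOR target (0 or 1)
            int[] label = oneHot(xor, 2);  // e.g. 1 -> [0, 1]
            System.out.printf("%d XOR %d -> [%d, %d]%n",
                    in[0], in[1], label[0], label[1]);
        }
    }
}
```

With 2 classes this is exactly why the example's output layer has 2 neurons: one per class.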

I don't know why he uses 4 hidden neurons. 2 should be enough (if I remember correctly).

Martin Thoma
  • I find one-hot encoding good only when we don't have too many classes to classify – Dee May 12 '17 at 07:34
  • Yeah, I don't know why he's using 4 hidden neurons either; I changed it to 2 and it still works perfectly! – Dee May 12 '17 at 07:38
  • Because there would be too many one-hot neurons in the output layer; I don't know, but what if we need to classify that many? – Dee May 12 '17 at 10:12
    @johnlowvale I've never seen anything else. The largest number of classes I'm aware of is 1000 for ImageNet. One-hot encoding is no problem there. – Martin Thoma May 12 '17 at 10:36
  • It caused me some confusion too. The first two numbers are the position in a 4-by-2 table. The third number (off on its own) is the value to be placed at that coordinate. – Adam Gerard Mar 30 '19 at 05:41

The author uses the Evaluation class at the end (to report how often the network gives the correct result). This class needs one neuron per class to work correctly, i.e. one output neuron for true and one for false.
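In essence, evaluating one-hot outputs boils down to comparing the index of the largest network output (argmax) with the index of the 1 in the label. A plain-Java sketch of that idea (hypothetical, not the DL4J Evaluation implementation):

```java
// Hypothetical sketch: score a one-hot prediction by comparing
// the argmax of the network output with the argmax of the label.
public class ArgmaxEval {
    static int argmax(double[] v) {
        int best = 0;
        for (int i = 1; i < v.length; i++) {
            if (v[i] > v[best]) best = i;
        }
        return best;
    }

    public static void main(String[] args) {
        double[] networkOutput = {0.1, 0.9}; // e.g. softmax over {false, true}
        double[] oneHotLabel   = {0.0, 1.0}; // the true class
        boolean correct = argmax(networkOutput) == argmax(oneHotLabel);
        System.out.println("correct = " + correct);
    }
}
```

With only a single output neuron there would be no per-class index to compare, which is why the class-based evaluation wants one neuron per class.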

Shaido
    I asked someone on the deeplearning4j chat room; he said it's because of the softmax activation function at the output layer – Dee May 12 '17 at 07:35

It might be helpful to think of it like this:

Training Set            Label Set

        0 | 1                   0 | 1
   0 |  0 | 0              0 |  0 | 1
   1 |  1 | 0              1 |  1 | 0
   2 |  0 | 1              2 |  1 | 0
   3 |  1 | 1              3 |  0 | 1

So, reading row by row, the Training Set holds the inputs [0,0], [1,0], [0,1], [1,1].

If you're using the two-column Label Set, the two columns correspond to true and false.

Thus, input [0,0] correctly maps to false, input [1,0] correctly maps to true, and so on.
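The mapping above can be checked in plain Java (a hypothetical illustration; following the table, label column 0 fires for true and column 1 for false):

```java
// Hypothetical sketch of the table above: verify that each one-hot
// label row agrees with the XOR of its input row.
public class XorTable {
    // Column 0 of the label fires for true, column 1 for false.
    static boolean labelMatches(int[] input, int[] label) {
        boolean xorIsTrue = (input[0] ^ input[1]) == 1;
        return (label[0] == 1) == xorIsTrue;
    }

    public static void main(String[] args) {
        int[][] trainingSet = {{0, 0}, {1, 0}, {0, 1}, {1, 1}};
        int[][] labelSet    = {{0, 1}, {1, 0}, {1, 0}, {0, 1}};
        for (int row = 0; row < trainingSet.length; row++) {
            System.out.printf("row %d agrees: %b%n",
                    row, labelMatches(trainingSet[row], labelSet[row]));
        }
    }
}
```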

A pretty good article that slightly modifies the original can be found here: https://medium.com/autonomous-agents/how-to-teach-logic-to-your-neuralnetworks-116215c71a49

Adam Gerard