
I'm working on implementing a backpropagation algorithm. Initially I worked on training my network to solve XOR, to verify that it works correctly before using it in my design. After reading this, I decided to train it to solve an AND gate first. I'm using sigmoid as the transfer function and MSE to calculate the total error. I tried different learning rates ranging between 0.01 and 0.5, and I trained the network several times, each time for a different number of iterations ranging from 100 to 1000. The minimum total error I got was 0.08. Is this an acceptable error?
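
For reference, by "sigmoid" and "MSE" I mean the standard definitions:

$$\sigma(x) = \frac{1}{1+e^{-x}}, \qquad \text{MSE} = \frac{1}{N}\sum_{i=1}^{N}\left(t_i - o_i\right)^2$$

where $t_i$ is the target and $o_i$ the network's output for pattern $i$.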

My second question: should I use a threshold function instead of sigmoid to solve the AND gate? If yes, what is a suitable threshold?

Thirdly, should I set a limit on the initial weights, for example between -1 and 1?

Thanks in advance.

EDIT 1

I think the output is weird. Here is the output after the first iteration:

Target: 0.0 Output: 0.5314680723170211
Target: 0.0 Output: 0.7098671414869142
Target: 0.0 Output: 0.625565435381579
Target: 1.0 Output: 0.7827456263767251

and the output after the 400th iteration:

Target: 0.0 Output: 0.2826892072063843
Target: 0.0 Output: 0.4596476713717095
Target: 0.0 Output: 0.3675222634971935
Target: 1.0 Output: 0.5563197014845178

EDIT 2

Here is the part of my code that does the backpropagation:

    // 1) Compute each output neuron's error term (delta):
    //    delta = output * (1 - output) * (target - output),
    //    where outErr[i] already holds (target - output) for neuron i.
    for (int i = 0; i < currentLayer.getSize(); i++)
    {
        temp = currentLayer.getAt(i);
        err = temp.getOutput() * (1 - temp.getOutput()) * outErr[i];
        temp.setError(roundTwoDecimals(err)); // errors are rounded to two decimal places
    }

    // 2) Update the incoming weights and the bias of each output neuron.
    for (int i = 0; i < currentLayer.getSize(); i++)
    {
        temp = currentLayer.getAt(i); // get a neuron at the output layer
        // update the connections
        for (int j = 0; j < temp.getInConnections().size(); j++)
        {
            inputCon = temp.getInputConnectionAt(j);

            // w_new = w_old + delta * input * learningRate
            newW = inputCon.getWeight() + inputCon.getDst().getError() * inputCon.getInput() * this.learningRate;

            inputCon.setWeight(roundTwoDecimals(newW)); // weights are rounded to two decimal places
        }
        // now update the bias
        temp.setBias(temp.getBias() + (this.learningRate * temp.getError()));
    }
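
In equation form, these updates are the standard delta rule for a sigmoid output layer trained with MSE:

$$\delta_k = o_k(1-o_k)(t_k - o_k), \qquad w_{jk} \leftarrow w_{jk} + \eta\,\delta_k\,x_j, \qquad b_k \leftarrow b_k + \eta\,\delta_k$$

where $\eta$ is the learning rate, $x_j$ the input on connection $j$, $t_k$ the target, and $o_k$ the output of neuron $k$.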

1 Answer


0.08 is pretty low, but AND should be perfectly solvable, meaning an error of 0 should be possible. Your iterations and learning rates seem reasonable too. What is the topology of your network? Are you including a bias node?

Standard backpropagation doesn't play nicely with threshold activation functions, because a hard threshold isn't differentiable; that's the reason they aren't usually used with it. If you want to try one as a debugging test, you could use the Perceptron training rule and a threshold of .5 (which is pretty standard), as in the sketch below.
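
Here is a minimal standalone sketch of that debugging test (the class and variable names are illustrative, not from your code):

    // Minimal sketch: Perceptron training rule for AND with a hard threshold at 0.5.
    public class PerceptronAnd {
        public static void main(String[] args) {
            double[][] inputs = {{0, 0}, {0, 1}, {1, 0}, {1, 1}};
            double[] targets = {0, 0, 0, 1};
            double[] w = {0.1, -0.1}; // small initial weights
            double bias = 0.0;
            double eta = 0.1;         // learning rate

            for (int epoch = 0; epoch < 100; epoch++) {
                for (int p = 0; p < inputs.length; p++) {
                    double sum = bias + w[0] * inputs[p][0] + w[1] * inputs[p][1];
                    double output = (sum >= 0.5) ? 1.0 : 0.0; // hard threshold at 0.5
                    double error = targets[p] - output;
                    // Perceptron rule: w += eta * (target - output) * input
                    w[0] += eta * error * inputs[p][0];
                    w[1] += eta * error * inputs[p][1];
                    bias += eta * error;
                }
            }
            System.out.printf("w = [%.2f, %.2f], bias = %.2f%n", w[0], w[1], bias);
        }
    }

Because AND is linearly separable, the Perceptron convergence theorem guarantees this reaches zero classification error.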

Yes, constraining initial weights to be between -1 and 1 is probably a good idea. For simple logic tasks, people usually don't allow weights to go outside of that range at all, although in principle I don't think it should be a problem.
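
For example, a uniform draw from [-1, 1] might look like this (a sketch; rng is just an assumed java.util.Random instance):

    // nextDouble() returns a value in [0.0, 1.0); map it to [-1.0, 1.0)
    java.util.Random rng = new java.util.Random();
    double initialWeight = rng.nextDouble() * 2.0 - 1.0;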

  • No, including bias is a good idea! I just wanted to make sure that you were. I agree that it's a bit strange (although not entirely implausible) that the output for the last set of inputs is decreasing. Otherwise this all seems like roughly what you would expect - the output starts out pretty inaccurate but gradually moves in the correct direction. – seaotternerd Feb 24 '15 at 02:56
  • But for the last line of the output, the output should converge toward 1, yet instead it's decreasing! I have added the code responsible for doing the backpropagation; can you please see the edit? – Alaa Feb 24 '15 at 12:52
  • Over time, I definitely agree about the last output. But at the beginning of that example, there's a lot more pressure on the connection weights to be lowered than raised - three of the outputs are too high and only one is too low. So it's possible for performance on the last output to get worse at first. It should just eventually get better. I looked at your code. What is the outErr array, and how does it get filled in? Also, could you describe the shape of the neural net, just so I can make sure I'm picturing it right? I don't see anything obviously wrong with your code. – seaotternerd Feb 25 '15 at 16:36
  • The outErr array is filled with (target - output) for the output layer of the network. The neural network has 2 input neurons, 2 hidden neurons, and one output neuron. The bias is used as I said. I'm using this formula to update the bias: oldBias + learningRate * errorDerivative. Is this correct? – Alaa Feb 26 '15 at 17:56