This problem is more conceptual than code-specific, so the fact that this is written in JavaScript shouldn't matter much.
So I'm building a neural network, and I'm testing it by training it on a simple task: a logic gate (OR, XOR, or really any of them). I'm using plain per-sample gradient descent, with no batches, for the sake of simplicity; batches seem unnecessary for a task this small, and the less unnecessary code I have, the easier it is to debug. The training loop is sketched just below.
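Essentially it does this (a simplified sketch; ITERATIONS, net, and trainingData are placeholder names rather than my exact code):

for(var iter = 0; iter < ITERATIONS; iter ++){
    for(var d = 0; d < trainingData.length; d ++){
        //One gradient-descent step per sample, no batching
        net.backpropigate(trainingData[d]);
    }
}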
However, after many iterations the output always converges to the average of the training outputs. For example, given this training set (this one is XOR):
[0,0] = 0
[0,1] = 1
[1,0] = 1
[1,1] = 0
The outputs, no matter the inputs, always converge to around 0.5. If the training set is instead OR:
[0,0] = 0,
[0,1] = 1,
[1,0] = 1,
[1,1] = 1
The outputs always converge to around 0.75, the average of all the training outputs. This appears to hold for every combination of outputs I've tried.
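In code form, those two sets look like this, using the [inputs, expected-outputs] format described further down (the variable names here are just for illustration):

var xorSet = [[[0,0],[0]], [[0,1],[1]], [[1,0],[1]], [[1,1],[0]]];
var orSet  = [[[0,0],[0]], [[0,1],[1]], [[1,0],[1]], [[1,1],[1]]];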
It seems like this is happening because every sample with a target of 0 pulls the weights toward 0 and every sample with a target of 1 pulls them toward 1, so over time the network settles around the average.
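As a quick sanity check on that intuition: if the network collapses to some constant output $c$, the total squared error over $n$ training samples is minimized exactly at the average of the targets:

$$E(c) = \frac{1}{2}\sum_{i=1}^{n}(c - y_i)^2, \qquad \frac{dE}{dc} = \sum_{i=1}^{n}(c - y_i) = 0 \;\implies\; c = \frac{1}{n}\sum_{i=1}^{n} y_i$$

which is 0.5 for the XOR set and 0.75 for the OR set, exactly the values I'm seeing.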
Here's the backpropagation code (written in JavaScript):
this.backpropigate = function(data){
    //Set the inputs
    for(var i = 0; i < this.layers[0].length; i ++){
        if(i < data[0].length){
            this.layers[0][i].output = data[0][i];
        }
        else{
            this.layers[0][i].output = 0;
        }
    }
    this.feedForward(); //Rerun through the NN with the newly set outputs
    for(var i = this.layers.length-1; i >= 1; i --){
        for(var j = 0; j < this.layers[i].length; j ++){
            var ref = this.layers[i][j];
            //Calculate the gradient for each neuron
            if(i == this.layers.length-1){ //Output layer neurons
                var error = ref.output - data[1][j]; //Error
                ref.gradient = error * ref.output * (1 - ref.output);
            }
            else{ //Hidden layer neurons
                var gradSum = 0; //Sum of gradient*weight from the next layer
                for(var m = 0; m < this.layers[i+1].length; m ++){
                    var ref2 = this.layers[i+1][m];
                    gradSum += (ref2.gradient * ref2.weights[j]);
                }
                ref.gradient = gradSum * ref.output * (1 - ref.output);
            }
            //Update each of the weights based on the gradient
            for(var m = 0; m < ref.weights.length; m ++){
                //Find the corresponding neuron in the previous layer
                var ref2 = this.layers[i-1][m];
                ref.weights[m] -= LEARNING_RATE * ref2.output * ref.gradient;
            }
        }
    }
    this.feedForward();
};
Here, the NN is structured so that each Neuron is an object with inputs, weights, and an output that is computed from those inputs and weights. The Neurons are stored in a 2D 'layers' array, where the first dimension is the layer (the first layer is the inputs, the second is the hidden layer, and so on) and the second dimension is the list of Neuron objects inside that layer. The 'data' argument is given in the form [inputs, correct-outputs], so for example [[0,1],[1]].
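To make that concrete, each Neuron looks roughly like this (a simplified sketch, not my exact code; I'm assuming the standard sigmoid activation, which is what the output*(1-output) terms in backpropigate correspond to):

function Neuron(numInputs){
    this.weights = []; //One weight per neuron in the previous layer
    for(var i = 0; i < numInputs; i ++){
        this.weights[i] = Math.random()*2 - 1; //Random initial weights
    }
    this.output = 0;
}

function sigmoid(x){
    return 1/(1 + Math.exp(-x));
}

and feedForward sets each non-input neuron's output to sigmoid(sum of previous-layer outputs times the matching weights).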
Also, my LEARNING_RATE is 1 and my hidden layer has 2 Neurons.
I feel like there must be some conceptual issue with my backpropagation method, as I've tested the other parts of my code (like the feedForward part) and they work fine. I used various sources, though I mostly relied on the Wikipedia article on backpropagation and the equations it gives.
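Specifically, these are the equations from that article that the code above is implementing (for sigmoid activations and squared error); $\delta_j$ corresponds to gradient, $o_j$ to output, and $\eta$ to LEARNING_RATE:

$$\delta_j = (o_j - t_j)\,o_j(1 - o_j) \quad \text{(output layer)}$$
$$\delta_j = \Big(\sum_k \delta_k w_{kj}\Big)\,o_j(1 - o_j) \quad \text{(hidden layers)}$$
$$\Delta w_{ji} = -\eta\,\delta_j\,o_i$$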
I know my code may be confusing to read, though I've tried to make it as simple as possible to understand. Any help would be greatly appreciated.