Can someone tell me what is wrong with this back propagation implementation

Question

So I am trying to implement a backpropagation neural network in c#. And I've come across a hiccup. When training the network, all the outputs are either 0.49???... or 0.51???...

Here's my network class

namespace BackPropNetwork
{
public class Network 
{
    public double[][] Values { get; set; }
    public double[][] Deltas { get; set; }
    public double[][][] Weights { get; set; }
    public Network(params int[] size)
    {
        Values = new double[size.Length][];
        Weights = new double[size.Length][][];
        Deltas = new double[size.Length][];
        Random r = new Random();
        for(int i = 0; i < size.Length; i++)
        {
            Values[i] = new double[size[i]];
            Weights[i] = new double[size[i]][];
            Deltas[i] = new double[size[i]];
            if (i != size.Length - 1) {
                for (int j = 0; j < size[i]; j++)
                {
                    Weights[i][j] = new double[size[i + 1]];
                    for(int k= 0; k < size[i + 1]; k++)
                    {
                        Weights[i][j][k] = r.NextDouble() ;
                    }
                }
            }
        }
    }
    public double[] FeedThrough (double[] input)
    {
        if(input.Length!= Values[0].Length)
        {
            throw new InvalidOperationException();
        }
        Values[0] = input;
        for(int i = 0; i < Values.Length-1; i++)
        {
            for(int j = 0; j < Values[i + 1].Length; j++)
            {
                Values[i + 1][j] = Sigmoid(GetPassValue(i, j),false);
            }
        }
        return Values[Values.Length - 1];
    }
    double GetPassValue(int layer,int neuron)
    {
        double sum = 0;
        for(int i = 0; i < Values[layer].Length; i++)
        {
            sum += Values[layer][i] * Weights[layer][i][neuron];
        }
        return sum;
    }
    public double Sigmoid(double d, bool dir)
    {
        if (dir)
        {
            return d * (1 - d);
        }else
        {
            return 1 / (1 + Math.Exp(d));
        }
    }
    public void CorrectError(double[] error)
    {
        for(int i = Values.Length - 1; i >= 0; i--)
        {

            if (i !=Values.Length - 1)
            {
                error = new double[Values[i].Length];
                for(int j = 0; j < Values[i].Length; j++)
                {
                    error[j] = 0;
                    for(int k = 0; k < Values[i + 1].Length; k++)
                    {
                        error[j] += Weights[i][j][k] * Deltas[i + 1][k];
                    }
                }    
            }

            for(int j = 0; j < Values[i].Length; j++)
            {
                Deltas[i][j] = error[j] * Sigmoid(Values[i][j],true);
            }

        }

    }
    public void ApplyCorrection(double rate)
    {
        for(int i = 0; i < Values.Length-1; i++)
        {
            for(int j = 0; j < Values[i].Length; j++)
            {
                for(int k = 0; k < Values[i + 1].Length; k++)
                {
                    Weights[i][j][k] = rate * Deltas[i + 1][k] * Values[i][j];
                }
            }
        }
    }
}

}

and here's my tester class:

namespace BackPropagationTest
{
class Program
{
    static void Main(string[] args)
    {
        Network n = new Network(3, 5, 5, 1);
        double[][] input = new double[][] { new double[] { 1, 0, 1 }, new double[] { 1, 1, 1 }, new double[] { 0, 0, 0 }, new double[] {0, 1, 0 } };
        double[][] output = new double[][] { new double[] { 0 },new double[] { 1 }, new double[] { 0 }, new double[] { 0 } };
        for (int i = 0; i < 10; i++)
        {
            for(int j = 0; j < input.Length; j++)
            {
                var x = n.FeedThrough(input[j]);
                double[] error = new double[output[0].Length];
                for(int k= 0; k < x.Length; k++)
                {
                    error[k] = output[j][k] - x[k];
                }
                n.CorrectError(error);
                n.ApplyCorrection(0.01);
                for(int k = 0; k < x.Length; k++)
                {
                     Console.Write($"Expected: {output[j][k]} Got: {x[k]} ");
                }
                Console.WriteLine();


            }
            Console.WriteLine();

        }
    }
}

}

and here's my output:

Expected: 0 Got: 0.270673949003643
Expected: 1 Got: 0.500116517554687
Expected: 0 Got: 0.499609458404919
Expected: 0 Got: 0.50039031963377

Expected: 0 Got: 0.500390929619276
Expected: 1 Got: 0.500390929999612
Expected: 0 Got: 0.499609680732027
Expected: 0 Got: 0.500390319841144

Expected: 0 Got: 0.50039092961941
Expected: 1 Got: 0.500390929999612
Expected: 0 Got: 0.499609680732027
Expected: 0 Got: 0.500390319841144

Expected: 0 Got: 0.50039092961941
Expected: 1 Got: 0.500390929999612
Expected: 0 Got: 0.499609680732027
Expected: 0 Got: 0.500390319841144

And it goes on like this forever.

Edit 1:

I have made a change in the ApplyCorrection() function where I have replaced

 Weights[i][j][k] = rate * Deltas[i + 1][k] * Values[i][j];

with `

 Weights[i][j][k] += rate * Deltas[i + 1][k] * Values[i][j];

and now the weights seem to update. but I still question the correctness of this implementation. A.k.a still need help :)

Edit 2:

I was not summing the total error of the output layer rather backpropagating every sample error individually. Now I am, but the outputs are very confusing:

And I also changed the output pair from (0,1) to (-1, 1) in attempts to make the calculated error values greater. This is after 1000000 epochs at a learning rate of 0.1:

Expected: -1 Got: 0.999998429209274 Expected: 1 Got: 0.999997843901661 Expected: -1 Got: 0.687098308461306 Expected: -1 Got: 0.788960893508226 Expected: -1 Got: 0.999998429209274 Expected: -1 Got: 0.863022549216158 Expected: -1 Got: 0.788960893508226 Expected: -1 Got: 0.999998474717769

I actually though it could be because I'm setting the weights in the ApplyCorrection function. where it is Weights[i][j][k] = rate * Deltas[i + 1][k] * Values[i][j]; I think it should be Weights[i][j][k] += rate * Deltas[i + 1][k] * Values[i][j]; — Paso, Apr 12 '17 at 05:42
That seemed to make the outputs change, and made the network learn. but I'm still not sure of the correctness of my implementation. — Paso, Apr 12 '17 at 05:43

Mukul Varshney · Accepted Answer · 2017-04-12T08:28:51.433

0

Try playing with something like below and check the error is decreasing or still the same.

public double Sigmoid(double d, bool dir)
{
    if (dir)
    {
        return d * (1 - d);
    }else
    {
        if (d < -45.0) return 0.0;
        else if (d > 45.0) return 1.0;
        else return 1.0 / (1.0 + Math.Exp(-d));
    }
}

edited Apr 12 '17 at 08:28

answered Apr 12 '17 at 07:00

Mukul Varshney

3,131
1
12
19

If it's true it returns the derivative. And no I'm not. but thanks anyways – Paso Apr 12 '17 at 07:24
Ok I tried it and what's happening is the weights just stop changing – Paso Apr 12 '17 at 13:45
Oh nevermind it actually worked. thank you very much! – Paso Apr 12 '17 at 13:51
for actual output -1, was it giving -1 value as predicted output or still lies between 0 and 1 ? I would first play around with sigmoid function than try any alternate function for -1 to 1. Refer link "ftp://ftp.icsi.berkeley.edu/pub/ai/jagota/vol2_6.pdf". Other options would be assign next set of random values to start the network. – Mukul Varshney Apr 12 '17 at 13:52

Can someone tell me what is wrong with this back propagation implementation

1 Answers1