Perceptron learns to reproduce just one pattern all the time

Question

This is rather a weird problem.

I have backpropagation code that works perfectly, like this:

ANN BP

Now, when I do batch learning, I get wrong results even for a simple scalar function approximation.

After training, the network produces almost the same output for all input patterns.

So far I have:

Introduced bias weights
Tried with and without updating the input weights
Shuffled the patterns in batch learning
Tried updating after each pattern as well as accumulating the updates
Initialized the weights in several different ways
Double-checked the code 10 times
Normalized the accumulated updates by the number of patterns
Tried different numbers of layers and neurons
Tried different activation functions
Tried different learning rates
Tried different numbers of epochs, from 50 to 10000
Tried normalizing the data

I noticed that after a number of backpropagation steps on just one pattern, the network produces almost the same output for a large variety of inputs.

When I try to approximate a function, I always get just a line (or almost a line), like this:

ANN fit

Related question: "Neural Network Always Produces Same/Similar Outputs for Any Input". The suggestion there to add bias neurons didn't solve my problem.

I found a post that says:

When ANNs have trouble learning they often just learn to output the
average output values, regardless of the inputs. I don't know if this 
is the case or why it would be happening with such a simple NN.

which describes my situation closely enough. But how do I deal with it?

I am coming to the conclusion that what I am seeing is a legitimate outcome. For any network configuration, one can effectively "cut" all the connections below the output layer, for example by setting all hidden weights to near zero, or by setting the biases to extreme values so that the hidden layer saturates and its output no longer depends on the input. After that, the output layer is free to adjust itself so that it reproduces a fixed output regardless of the input. In batch learning, the gradients get averaged and the network simply reproduces the mean of the targets. The inputs do not play ANY role.
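The last claim can be checked in isolation. The following is a minimal Python sketch, not the code from the question: it assumes the hidden layer contributes nothing, so the whole network reduces to a single output bias `b`, and it assumes a mean squared error loss. The function name `fit_constant` and the target values are made up for illustration.

```python
def fit_constant(targets, learn_rate=0.1, epochs=500):
    """Batch gradient descent on a lone output bias b.

    Assumption: the hidden layer is 'cut off', so the network output
    is just b for every input, and the loss is mean squared error.
    """
    b = 0.0
    n = len(targets)
    for _ in range(epochs):
        # Gradient of (1 / 2n) * sum((b - t)^2) with respect to b
        # is the average of (b - t) over all targets.
        grad = sum(b - t for t in targets) / n
        b -= learn_rate * grad
    return b


targets = [0.0, 1.0, 4.0, 9.0]          # illustrative targets
b = fit_constant(targets)
mean_target = sum(targets) / len(targets)
print(b, mean_target)                    # the two values should agree closely
```

Under these assumptions the bias settles at the mean of the targets, which matches the flat-line fit described above.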

You need to divide the accumulated deltas by the number of patterns. – Thomas Jungblut, Apr 30 '15 at 17:50
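What this comment describes can be sketched as follows. This is a hedged Python illustration on a hypothetical one-input linear model `y = w * x + b`, not the network from the question: gradients are accumulated over the whole batch, divided by the number of patterns, and applied once per epoch. The name `batch_epoch` and the data are assumptions.

```python
def batch_epoch(w, b, patterns, learn_rate):
    """One epoch of batch learning on a toy model y = w * x + b."""
    grad_w, grad_b = 0.0, 0.0
    for x, t in patterns:
        err = (w * x + b) - t        # output error for this pattern
        grad_w += err * x            # accumulate, do not update yet
        grad_b += err
    n = len(patterns)
    # Apply the averaged update once, after seeing every pattern.
    w -= learn_rate * grad_w / n
    b -= learn_rate * grad_b / n
    return w, b


patterns = [(0.0, 1.0), (1.0, 3.0), (2.0, 5.0)]   # samples of t = 2x + 1
w, b = 0.0, 0.0
for _ in range(5000):                              # outer loop over epochs
    w, b = batch_epoch(w, b, patterns, learn_rate=0.1)
print(w, b)
```

Dividing by `n` keeps the effective step size independent of the batch size; without it, the learning rate would have to shrink as the number of patterns grows.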

Guillaume Chevalier · Answer 1 · 2015-05-03T20:16:19.527

1

My answer cannot be fully precise because you have not posted the contents of the functions perceptron(...) and backpropagation(...).

But my guess is that you train your network many times on ONE data item, then many times on the next one, inside a loop like for data in training_data, with the result that your network only remembers the last one. Instead, try training your network on every data item once, then repeat that many times (swap the order of your nested loops).

In other words, the for I = 1:number of patterns loop should be inside the backpropagation(...) function's loop, so this function should contain two loops.
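The effect of the loop order can be illustrated with a small Python sketch. It uses a hypothetical linear model `y = w * x + b` with per-pattern updates rather than the functions from the question, and the name `train` and the flag `epochs_outer` are invented for this example.

```python
def train(patterns, epochs, learn_rate, epochs_outer=True):
    """Per-pattern gradient descent on a toy model y = w * x + b."""
    w, b = 0.0, 0.0

    def step(x, t):
        nonlocal w, b
        err = (w * x + b) - t
        w -= learn_rate * err * x
        b -= learn_rate * err

    if epochs_outer:
        # Suggested order: visit every pattern once, repeat for many epochs.
        for _ in range(epochs):
            for x, t in patterns:
                step(x, t)
    else:
        # Suspected order: exhaust one pattern, then move to the next.
        for x, t in patterns:
            for _ in range(epochs):
                step(x, t)
    return w, b


patterns = [(0.0, 1.0), (1.0, 3.0), (2.0, 5.0)]   # samples of t = 2x + 1
w_good, b_good = train(patterns, 2000, 0.1, epochs_outer=True)
w_bad, b_bad = train(patterns, 2000, 0.1, epochs_outer=False)
print("epochs outside:", w_good, b_good)
print("patterns outside:", w_bad, b_bad)
```

With the epoch loop outside, this toy model recovers the line that generated the data. With the pattern loop outside, it ends up fitting the last pattern while missing the first one, which is the "remembers only the last one" behaviour described above.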

EXAMPLE (in C#):

Here are some parts of a backpropagation function, simplified. At each update of the weights and biases, the entire network is "propagated". The full code can be found at this URL: https://visualstudiomagazine.com/articles/2015/04/01/back-propagation-using-c.aspx

public double[] Train(double[][] trainData, int maxEpochs, double learnRate, double momentum)
{
    //...
    Shuffle(sequence); // visit each training data in random order
    for (int ii = 0; ii < trainData.Length; ++ii)
    {
        //...
        ComputeOutputs(xValues); // copy xValues in, compute outputs 
        //...
        // Find new weights and biases
        // Update weights and biases
        //...
    } // each training item
}

Maybe the only thing missing is that you need to enclose everything after this comment (in your batch learning code, for example) in a second for loop to do multiple epochs of learning:

%--------------------------------------------------------------------------
%% Get all updates

edited May 03 '15 at 20:16

answered May 02 '15 at 05:24

Guillaume Chevalier


You may also be suffering from overfitting; try reducing the number of neurons if my suggestion does not fix the problem. Why? In a normal, imperfect, real statistical scenario, the error function's result does not normally equal exactly zero (but your graph is not zoomed in). But this depends on how clean your training data is; it may fit the function perfectly. – Guillaume Chevalier May 02 '15 at 05:30
I guess that's not how batch learning works, right? Batch learning is meant to use the whole training set in an epoch and update by the sum of the updates. However, I did try that approach, updating the net after each data item. You can see my data-to-fit in the picture. Also, I'm going to post the code. I'm really interested in what the issue is. – Rubi Shnol May 02 '15 at 05:57
Meanwhile, I just wanted to note that that code is for one epoch only. It's repeated again for all data. – Rubi Shnol May 02 '15 at 06:10
I updated my answer. Maybe this will answer your "thresholds" question. I am not sure what you mean by threshold, but for each data learned, the error function must be recalculated from the input to the output to be derived. – Guillaume Chevalier May 02 '15 at 06:50
I have already tried the method where the weights are updated for every pattern, but it didn't help either. But notice, THIS IS NOT HOW BATCH LEARNING IS SUPPOSED TO WORK. Updates should be accumulated instead. If I don't use weights for the input layer, does it mean that they just stay at 1 all the time? But in that case the input neurons might be oversaturated. Also, I tried to initialize the input weights at small random values and did BP w/o input weight updates. No result. – Rubi Shnol May 03 '15 at 06:56
I see, I was wrong about batch learning. I edited my answer again. Maybe you just want to do this batch learning multiple times, "epochs". – Guillaume Chevalier May 03 '15 at 20:17
Thanks for the comment. Actually, I do enclose it in another for loop for multiple epochs. As you might have seen, the code was for only one epoch. In the test case, I do batch learning for several epochs. – Rubi Shnol May 03 '15 at 20:50
