
Hi everyone, I have created a neural network with 1600 inputs, one hidden layer with a varying number of neurons, and 24 output neurons. My code shows that the error decreases each epoch, but the output of the hidden layer is always 1. Because of this, the adjusted weights always produce the same result for my test data. I have tried different numbers of hidden neurons and different learning rates, and I also randomly initialize my initial weights. I use the sigmoid function as my activation function, since each output should be either 1 or 0. May I know what the main reason is that the output of the hidden layer is always 1, and how I should solve it?

The purpose of this neural network is to recognize 24 hand shapes for the alphabet; I am using raw intensity data in this first phase of the project. I have tried 30 hidden neurons, 100, and even 1000, but the output of the hidden layer is still always 1, and because of this all of the outcomes on the test data are always similar. I have added the code for my network below. Thanks.

g =  inline('logsig(x)');
[row, col] = size(input);
numofInputNeurons = col;

weight_input_hidden = rand(numofInputNeurons, numofFirstHiddenNeurons);
weight_hidden_output = rand(numofFirstHiddenNeurons, numofOutputNeurons);

epochs = 0;    
errorMatrix = [];

while(true)
    if(totalEpochs > 0 && epochs >= totalEpochs)
        break;
    end
    totalError = 0;
    epochs = epochs + 1;
    for i = 1:row
        targetRow = zeros(1, numofOutputNeurons);
        targetRow(1, target(i)) = 1;

        hidden_output = g(input(1, 1:end)*weight_input_hidden);
        final_output = g(hidden_output*weight_hidden_output);

        error = abs(targetRow - final_output);
        error = sum(error);
        totalError = totalError + error;

        if(error ~= 0)
             delta_final_output = learningRate * (targetRow - final_output) .* final_output .* (1 - final_output);
             delta_hidden_output = learningRate * (hidden_output) .* (1-hidden_output) .* (delta_final_output * weight_hidden_output');

            for m = 1:numofFirstHiddenNeurons
                for n = 1:numofOutputNeurons
                    current_changes = delta_final_output(1, n) * hidden_output(1, m);
                    weight_hidden_output(m, n) = weight_hidden_output(m, n) + current_changes; 
                end
            end

            for m = 1:numofInputNeurons
                for n = 1:numofFirstHiddenNeurons
                    current_changes = delta_hidden_output(1, n) * input(1, m);
                    weight_input_hidden(m, n) = weight_input_hidden(m, n) + current_changes;       
                end
            end
        end
    end

    totalError = totalError / (row);
    errorMatrix(end + 1) =  totalError;

    if(errorThreshold > 0 && totalEpochs == 0 && totalError < errorThreshold)
            break;
    end

 end
  • Please add a [mcve] detailing your code. Explain why it does not work and what you want it to do. – Adriaan Sep 02 '15 at 11:50
  • The main reason you would get a result like "outputs of hidden layer is always 1" is a bug in your code. To get a better analysis of what might be causing the bug, you need to show your code. – Neil Slater Sep 02 '15 at 12:38
  • Hi, thanks for the reply. I have added the code to my post. – Tan K.Seang Sep 02 '15 at 14:45
  • As @NeilSlater said, often a bug. Are you trying to make the network output values > 1? – jorgenkg Sep 02 '15 at 19:21

1 Answer

I see a few obvious errors that need fixing in your code:

1) You have no negative weights when initialising. This is likely to get the network stuck. The weight initialisation should be something like:

weight_input_hidden = 0.2 * rand(numofInputNeurons, numofFirstHiddenNeurons) - 0.1;

2) You have not implemented bias. That will severely limit the ability of the network to learn. You should go back to your notes and figure that out. It is usually implemented as an extra column of 1's appended to the input and activation vectors/matrices before determining the activations of each layer, with a matching extra row of weights in each weight matrix (given the input * weights orientation used in your code). A rough sketch is shown below.
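For illustration only, here is a minimal sketch of one common way to add the bias units, reusing your variable names (input_with_bias and hidden_with_bias are just names made up for this example):

% Sketch only: each weight matrix gains one extra row to match the appended constant 1
weight_input_hidden  = 0.2 * rand(numofInputNeurons + 1, numofFirstHiddenNeurons) - 0.1;
weight_hidden_output = 0.2 * rand(numofFirstHiddenNeurons + 1, numofOutputNeurons) - 0.1;

% Forward pass for one training example (row i), appending the bias term to each layer's activations
input_with_bias  = [input(i, :), 1];                      % 1 x (numofInputNeurons + 1)
hidden_output    = g(input_with_bias * weight_input_hidden);
hidden_with_bias = [hidden_output, 1];                    % 1 x (numofFirstHiddenNeurons + 1)
final_output     = g(hidden_with_bias * weight_hidden_output);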

3) Your delta for output layer is wrong. This line

delta_final_output = learningRate * (targetRow - final_output) .* final_output .* (1 - final_output);

. . . is not the delta for the output layer activations. It has some extra unwanted factors.

The correct delta for logloss objective function and sigmoid activation in output layer would be:

delta_final_output = (final_output - targetRow);

There are other possibilities, depending on your objective function, which is not shown. Your original code is close to correct for mean squared error, and would probably still work if you changed the sign and removed the factor of learningRate (see the sketch below).
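As a rough sketch, assuming a mean squared error objective with sigmoid outputs, that variant would be:

% Assumption: mean squared error objective; same as the original line,
% but with the sign flipped and the learningRate factor removed
delta_final_output = (final_output - targetRow) .* final_output .* (1 - final_output);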

4) Your delta for hidden layer is wrong. This line:

delta_hidden_output = learningRate * (hidden_output) .* (1-hidden_output) .* (delta_final_output * weight_hidden_output');

. . . is not the delta for the hidden layer activations. You have multiplied by the learningRate for some reason; combined with the other delta, that means you have a factor of learningRate squared.

The correct delta would be:

delta_hidden_output = (hidden_output) .* (1-hidden_output) .* (delta_final_output * weight_hidden_output');

5) Your weight update steps need adjusting to match the fixes in (3) and (4). This line:

current_changes = delta_final_output(1, n) * hidden_output(1, m);

would need to be adjusted to get the correct sign and learning rate multiplier:

current_changes = -learningRate * delta_final_output(1, n) * hidden_output(1, m);
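The corresponding line in the input-to-hidden loop would presumably need the same treatment, for example:

% Same sign and learning rate fix applied to the input-to-hidden update
% (indexing left as in the question's code)
current_changes = -learningRate * delta_hidden_output(1, n) * input(1, m);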

That's 5 bugs from looking through the code; I may have missed some, but I think that's more than enough for now.

Neil Slater
  • Hello, thanks for the reply, it saved me. May I know whether it is normal for my neural network's error to decrease and then increase over many epochs? Thanks – Tan K.Seang Sep 03 '15 at 04:41
  • @TanK.Seang: In basic gradient descent approaches then the training error value may go up and down. A good pattern is that the error should go down quickly to start, then may start to vary (sometimes up, sometimes down) but still on average get lower over many epochs. If the value starts to rise continuously then you may have a problem such as learning rate is too high, or maybe a bug. – Neil Slater Sep 03 '15 at 06:41
  • Thanks again. May I know whether it is also possible that my training data is not good enough to be classified? My recognition project is to recognize 24 static ASL signs from images, and some shapes might be similar, so that is my doubt. Thanks a lot – Tan K.Seang Sep 03 '15 at 07:15
  • Because I'm only using raw pixel values, I'm thinking that it might be due to my training data. I have tried one hidden layer and two hidden layers, and the same situation happens: the error value keeps increasing and only drops some of the time. – Tan K.Seang Sep 03 '15 at 07:27
  • You could try to normalise your training data. Take away the mean value of all pixels, and divide by the standard deviation (based on values in your training data). Also, very soon you will want to start splitting your training data, so you get a better measure of how the network is behaving in general, not just on the training set. Are you following a course? These things are usually explained as you go. – Neil Slater Sep 03 '15 at 07:43
  • Ya, I actually tried both with and without normalization on my training data set, but it produces the same result: the error decreases at first, then starts to increase at some point; sometimes it decreases again, but soon it increases once more. Yes, I'm taking a course, but some practical issues confuse me. Thanks for your reply. – Tan K.Seang Sep 03 '15 at 08:06
  • Here's an answer I wrote a while ago, that includes a Matlab implementation for comparison: http://stackoverflow.com/questions/27038302/back-propagation-algorithm-error-computation/27039952#27039952 – Neil Slater Sep 03 '15 at 08:15