Before I begin, I'd just like to preface this by saying that I only started coding in October, so excuse me if it's a little bit clumsy.
I've been trying to build an MLP for a project I'm working on. I have the hidden layer (sigmoid) and the output layer (softmax), and both seem to be working properly. However, when I run the back-propagation training, the error initially decreases and then alternates between two different values.
See the image below: a graph of epoch (x-axis) against error (y-axis).
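For reference, this is roughly what my forward pass looks like for a single sample; the names here (x, W1, b1, W2, b2, a1, z2) are just placeholders to show the structure, not my exact code:

% rough sketch of the forward pass for one input column vector x
a1 = 1 ./ (1 + exp(-(W1*x + b1)));   % sigmoid hidden layer
z2 = W2*a1 + b2;                     % pre-activation of the output layer
MLPout = exp(z2) ./ sum(exp(z2));    % softmax over the output units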
I have tried multiple learning rates, different numbers of epochs, and different random initial weights (everything I can think of), but I keep getting the same problem. I have also normalised the data and the targets to values between 0 and 1.
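The normalisation is just min-max scaling, roughly like this (data and targ are placeholder names, arranged samples x features; this relies on implicit expansion, so R2016b or later):

% scale every column of the inputs and the targets to the [0, 1] range
data = (data - min(data)) ./ (max(data) - min(data));
targ = (targ - min(targ)) ./ (max(targ) - min(targ));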
After lowering the learning rate considerably, I get a smoother graph, but there still isn't much error reduction after the first few epochs. For example, when training for 10, 50, or 100 epochs there is very little reduction after 4-5 epochs. Have I hit a local minimum, and can I improve on this?
Can anyone shed some light on why this is happening and suggest some code changes that could resolve the problem? I would really appreciate it.
I have enclosed the code I use for the back-propagation step.
function [deltaW1, deltaW2, error] = BackProp(input, lr, MLPout, weightsL2, targ, outunits, outofhid)
%BackProp  Returns the weight updates for layer 1 and layer 2.
%   The updates are added to the previous weights after each pass.

%% Derivative of the sigmoid at the hidden-layer outputs
DerivativeS = outofhid.*(ones(size(outofhid)) - outofhid);

%% Output-layer error (target minus network output) for every output unit and sample
error = zeros(10, length(MLPout));
for y = 1:length(outunits)
    for j = 1:length(MLPout)
        error(y,j) = targ(j) - MLPout(y,j);
    end
end

% Weight update for layer 2 (hidden layer -> output layer)
deltaW2 = lr.*(error*outofhid');

% Weight update for layer 1 (input layer -> hidden layer):
% back-propagate the error through weightsL2, scale by the sigmoid
% derivative, then correlate with the inputs
deltaW1 = lr*(((error'*weightsL2').*DerivativeS')'*input);
deltaW1 = deltaW1';
end
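For context, this is roughly how I call the function each epoch and apply the updates; ForwardPass, numEpochs and errorHistory are placeholder names standing in for my actual training script:

for epoch = 1:numEpochs
    % placeholder for my forward pass (sigmoid hidden layer, softmax output)
    [outofhid, MLPout] = ForwardPass(input, weightsL1, weightsL2);
    [deltaW1, deltaW2, err] = BackProp(input, lr, MLPout, weightsL2, targ, outunits, outofhid);
    weightsL1 = weightsL1 + deltaW1;       % apply the layer-1 update
    weightsL2 = weightsL2 + deltaW2;       % apply the layer-2 update
    errorHistory(epoch) = sum(err(:).^2);  % what I plot against epoch
end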