
I'm writing this code to learn how training an ANN (multi-layer back-propagation) works, but the learning result is very bad: the output never gets close to 1. I know there is no guarantee that learning will succeed, but I want to know whether I have made an error in this code, or whether I can implement these steps with better performance.

Steps:

1- Load my dataset.

2- Choose 170 rows out of 225 for training and the remaining 55 rows for testing (randomly).

3- Create the weight matrices for the input and hidden layers randomly between 0 and 1.

4- Create the biases for the hidden and output layers randomly between -1 and 1.

5- Find the output for each row.

6- Find the error for each output, then for each hidden layer.

7- Update the weight and bias arrays at each iteration.

8- Compute the sum of squared errors (MSE) at each iteration.

The result for each output is always between 0.2 and 0.5, and it is never close to the desired output. What could be the error in my logic or in my code?

Notes: 1- I'm using a data set with 225 rows, 108 columns, and 25 result classes: 170 rows for training and 55 rows for testing.

2- 50,000 iterations

3- learning rate 0.3

4- momentum = 0.7

5- number of hidden-layer neurons = 90

Code:

%Initialize the weight matrices with random weights
V = rand(inlayer,hlayer); % Weight matrix from Input to Hidden between [0,1]
W = rand(hlayer,olayer);  % Weight matrix from Hidden to Output between [0,1]

%Initialize the theta matrices for hidden and output layers
Thetahidden = randi(1,hlayer);
Thetaoutput = randi(1,olayer);

for i=1:iteration

    for j=1:170 % depends on training data set

        %This for output between input-hidden
        for h=1:hlayer % depends on neuron number at hidden layer
            sum = 0;
            for k=1:108 % depends on column number
                sum = sum + (V(k,h)* trainingdata(j,k));
            end
            H(h) = sum + Thetahidden(h);
            Oh(h) = 1/(1+exp(-H(h)));
        end

        %This for output between hidden-output
        for o=1:olayer % depends on number of output layer
            sumO = 0;
            for hh=1:hlayer
                sumO = sumO+W(hh,o)*Oh(hh);
            end
            O(o)=sumO + Thetaoutput(o);
            OO(o) = 1/(1+exp(-O(o)));

            finaloutputforeachrow(j,o)= OO(o);
        end

        % Store real value of real output
        for r=1:170
            for o=1:olayer
                i=outputtrainingdata(r);
                if i == o
                    RO(r,o)=1;
                else
                    RO(r,o)=0;
                end
            end
        end

        sumerror = 0;

        % Compute Error ( output layer )
        for errorout=1:olayer
            lamdaout(errorout) = OO(errorout)*(1-OO(errorout))*(RO(j,errorout)-OO(errorout));
            errorrate = RO(j,errorout)-OO(errorout);
            sumerror = sumerror+(errorrate^2);
            FinalError(j,errorout) = errorrate;

            % Compute Error ( hidden layer )
            ersum=0;
            for errorh=1:hlayer
                ersum= lamdaout(errorout)*W(errorh,errorout);
                lamdahidden(errorh)= Oh(errorh)*(1-Oh(errorh))*ersum;
            end
            FinalSumError(j) = (1/2)*sumerror;
        end

        %update weights between input and hidden layer
        for h=1:hlayer
            for k=1:108
                deltaw(k,h) = learningrate*lamdahidden(h)*trainingdata(j,k);
                V(k,h) = (m*V(k,h)) + deltaw(k,h);
            end
        end

        %update weights/Theta between hidden and output layer
        for h=1:hlayer
            for outl=1:olayer
                %weight
                deltaw2(h,outl) = learningrate * lamdaout(outl)*Oh(h);
                W(h,outl)= (m*W(h,outl))+deltaw2(h,outl);
            end
        end

        for h=1:hlayer
            %Theta-Hidden
            deltaHiddenTh(h) = learningrate * lamdahidden(h);
            Thetahidden(h) = (m*Thetahidden(h)) + deltaHiddenTh(h);
        end

        for outl=1:olayer
            %Theta-Output
            deltaOutputTh(outl) = learningrate * lamdaout(outl);
            Thetaoutput(outl) = (m*Thetaoutput(outl)) + deltaOutputTh(outl);
        end

    end

end

1 Answer

There are lots of things that influence the performance (and ultimately the convergence) of neural networks. Apart from taking a closer look at your code and making sure the process is correctly coded, here are some ideas to play around with and think about:

  • The range used to initialize the weights should be related to the inputs the net is going to process; have a look here. Is there a reason to initialize in the range [0,1] when the inputs are in the range [-1,1]? (See the initialization sketch after this list.)

  • The momentum value can have a huge effect on the convergence. Try different values.

  • A nice practice for getting a feeling for the learning process is to plot the learning curve, i.e. the error (MSE in your case) against the training epoch. There are healthy patterns that can give you a hint about what's happening (see the plotting sketch after this list).

  • The fact that the initial weights are randomly set can lead (depending on the problem complexity) to different convergence points. It is helpful to get an idea of how big this difference can be: just train the net, then train it again a number of times, and plot the differences.
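To illustrate the first bullet, here is a minimal sketch of a symmetric, fan-in-scaled initialization. The 1/sqrt(fan-in) scaling is just one common heuristic, not the only option; the layer sizes and variable names are taken from the question:

    inlayer = 108; hlayer = 90; olayer = 25;           % sizes from the question

    V = (2*rand(inlayer,hlayer) - 1) / sqrt(inlayer);  % Input -> Hidden, symmetric around 0
    W = (2*rand(hlayer,olayer) - 1) / sqrt(hlayer);    % Hidden -> Output, symmetric around 0
    Thetahidden = 2*rand(1,hlayer) - 1;                % hidden biases in [-1,1], as step 4 describes
    Thetaoutput = 2*rand(1,olayer) - 1;                % output biases in [-1,1], as step 4 describes

Note also that randi(1,hlayer) in the question returns an hlayer-by-hlayer matrix of ones, so the biases do not actually start out random in [-1,1] as step 4 intends.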
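For the learning-curve suggestion, a minimal sketch, assuming the per-row errors are stored in FinalSumError inside the epoch loop exactly as in the question's code (MSEcurve is a hypothetical array added here only for plotting):

    MSEcurve = zeros(1, iteration);               % hypothetical: one value per epoch

    for i = 1:iteration
        % ... forward pass, error computation and weight updates
        %     for all 170 training rows, as in the question ...
        MSEcurve(i) = mean(FinalSumError(1:170)); % average error over this epoch
    end

    figure;
    plot(1:iteration, MSEcurve);
    xlabel('Epoch');
    ylabel('Mean squared error (training set)');
    title('Learning curve');

Running the same script several times, each time with fresh random weights, and overlaying the resulting curves is also a quick way to see how much the convergence point depends on the initialization, as suggested in the last bullet.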

There's still the number of hidden neurons, and all sorts of knobs you need to adjust before it works, but at first glance it looks like you have a very complex problem (108 columns and 25 output classes) with too small a data set to train on (225 rows). If that is the case, maybe you need more data... or try another type of model instead of a neural network.

Hope it helps, have fun!
