I have 200 training examples. I have run linear regression with 6 features on this dataset and it works fine, so I want to run a neural network on it too.

Problem: each time I run the program, the prediction (pred) is different, and vastly different!

input_layer_size  = 6;
hidden_layer_size = 3;   
num_labels = 1;

% Load Training Data

load('capitaldata.mat');

% example size

m = size(X, 1);

% initialize theta

initial_Theta1 = randInitializeWeights(input_layer_size, hidden_layer_size);
initial_Theta2 = randInitializeWeights(hidden_layer_size, num_labels);

% Unroll parameters

initial_nn_params = [initial_Theta1(:) ; initial_Theta2(:)];

% find optimal theta

options = optimset('MaxIter', 50);

%  set regularization parameter

lambda = 1;

% Create "short hand" for the cost function to be minimized

costFunction = @(p) nnCostFunctionLinear(p, input_layer_size, hidden_layer_size, num_labels, X, y, lambda);

% Now, costFunction is a function that takes in only one argument (the neural network parameters)

[nn_params, cost] = fmincg(costFunction, initial_nn_params, options);

% Obtain Theta1 and Theta2 back from nn_params

Theta1 = reshape(nn_params(1:hidden_layer_size * (input_layer_size + 1)), hidden_layer_size, (input_layer_size + 1));

Theta2 = reshape(nn_params((1 + (hidden_layer_size * (input_layer_size + 1))):end), num_labels, (hidden_layer_size + 1));

% test case
test = [18 279 86 59 23 16]; 

pred = predict(Theta1, Theta2, test);

display(pred);

Functions that are called by the above program:

1) randInitializeWeights.m

function W = randInitializeWeights(L_in, L_out)

% Randomly initialize the weights of a layer with L_in incoming
% connections and L_out outgoing connections (plus a bias column),
% keeping the values small and breaking symmetry.

epsilon_init = 0.12;

W = rand(L_out, 1 + L_in) * 2 * epsilon_init - epsilon_init;

end

2) nnCostFunctionLinear.m should be correct, since its test result matches. Let me know if you would like to see it too.

I suspect that the problem is the dataset size, the number of features, or the weight initialization.

Thank you in advance for your help!

  • I am not familiar with Octave, but this seems like a problem of random numbers. The results are bound to be different because the initial weights are randomized on each run. But you say that they are vastly different. Can you describe a bit more, and probably add the results here? – Vivek Kumar Apr 20 '17 at 06:20
  • Yes definitely: the predictions were 2.1687e+004, -2.4438e+004, -7226.6, etc., while the results should be around 31. I eyed the randomization too, but what I learned from Coursera's Machine Learning was that back propagation needs random theta rather than all-zero theta, otherwise it would be stuck in a saddle point... :( really confused! – Mathfish Apr 20 '17 at 06:32
  • I am confused. Are you saying that the predictions were vastly different from the actual data, or vastly different from each other in different runs of the algorithm? – Vivek Kumar Apr 20 '17 at 06:36
  • Both, unfortunately... It is sometimes positive and sometimes negative, but the training examples are never negative (they are between 10~200). – Mathfish Apr 20 '17 at 06:39
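As an aside on the all-zero-theta point raised in the comments above, here is a minimal Octave sketch (hypothetical values, not from the dataset) of why zero initialization is avoided: with an all-zero Theta1, every hidden unit computes the identical activation for any input, so their gradients stay identical and the units never differentiate. Random initialization breaks this symmetry.

```octave
% Hypothetical illustration of the symmetry problem with zero weights.
Theta1 = zeros(3, 7);              % 3 hidden units, 6 features + bias
x = [1; 18; 279; 86; 59; 23; 16];  % one example, with bias term prepended
z = Theta1 * x;                    % every entry of z is 0
a = 1 ./ (1 + exp(-z));            % every activation is 0.5 -- identical
disp(a);                           % the hidden units are indistinguishable
```

This is why the course material uses small random weights rather than zeros; it does not, by itself, explain wildly different predictions across runs.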

1 Answer

As a test, you can seed the random number generator with the same value each time, to get the same sequence of random numbers on every run. Search for "random seed" and the name of the software you are using to find how to set the seed for the random number generator.
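For example, in Octave (which the asker is using), a sketch of pinning the generator state at the top of the script might look like the following; the seed value 42 is arbitrary, and this is a debugging aid rather than a fix:

```octave
% Seed Octave's random number generator so every run draws the same
% sequence. randInitializeWeights calls rand(), so fixing the state
% makes the initial weights identical across runs.
rand('state', 42);

initial_Theta1 = randInitializeWeights(input_layer_size, hidden_layer_size);
initial_Theta2 = randInitializeWeights(hidden_layer_size, num_labels);
% With the seed fixed, repeated runs start from identical weights, so any
% remaining run-to-run variation in pred must come from somewhere else.
```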

James Phillips
  • Would you mind explaining a little bit? I am using Octave and did look up "random seed", but I don't see what it has to do with a potential bug in this program. Thank you in advance! – Mathfish Apr 21 '17 at 02:19
  • I found online that having the same random seed is equivalent to having a secret key -- did you mean that? I am wondering how that would help with debugging, though. Thanks! – Mathfish Apr 21 '17 at 02:25
  • I wrote "as a test", so to directly answer your question - no, I did not mean that. – James Phillips Apr 21 '17 at 09:41