I have been working on code that lets the user run a neural network analysis with a user-chosen number of layers and number of nodes per layer. While writing it, however, I ran into a problem. I decided to use scipy.optimize.minimize to minimize the cost/objective function of the neural network. I tried the 'TNC' method, but it gives unstable results, sometimes high accuracy and sometimes low. The call to scipy.optimize.minimize is given below:
from scipy.optimize import minimize
fmin = minimize(fun=computeCostandGrad, x0=theta_unroll,
                args=(input_size, hidden_size, n_hidden, n_labels, X, y, lmbda),
                method='TNC', jac=True)
The computeCostandGrad function returns both the cost function value and the gradients with respect to the parameters of the neural network (hence jac=True). I am sure both of these work properly, as I ran the code on known data and output values and also tested the back-propagation with gradient checking. The initial guess x0=theta_unroll is a numpy array of shape (1, 10285).
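For clarity, this is the kind of shape I mean (toy code just illustrating the array, not my actual parameters):

import numpy as np

theta_unroll = np.random.randn(1, 10285)   # row vector, shape (1, 10285)
print(theta_unroll.shape)                  # (1, 10285)
# If I read scipy's docs right, x0 is described as a 1-D array of
# shape (n,); ravel() would give that shape instead:
print(theta_unroll.ravel().shape)          # (10285,)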
Since the TNC method was giving unstable results, I tried changing the method to 'BFGS'. But then this error pops up:
ValueError: shapes (10285,10285) and (1,10285) not aligned: 10285 (dim 1) != 1 (dim 0)
Again, I know the computeCostandGrad function is correct, as explained above, and I had no such issue when I used the 'TNC' method. Why am I having this issue only when I run BFGS? And when I tried the 'Newton-CG' method, yet another error popped up:
ValueError: setting an array element with a sequence.
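To narrow things down, here is a toy sketch (a made-up quadratic, not my actual network) that seems to reproduce the same BFGS error whenever the jacobian comes back as a 2-D row vector; I suspect the Newton-CG error stems from the same shape issue, but I am not sure:

import numpy as np
from scipy.optimize import minimize

def cost_and_grad(x):
    # Toy objective f(x) = x.x with gradient 2x, returned as a
    # (1, n) row vector -- the same 2-D shape as my unrolled thetas.
    cost = np.dot(x, x)
    grad = 2 * x.reshape(1, -1)
    return cost, grad

x0 = np.ones(5)
fmin = minimize(fun=cost_and_grad, x0=x0, method='BFGS', jac=True)
# Expected failure, analogous to mine:
# ValueError: shapes (5,5) and (1,5) not aligned: 5 (dim 1) != 1 (dim 0)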
I am confused. Is the error caused by how I've used scipy.optimize.minimize, or is there an issue in my computeCostandGrad function? Please help me out! My full code is at the link below.
https://github.com/Gauthi-BP/Neural-network-Generalized/blob/master/Neural%20Network.ipynb