
I have been working on code that lets the user run a neural network analysis with any number of layers and any number of nodes per layer of their choice. While working on the code, however, I have stumbled on a problem. I decided to use scipy.optimize.minimize to minimize the cost/objective function of the neural network. I tried the 'TNC' method, but it seems to give unstable results, sometimes with high accuracy and sometimes with low accuracy. The call to scipy.optimize.minimize is given below:

from scipy.optimize import minimize

fmin = minimize(fun=computeCostandGrad, x0=theta_unroll,
                args=(input_size, hidden_size, n_hidden, n_labels, X, y, lmbda),
                method='TNC', jac=True)  # jac=True: fun returns (cost, gradient)

The computeCostandGrad function returns both the cost value and the derivatives (gradients) with respect to the parameters of the neural network. I am sure both of these work properly, as I ran the code on data with known outputs and also verified the back-propagation with gradient checking. The initial guess x0=theta_unroll is a NumPy array of shape (1, 10285).
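For reference, here is a minimal, self-contained sketch of the call pattern I am relying on (the toy quadratic objective and the sizes are placeholders, not my actual network code):

import numpy as np
from scipy.optimize import minimize

def toy_cost_and_grad(theta, scale):
    # Toy objective: 0.5 * scale * ||theta||^2, with gradient scale * theta.
    # With jac=True, minimize expects fun to return (cost, gradient).
    cost = 0.5 * scale * np.dot(theta, theta)
    grad = scale * theta
    return cost, grad

theta0 = np.ones(10)  # 1-D initial guess (placeholder size)
res = minimize(fun=toy_cost_and_grad, x0=theta0, args=(2.0,),
               method='TNC', jac=True)
print(res.success, res.fun)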

Since the 'TNC' method was giving rather unstable results, I tried changing the method to 'BFGS', but then this error pops up:

ValueError: shapes (10285,10285) and (1,10285) not aligned: 10285 (dim 1) != 1 (dim 0)
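Just to show I understand the message format in general, this kind of shape mismatch can be reproduced directly with np.dot (smaller placeholder sizes here, not my real arrays):

import numpy as np

A = np.ones((5, 5))   # stands in for a (10285, 10285) matrix
v = np.ones((1, 5))   # a (1, 5) row vector, like my (1, 10285) theta_unroll
np.dot(A, v)          # ValueError: shapes (5,5) and (1,5) not aligned: 5 (dim 1) != 1 (dim 0)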

Now, I know the computeCostandGrad function is correct, as I explained before, and I had no such issue when I used the 'TNC' method. Why am I having this issue only when I run 'BFGS'? When I tried 'Newton-CG', another issue popped up:

ValueError: setting an array element with a sequence.

I am confused. Is the error caused by how I have used scipy.optimize.minimize, or is there an issue in my computeCostandGrad function? Please help me out! My code can be seen at the link below.

https://github.com/Gauthi-BP/Neural-network-Generalized/blob/master/Neural%20Network.ipynb

  • Hi @Gautham and welcome to SO. Please provide the source code or import of `computeCostandGrad` in order to make this case reproducible. Did you take `computeCostandGrad` from [here](https://github.com/rksltnl/RNTN/blob/master/ComputeCostAndGrad.py#L6) ? – Panagiotis Simakis Jul 10 '20 at 07:50
  • The first error is the kind produced by `np.dot` (or `@`). Remember the matrix multiplication rule: "last dim of A with 2nd-to-last dim of B". The second occurs when you try to assign an array (or list) to an element of a numeric array. "Many" does not fit into the space for "one". – hpaulj Jul 10 '20 at 08:02
  • @PanagiotisSimakis I wrote the computeCostandGrad function on my own; the code can be seen at the link I provided in the question. – Gautham Jul 10 '20 at 11:49
  • @hpaulj Yes, I understand what the errors mean in general, but I am not able to figure out where this applies in my problem. What stumps me is that the code does work when I use the 'TNC' method, so why would I get this error when I only change the minimization method? – Gautham Jul 10 '20 at 11:50
