
I am trying to implement a 3-layer neural network with feedforward and backpropagation.
I have tested my cost function and it works fine. My gradient function also seems OK.

But when I try to optimize the variables using fmin_cg from scipy, I get this warning:

Warning: Desired error not necessarily achieved due to precision loss.
         Current function value: 4.643489
         Iterations: 1
         Function evaluations: 123
         Gradient evaluations: 110
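
Roughly, the call looks like this (a simplified sketch; cost_function, gradient, initial_theta and the args tuple stand in for the actual names in my notebook, which is linked in the comments):

from scipy.optimize import fmin_cg

theta_opt = fmin_cg(f=cost_function,    # scalar cost for a flattened theta
                    x0=initial_theta,   # flattened initial parameters
                    fprime=gradient,    # the gradient function shown below
                    args=(input_layer_size, hidden_layer_size, num_labels, x, y, lambda_),
                    maxiter=50)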

I searched about this, and someone said the problem is with the gradient. This is my code for the gradient:

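# (this is the body of my gradient function; theta_flatten comes in flattened from the
#  optimizer, and the layer sizes, x, y and lambda_ are the function's other arguments)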
theta_flatten = theta_flatten.reshape(1,-1)

# retrieve theta values from flattened theta 
theta_hidden = theta_flatten[0,0:((input_layer_size+1)*hidden_layer_size)]
theta_hidden = theta_hidden.reshape((input_layer_size+1),hidden_layer_size)

theta_output = theta_flatten[0,((input_layer_size+1)*hidden_layer_size):]
theta_output = theta_output.reshape(hidden_layer_size+1,num_labels)

# start of section 1
a1 = x # 5000x401
z2 = np.dot(a1,theta_hidden) # 5000x25
a2 = sigmoid(z2)

a2 = np.append(np.ones(shape=(a1.shape[0],1)),a2,axis = 1) # 5000x26 # adding column of 1's to a2
z3 = np.dot(a2,theta_output) # 5000x10
a3 = sigmoid(z3) # a3 = h(x) w.r.t theta

a3 = rotate_column(a3) # mapping 0 to "0" instead of 0 to "10"
# end of section 1

# start of section 2
delta3 = a3 - y # 5000x10
# end of section 2

# start of section 3
delta2 = (np.dot(delta3,theta_output.transpose()))[:,1:] # 5000x25 # drop delta2(0)
delta2 = delta2*sigmoid_gradient(z2)
# end of section 3

# start of section 4
DELTA2 = np.dot(a2.transpose(),delta3) # 26x10
DELTA1 = np.dot(a1.transpose(),delta2) # 401x25
# end of section 4

# start of section 5
theta_hidden_ = np.append(np.ones(shape=(theta_hidden.shape[0],1)),theta_hidden[:,1:],axis = 1) # regularization
theta_output_ = np.append(np.ones(shape=(theta_output.shape[0],1)),theta_output[:,1:],axis = 1) # regularization

D1 = DELTA1/a1.shape[0] + (theta_hidden_*lambda_)/a1.shape[0]
D2 = DELTA2/a1.shape[0] + (theta_output_*lambda_)/a1.shape[0]
# end of section 5

Dvec = np.append(D1,D2)

return Dvec

I looked at other people's implementations on GitHub, but nothing helped, and they implement it the same way I do.
Some comments:
Section one: feedforward implementation
Sections two to four: backpropagation from the output layer to the input layer
Section five: aggregating the gradients
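
For reference, the standard way to check a backprop gradient separately is to compare it against a numerical finite-difference estimate of the cost; a minimal sketch of such a check (cost_function, gradient, initial_theta and the args tuple are placeholders for the actual names in my notebook):

import numpy as np

def numerical_gradient(cost_function, theta, args, eps=1e-4):
    # central differences: (J(theta + eps*e_i) - J(theta - eps*e_i)) / (2*eps)
    num_grad = np.zeros_like(theta)
    for i in range(theta.size):
        step = np.zeros_like(theta)
        step[i] = eps
        num_grad[i] = (cost_function(theta + step, *args) -
                       cost_function(theta - step, *args)) / (2 * eps)
    return num_grad

# args     = (input_layer_size, hidden_layer_size, num_labels, x, y, lambda_)
# analytic = gradient(initial_theta, *args)
# numeric  = numerical_gradient(cost_function, initial_theta, args)
# print(np.max(np.abs(analytic - numeric)))  # should be very small (e.g. < 1e-7) if the gradient is right
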
Please help
Thank you

  • [Link](https://github.com/Nikronic/Coursera-Machine-Learning/blob/master/Week%205%20-%20Neural%20Networks%20Learning/Backpropagation.ipynb) for full code. – M. Doosti Lakhani Oct 02 '18 at 22:35
  • According to [this](https://stackoverflow.com/questions/24767191/scipy-is-not-optimizing-and-returns-desired-error-not-necessarily-achieved-due?rq=1) answer, that error can occur when your values go into the negative. Can you check your values as you go? – G. Anderson Oct 02 '18 at 22:41
  • I saw that post before asking this question. How can I separately check whether my gradient is working fine? – M. Doosti Lakhani Oct 03 '18 at 05:41
  • The best thing I can suggest is to either output your deltas/thetas/outputs to a file or print them to the console and see if/when they go negative. Unfortunately, I don't have an answer to post as I haven't done this, but hopefully that will get you on the right track to figuring out if/why the values are going wrong – G. Anderson Oct 03 '18 at 14:50

0 Answers