
I am coding an MLP network and I would like to implement the Levenberg-Marquardt algorithm. With Levenberg-Marquardt, the weight update after each iteration is given by this formula:

W(t+1) = W(t) - (H(t) + l(t)*I)^-1 * J

// W(t) is the weight matrix at iteration t
// H(t) is the Hessian of the cost function
// l(t) is the damping factor (it plays the role of a learning rate)
// J is the gradient of the cost function.

But I can't find an algorithm to calculate (or to get an acceptable estimate of) the Hessian. How can I do that?
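
To make the goal concrete, here is a minimal NumPy sketch of the update step as I understand it (the names `lm_update`, `gradient`, `hessian` and `damping` are placeholders I made up; the `hessian` argument is exactly the part I don't know how to compute):

```python
import numpy as np

def lm_update(W, gradient, hessian, damping):
    # One step of W(t+1) = W(t) - (H(t) + l(t)*I)^-1 * J,
    # with the weights flattened to a vector of length n.
    n = W.size
    H_damped = hessian + damping * np.eye(n)
    # Solve the linear system rather than forming the inverse explicitly
    delta = np.linalg.solve(H_damped, gradient)
    return W - delta.reshape(W.shape)
```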

  • It's not as much of an algorithm as it is calculus. The Hessian is just a fancy term for the second-order derivatives. Just as you need to compute the derivatives of the cost function for basic gradient descent, you need to compute the second-order derivatives for this algorithm. It gets very ugly for neural networks. I could find this: http://research.microsoft.com/pubs/67174/bishop-hessian-nc-92.pdf – perhaps you can find more. – IVlad Aug 14 '15 at 16:06
  • I know how to calculate the second-order derivatives for the last layer, but it gets much more complicated for the other layers. The second-order derivatives of a layer can be expressed in terms of those of the next layer, but I don't know how to do that. It's the same principle as the backpropagation of the gradient, but I would like to know how to implement it. Thanks – Arkhan Aug 14 '15 at 18:41
  • 1
  • I don't think any package in R has the Levenberg-Marquardt algorithm for training a neural net. Even the latest 'RSNNS' package has only Std Backpropagation, BackpropMomentum and Rprop algorithms for training. You will have to code the algorithm yourself. – Gaurav Aug 17 '15 at 06:51
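
Following up on IVlad's comment above: in practice, Levenberg-Marquardt is stated for a sum-of-squares cost E = 1/2 * sum(r_k^2), and the exact Hessian is almost always replaced by the Gauss-Newton approximation H ≈ J^T * J (with gradient J^T * r), where J here is the Jacobian of the residual vector r with respect to the weights (not the gradient, as in the question's notation). The second-derivative terms are simply dropped, which is what makes the method tractable. A minimal NumPy sketch, using a slow finite-difference Jacobian for clarity (`residuals_fn` is a hypothetical callback that maps a flat weight vector to the residuals over the training set):

```python
import numpy as np

def numerical_jacobian(residuals_fn, w, eps=1e-6):
    # Finite-difference Jacobian of the residual vector w.r.t. the
    # flat weight vector w. One forward pass per weight: slow, but a
    # useful sanity check for an analytic/backprop Jacobian.
    r0 = residuals_fn(w)
    J = np.zeros((r0.size, w.size))
    for i in range(w.size):
        w_pert = w.copy()
        w_pert[i] += eps
        J[:, i] = (residuals_fn(w_pert) - r0) / eps
    return J

def gauss_newton_terms(residuals_fn, w):
    # Gauss-Newton estimate of the Hessian, plus the exact gradient
    # of the cost E = 1/2 * sum(r**2).
    r = residuals_fn(w)
    J = numerical_jacobian(residuals_fn, w)
    H_approx = J.T @ J   # ignores the second-derivative terms
    grad = J.T @ r
    return H_approx, grad
```

With `H_approx` and `grad` in hand, the update from the question is `w -= np.linalg.solve(H_approx + l * np.eye(w.size), grad)`. For speed, the Jacobian would normally be computed with one backpropagation pass per network output instead of finite differences; Bishop's paper linked above covers the exact Hessian if that is really needed.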

0 Answers