
I am trying to implement the stochastic diagonal Levenberg-Marquardt method for a convolutional neural network in order to backpropagate and learn the weights. I am new to this and quite confused, so I have a few questions; I hope you can help me.

1) How can I calculate the second-order derivative at the output layer from the two outputs? For the first-order derivative I subtract the output from the desired output and multiply it by the derivative of the output. But how do I do that for the second derivative?

2) In the max-pooling layer of a convolutional neural network, I select the max value in a 2x2 window and multiply it by a weight. Do I have to pass the result through an activation function or not?

Can someone explain how to do this in OpenCV, or give a mathematical explanation or a reference that shows the mathematics? Thanks in advance.

Amir
khan
  • This is more related to math.stackexchange.com – Marco A. Mar 29 '14 at 11:09
  • Can you please explain what your function, data, and variables are, and what derivative (with respect to which variable) you want to calculate? Then I can help you. – 4pie0 Mar 29 '14 at 11:20

1 Answer


If you have already calculated the Jacobian matrix (the matrix of first-order partial derivatives), you can obtain an approximation of the Hessian (the matrix of second-order partial derivatives) as the product J^T*J, provided the residuals are small.

You can calculate the second derivatives from the two outputs, y and f(X), and the Jacobian as follows. With residuals r = y - f(X) and error E = 1/2 * sum_i r_i^2, the exact Hessian entries are

H_jk = sum_i [ (dr_i/dw_j) * (dr_i/dw_k) + r_i * d^2 r_i/(dw_j dw_k) ]

and dropping the second term, which is negligible when the residuals are small, leaves H ≈ J^T*J.

In other words, the Hessian approximation B is chosen to satisfy

B = J^T*J ≈ H,

which becomes exact as the residuals go to zero.
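As a minimal sketch (assuming the Jacobian has already been collected during backpropagation; the numbers, mu, and eta below are made-up placeholders), the approximation and the per-weight learning rates of stochastic diagonal Levenberg-Marquardt look like this in OpenCV/C++:

```cpp
#include <opencv2/core/core.hpp>
#include <iostream>

int main()
{
    // Hypothetical Jacobian: 4 residuals (samples) x 3 weights.
    cv::Mat J = (cv::Mat_<double>(4, 3) <<
         0.1, -0.2,  0.3,
         0.4,  0.0, -0.1,
        -0.3,  0.2,  0.5,
         0.0,  0.1, -0.4);

    // Gauss-Newton approximation of the Hessian: H ~= J^T * J (3x3).
    cv::Mat H = J.t() * J;

    // Stochastic diagonal Levenberg-Marquardt keeps only the diagonal
    // of H and adds a damping constant mu, so each weight k gets its
    // own learning rate eta / (H(k,k) + mu).
    double mu  = 0.1;   // damping constant (assumed hyperparameter)
    double eta = 0.01;  // global learning rate (assumed hyperparameter)
    for (int k = 0; k < H.rows; ++k)
        std::cout << "learning rate for weight " << k << ": "
                  << eta / (H.at<double>(k, k) + mu) << std::endl;
    return 0;
}
```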

You can find more about it in this paper: Ananth Ranganathan, The Levenberg-Marquardt Algorithm.

4pie0
  • Sorry for being late... Thanks for your reply. Do you mean the first-order derivative is called the Jacobian matrix? If yes, then yes, I do have that. I am using a sigmoid function at the output layer. So you mean the Hessian is the transpose multiplied by the original matrix? (My output is a 2x1 Mat, [0,1] or [1,0], so do you mean that 2x1 is my Hessian matrix?) – khan Mar 29 '14 at 14:05
  • @khan Yes, the Jacobian matrix is the matrix of first-order partial derivatives, and the [Hessian](http://en.wikipedia.org/wiki/Hessian_matrix) is the matrix of second-order partial derivatives. The Hessian is a square matrix. – 4pie0 Mar 29 '14 at 15:39
  • http://enpub.fulton.asu.edu/cseml/summer08/papers/cnn-appendix.pdf On page 2320, equation (27): is this the same as what I am asking? Should I follow these steps to find the second-order derivative: 1) Hessian matrix = J^T * J, where J is the first-order partial derivative of the output with respect to the input. This will give me the last part of the equation. Am I right? Thanks in advance, as I am pretty confused at this point – khan Mar 29 '14 at 15:50
  • Thanks a lot, so nice of you. I have to try it now... :) – khan Mar 29 '14 at 16:10
  • I thought I would accept your answer once my program runs and I can close my question, i.e., if there are no more questions from my side. I am doing the coding right now. If I accept your answer and then ask a question, will you still receive a notification, or will it stop any notifications to you? – khan Mar 29 '14 at 22:56
  • @khan I will; you can also add "@theNameOfUser" to ping a user with the name theNameOfUser – 4pie0 Mar 29 '14 at 23:28
  • @khan not a problem at all – 4pie0 Mar 30 '14 at 00:10
  • Hi, one question regarding the Hessian matrix: I have a 2x1 output, so when I calculate its Hessian using J^T*J, it gives me a 1x1 Hessian matrix, but if I use J*J^T it gives me 2x2. Do J*J^T and J^T*J have the same effect? (By the way, if I use the former it gives me an error, since there is no second diagonal value, so currently I am using the latter, which has a 2x2 output.) I am in doubt, though, and wondering whether the result will be correct. Thanks in advance – khan Apr 08 '14 at 03:42
  • The [Hessian](http://en.wikipedia.org/wiki/Hessian_matrix) of n variables is a square nxn matrix, and the [Jacobian](http://en.wikipedia.org/wiki/Jacobian_matrix_and_determinant) is 1xn (in the case of one function), so yes, you need to multiply 2x1 * 1x2 to get a 2x2 Hessian matrix; see the dimension check below. – 4pie0 Apr 08 '14 at 09:38
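To make those dimensions concrete, here is a minimal OpenCV/C++ check (the Jacobian values are made up for illustration):

```cpp
#include <opencv2/core/core.hpp>
#include <iostream>

int main()
{
    // Hypothetical 2x1 Jacobian column (two outputs, one weight).
    cv::Mat J = (cv::Mat_<double>(2, 1) << 0.5, -0.25);

    cv::Mat outer = J * J.t();  // 2x2 matrix
    cv::Mat inner = J.t() * J;  // 1x1 matrix

    std::cout << "J*J^T size: " << outer.rows << "x" << outer.cols << std::endl;
    std::cout << "J^T*J size: " << inner.rows << "x" << inner.cols << std::endl;
    return 0;
}
```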