
I'm trying to implement some AD algorithms myself, but I don't quite get the edge-pushing algorithm by Gower and Mello for computing sparse Hessians.

Does a new computational graph of the "original gradient" need to be generated? For example, should a graph for 2*x be generated when calculating x^2 in order to find the second derivative 2 (see the sketch below)? The paper states that the dotted arcs represent "non-linear interactions", so how exactly are the adjoints accumulated to form the second derivatives?
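To make the question concrete, here is a toy sketch (entirely my own illustration, not anything from the paper) of the "new graph" approach I have in mind: build an expression graph for x^2, differentiate it into a new graph representing 2x, then differentiate that graph again to obtain 2.

```python
class Var:
    def d(self): return Const(1.0)           # d x / d x = 1
    def ev(self, x): return x

class Const:
    def __init__(self, c): self.c = c
    def d(self): return Const(0.0)
    def ev(self, x): return self.c

class Add:
    def __init__(self, a, b): self.a, self.b = a, b
    def d(self): return Add(self.a.d(), self.b.d())
    def ev(self, x): return self.a.ev(x) + self.b.ev(x)

class Mul:
    def __init__(self, a, b): self.a, self.b = a, b
    def d(self):  # product rule: builds a brand-new graph
        return Add(Mul(self.a.d(), self.b), Mul(self.a, self.b.d()))
    def ev(self, x): return self.a.ev(x) * self.b.ev(x)

x = Var()
f = Mul(x, x)                 # graph for x^2
g = f.d()                     # a NEW graph: 1*x + x*1, i.e. 2x
h = g.d()                     # differentiating again gives the graph for 2
print(g.ev(3.0), h.ev(3.0))   # 6.0  2.0
```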

Also, if a new graph is needed, how does that differ from symbolic differentiation? Thanks!

thoughtpolice

1 Answer


No new graph needs to be generated. Instead, only the nonlinear edges need to be "added" to the original computational graph. I say "added" because, really, you only need to traverse the computational graph in reverse order, adding a nonlinear edge on the fly whenever you find a nonlinear interaction between the predecessors of a node. I'll upload some slides detailing this tomorrow to my webpage: https://gowerrobert.github.io/
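For concreteness, a minimal sketch of such a reverse sweep could look like the following. The tape representation and all the names (Tape, edge_push, W for the symmetric Hessian edge weights, bar for the first-order adjoints) are illustrative choices of mine, not code from the paper; the three steps inside the loop correspond to the pushing, creating, and adjoint rules, and the "creating" step is exactly the nonlinear edge added on the fly.

```python
import math
from collections import defaultdict

class Tape:
    """Computational graph as a tape. Each node stores its predecessors,
    its first partials d(phi_i)/d(v_j), and its nonzero second partials."""
    def __init__(self, n_inputs):
        self.nodes = [([], {}, {}) for _ in range(n_inputs)]
        self.vals = [0.0] * n_inputs

    def push(self, val, preds, d1, d2):
        self.nodes.append((preds, d1, d2))
        self.vals.append(val)
        return len(self.vals) - 1

def mul(t, i, j):
    """v = v_i * v_j (nonlinear: it carries a second partial)."""
    if i == j:  # v = v_i^2
        return t.push(t.vals[i] ** 2, [i], {i: 2.0 * t.vals[i]}, {(i, i): 2.0})
    return t.push(t.vals[i] * t.vals[j], [i, j],
                  {i: t.vals[j], j: t.vals[i]},
                  {(min(i, j), max(i, j)): 1.0})

def sin_(t, i):
    return t.push(math.sin(t.vals[i]), [i],
                  {i: math.cos(t.vals[i])}, {(i, i): -math.sin(t.vals[i])})

def edge_push(t, out):
    """One reverse sweep; returns first-order adjoints and Hessian edges.
    W[(a, b)] with a <= b holds the symmetric Hessian entry H[a][b]."""
    bar = defaultdict(float); bar[out] = 1.0
    W = defaultdict(float)
    key = lambda a, b: (a, b) if a <= b else (b, a)

    for i in range(len(t.nodes) - 1, -1, -1):
        preds, d1, d2 = t.nodes[i]
        if not preds:
            continue                     # input node: nothing to eliminate
        # pushing: re-route every edge incident to i onto i's predecessors
        for (a, b), w in [(e, W.pop(e)) for e in list(W) if i in e]:
            if a == b == i:              # loop edge at i
                ps = list(d1)
                for m, j in enumerate(ps):
                    for k in ps[m:]:
                        W[key(j, k)] += w * d1[j] * d1[k]
            else:
                p = b if a == i else a
                for j in d1:             # factor 2 when landing on a diagonal
                    W[key(j, p)] += (2.0 if j == p else 1.0) * w * d1[j]
        # creating: a nonlinear node adds edges between its predecessors,
        # weighted by its first-order adjoint -- the edge "added on the fly"
        for (j, k), s in d2.items():
            W[key(j, k)] += bar[i] * s
        # adjoint: the usual first-order reverse accumulation
        for j in d1:
            bar[j] += bar[i] * d1[j]
    return bar, W

# f(x) = sin(x) * x at x = 1:  f' = sin(1) + cos(1),  f'' = 2cos(1) - sin(1)
t = Tape(1); t.vals[0] = 1.0
y = mul(t, sin_(t, 0), 0)
grad, hess = edge_push(t, y)
print(grad[0], hess[(0, 0)])   # ~1.3818, ~0.2391
```

Note that the sweep never materializes a graph of the gradient: the only new objects are the symmetric edges in W, attached to the nodes of the original graph, which is what distinguishes this from the symbolic approach sketched in the question.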