code:
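import theano
import theano.tensor as T

a = T.vector()
b = T.vector()
loss = T.sum(a - b)
dy = T.grad(loss, a)    # first derivative of loss with respect to a
d2y = T.grad(loss, dy)  # this call raises the error shown below
f = theano.function([a, b], d2y)
print(f([.5, .5, .5], [1, 0, 1]))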
output:
theano.gradient.DisconnectedInputError: grad method was asked to compute
the gradientwith respect to a variable that is not part of the
computational graph of the cost, or is used only by a non-differentiable
operator: Elemwise{second}.0
How can a derivative of the graph not be part of the graph? Is this why scan is used to compute the Hessian?