
I have a loss function and a list of weight matrices, and I'm trying to compute second derivatives. Here's a code snippet:

    loss.backward(retain_graph=True)
    # first derivatives; create_graph=True keeps them differentiable
    grad_params_w = torch.autograd.grad(loss, weight_list, create_graph=True)

    # second derivative of each entry of layer a's gradient w.r.t. layer b's weights
    for i in range(layers[a]):
        for j in range(layers[a + 1]):
            second_der = torch.autograd.grad(grad_params_w[a][i, j], weight_list[b], create_graph=True)

The above code works (actually the second derivative is computed in a separate function, but I inlined it here for brevity). However, I am completely confused as to when to use create_graph and when to use retain_graph.

First: if I leave out the loss.backward(retain_graph=True) call, I get error A:

    RuntimeError: element 0 of variables tuple is volatile

If I keep it, but don't pass any graph argument to the first torch.autograd.grad call, I get error B:

    RuntimeError: Trying to backward through the graph a second time, but the buffers have already been freed. Specify retain_graph=True when calling backward the first time.

If I pass retain_graph=True on the first derivative instead, I get error A for the second derivative (i.e. inside the for loops), regardless of whether I add create_graph=True there or not.

Hence, only the snippet above works, but it feels odd that I need both the loss.backward call and all the create_graph=True arguments. Could somebody clarify this for me? Thanks a lot in advance!
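
In case the exact setup matters, here is a reduced, self-contained version of the pattern I'm running. A tiny two-layer network with made-up sizes stands in for my real model, and a and b are just example layer indices:

    import torch

    torch.manual_seed(0)

    # tiny stand-in model: two weight matrices, sizes are arbitrary
    layers = [3, 4, 2]
    w1 = torch.randn(layers[0], layers[1], requires_grad=True)
    w2 = torch.randn(layers[1], layers[2], requires_grad=True)
    weight_list = [w1, w2]

    x = torch.randn(5, layers[0])
    target = torch.randn(5, layers[2])
    out = torch.tanh(x @ w1) @ w2
    loss = ((out - target) ** 2).mean()

    # leaving this call out entirely gives me error A
    loss.backward(retain_graph=True)

    # first derivatives; without create_graph=True here I run into error B
    grad_params_w = torch.autograd.grad(loss, weight_list, create_graph=True)

    a, b = 0, 1  # differentiate layer a's gradient w.r.t. layer b's weights
    for i in range(layers[a]):
        for j in range(layers[a + 1]):
            second_der = torch.autograd.grad(
                grad_params_w[a][i, j], weight_list[b], create_graph=True
            )

This mirrors the snippet at the top; dropping retain_graph=True from the backward call or the create_graph=True arguments should reproduce the errors I described.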

  • Please post a [**Minimal**, Complete, and Verifiable example](https://stackoverflow.com/help/mcve). – Alex May 17 '18 at 16:55
