My goal is to understand how to use torch.autograd.grad.
The documentation describes it as follows:
torch.autograd.grad(outputs, inputs, grad_outputs=None, retain_graph=None, create_graph=False, only_inputs=True, allow_unused=False, is_grads_batched=False)
Computes and returns the sum of gradients of outputs with respect to the inputs.
So I made this derivative function:

import torch

def derivative(y, t):
    return torch.autograd.grad(y, t)
It returns the gradient of y with respect to t when y is a scalar, and the sum of the gradients of each component of y with respect to the vector t when y is a sequence of scalars:
def f(x):
    return (x**2).sum()

x = torch.tensor([1, 2, 3], dtype=torch.float, requires_grad=True)
derivative(f(x), x)
# returns (tensor([2., 4., 6.]),), i.e. 2*x
def f(x):
    return (x[0], x[1], x[2], x[0])

x = torch.tensor([1, 2, 3], dtype=torch.float, requires_grad=True)
derivative(f(x), x)
# returns (tensor([2., 1., 1.]),)
But it doesn't work when y is a non-scalar tensor, and I can't get the full Jacobian.
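For example, here is a minimal sketch of what fails for me (using the same derivative helper as above; I'm paraphrasing the error from memory, so the exact wording may differ between PyTorch versions):

def f(x):
    return x**2  # non-scalar (element-wise) output

x = torch.tensor([1, 2, 3], dtype=torch.float, requires_grad=True)
derivative(f(x), x)
# raises RuntimeError: grad can be implicitly created only for scalar outputs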
Could you show how to get gradients of tensor outputs with respect to tensor inputs, or how to get the Jacobian?
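For reference, the closest I have gotten is passing grad_outputs explicitly, but as far as I understand that only gives a vector-Jacobian product (a weighted sum of the Jacobian's rows), not the full Jacobian:

y = x**2
# grad_outputs plays the role of the vector v in v^T J, here a vector of ones
torch.autograd.grad(y, x, grad_outputs=torch.ones_like(y))
# returns (tensor([2., 4., 6.]),): the sum of the rows of the 3x3 Jacobian diag(2*x), not the Jacobian itself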