Using autograd.grad() as a parameter for a loss function (pytorch)

Question

I want to compute the gradient between two tensors in a net. The input X tensor is sent through a set of convolutional layers which give me back and output Y tensor.

I’m creating a new loss and I would like to know the MSE between gradient of norm(Y) w.r.t. each element of X. Here the code:

# Staring tensors
X = torch.rand(40, requires_grad=True)
Y = torch.rand(40, requires_grad=True)

# Define loss
loss_fn = nn.MSELoss()

#Make some calculations
V = Y*X+2

# Compute the norm
V_norm = V.norm()

# Computing gradient to calculate the loss
for i in range(len(V)):
    if i == 0:
        grad_tensor = torch.autograd.grad(outputs=V_norm, inputs=X[i])
    else:
        grad_tensor_ = torch.autograd.grad(outputs=V_norm, inputs=X[i])
        grad_tensor = torch.cat((grad_tensor, grad_tensor_), dim=0)

# Grund truth
gt = grad_tensor * 0 + 1

#Loss
loss_g = loss_fn(grad_tensor, gt)
print(loss_g)

Unfortunately, I’ve been making tests with torch.autograd.grad(), but I could not figure out how to do it. I get teh following error: RuntimeError: One of the differentiated Tensors appears to not have been used in the graph. Set allow_unused=True if this is the desired behavior.

Setting allow_unused=True gives me back None which is not an option. Not sure how to compute the loss between the gradients and the norm. Any idea about how to code this loss?

trsvchn · Accepted Answer · 2019-02-19T17:40:34.487

You are receiving mentioned error because you are trying to feed a slice of tensor X: X[i] to grad(), and it is going to be considered as a separate tensor, outside of your main computational graph. Not sure, but seems it returns new tensor while performing slicing.

But you don't need a for loop to compute gradients:

Code:

import torch
import torch.nn as nn

torch.manual_seed(42)

# Create some data.
X = torch.rand(40, requires_grad=True)
Y = torch.rand(40, requires_grad=True)

# Define loss.
loss_fn = nn.MSELoss()

# Do some computations.
V = Y * X + 2

# Compute the norm.
V_norm = V.norm()

print(f'V norm: {V_norm}')

# Computing gradient to calculate the loss
grad_tensor = torch.autograd.grad(outputs=V_norm, inputs=X)[0]  # [0] - Because grad returs tuple, so we need to unpack it
print(f'grad_tensor:\n {grad_tensor}')

# Grund truth
gt = grad_tensor * 0 + 1

loss_g = loss_fn(grad_tensor, gt)
print(f'loss_g: {loss_g}')

Output:

V norm: 14.54827

grad_tensor:
    tensor([0.1116, 0.0584, 0.1109, 0.1892, 0.1252, 0.0420, 0.1194, 0.1000, 0.1404,
            0.0272, 0.0007, 0.0460, 0.0168, 0.1575, 0.1097, 0.1120, 0.1168, 0.0771,
            0.1371, 0.0208, 0.0783, 0.0226, 0.0987, 0.0512, 0.0929, 0.0573, 0.1464,
            0.0286, 0.0293, 0.0278, 0.1896, 0.0939, 0.1935, 0.0123, 0.0006, 0.0156,
            0.0236, 0.1272, 0.1109, 0.1456])

loss_g: 0.841885

Loss between the grads and the norm

You also mentioned that you want to compute loss between the gradients and the norm, it is possible. And there are two possible options of it:

You want to include your loss calculation to your computational graph, in this case use:

loss_norm_vs_grads = loss_fn(torch.ones_like(grad_tensor) * V_norm, grad_tensor)

You want just to compute loss and you don't want to start backward path from the loss, in this case don't forget to use torch.no_grad(), otherwise autograd will track this changes and add loss computation to your computational graph.

with torch.no_grad():
    loss_norm_vs_grads = loss_fn(torch.ones_like(grad_tensor) * V_norm, grad_tensor)

Using autograd.grad() as a parameter for a loss function (pytorch)

1 Answers1

Linked