Autograd can automatically differentiate native Python and NumPy code; the deep learning framework PyTorch ships an automatic differentiation engine of the same name. It can handle a large subset of Python's features, including loops, ifs, recursion and closures, and it can also take derivatives of derivatives of derivatives. The main intended application of Autograd is gradient-based optimization.
Questions tagged [autograd]
362 questions
7 votes · 1 answer
What is the LibTorch equivalent to PyTorch's torch.no_grad?
When testing a network in PyTorch one can use with torch.no_grad():. What is the LibTorch (C++) equivalent?

MD98 · 344 · 2 · 9
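
For reference, a minimal runnable sketch of the Python pattern this question refers to (the linear model here is just a stand-in); on the LibTorch side the usual counterpart is an RAII guard (torch::NoGradGuard) rather than a context manager:

import torch
import torch.nn as nn

model = nn.Linear(4, 2)        # toy stand-in for the network under test
model.eval()
x = torch.randn(8, 4)
with torch.no_grad():          # no autograd graph is recorded inside this block
    y = model(x)
print(y.requires_grad)         # False: outputs are detached from autograd
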
7 votes · 1 answer
PyTorch autograd: what does the runtime error "grad can be implicitly created only for scalar outputs" mean?
I am trying to understand PyTorch autograd in depth; I would like to observe the gradient of a simple tensor after it goes through a sigmoid function, as below:
import torch
from torch import autograd
D = torch.arange(-8, 8, 0.1,…

A.E · 997 · 1 · 16 · 33
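
A minimal sketch of what triggers that error and a common fix, with the tensor construction assumed from the truncated excerpt: backward() can only infer the output gradient for a scalar, so reduce the output (or pass an explicit gradient vector) first:

import torch

D = torch.arange(-8, 8, 0.1, requires_grad=True)
S = torch.sigmoid(D)
# S.backward() would raise: grad can be implicitly created only for scalar outputs
S.sum().backward()        # reducing to a scalar is equivalent to seeding with ones
print(D.grad[:3])         # elementwise sigmoid derivatives
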
7 votes · 1 answer
Mini batch training for inputs of variable sizes
I have a list of LongTensors, and another list of labels. I'm new to PyTorch and RNNs, so I'm quite confused as to how to implement minibatch training for the data I have. There is much more to this data, but I want to keep it simple, so I can…

Venkat · 446 · 1 · 4 · 15
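
A minimal sketch of one common way to mini-batch variable-length LongTensors, with all names and sizes here purely illustrative: pad the sequences into one batch and pack them so the RNN skips the padding:

import torch
import torch.nn as nn
from torch.nn.utils.rnn import pad_sequence, pack_padded_sequence

seqs = [torch.tensor([1, 2, 3]), torch.tensor([4, 5]), torch.tensor([6])]  # variable-length LongTensors
lengths = torch.tensor([len(s) for s in seqs])
padded = pad_sequence(seqs, batch_first=True)     # (batch, max_len), zero-padded

emb = nn.Embedding(10, 8)
rnn = nn.RNN(8, 16, batch_first=True)
packed = pack_padded_sequence(emb(padded), lengths, batch_first=True, enforce_sorted=False)
out, h = rnn(packed)                              # padded positions are ignored
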
7 votes · 1 answer
Implementing Adagrad in Python
I'm trying to implement Adagrad in Python. For learning purposes, I am using matrix factorisation as an example. I'll be using Autograd to compute the gradients.
My main question is whether the implementation is correct.
Problem description
Given a matrix…

Nipun Batra · 11,007 · 11 · 52 · 77
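
A minimal sketch of the Adagrad update itself with the HIPS autograd package, on a toy quadratic rather than the matrix-factorisation problem from the question: accumulate squared gradients per parameter and divide the step by their root:

import autograd.numpy as np
from autograd import grad

def loss(w):
    return np.sum((w - 3.0) ** 2)     # toy objective with minimum at w = 3

g = grad(loss)
w = np.zeros(5)
cache = np.zeros(5)                   # running sum of squared gradients
lr, eps = 0.5, 1e-8
for _ in range(100):
    gw = g(w)
    cache += gw ** 2
    w -= lr * gw / (np.sqrt(cache) + eps)   # Adagrad: per-parameter adaptive step
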
6 votes · 1 answer
Get positive and negative part of gradient for loss function in PyTorch
I want to implement non-negative matrix factorization using PyTorch. Here is my initial implementation:
def nmf(X, k, lr, epochs):
    # X: input matrix of size (m, n)
    # k: number of latent factors
    # lr: learning rate
    # epochs: number of…

emonhossain · 378 · 2 · 9
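
A minimal sketch of one way to obtain the positive and negative parts of a gradient after a backward pass, assuming factors W and H with a squared-error reconstruction loss (the split used by multiplicative NMF updates):

import torch

X = torch.rand(6, 4)
W = torch.rand(6, 2, requires_grad=True)
H = torch.rand(2, 4, requires_grad=True)

loss = ((X - W @ H) ** 2).sum()
loss.backward()

grad_pos = W.grad.clamp(min=0)      # positive part of dloss/dW
grad_neg = (-W.grad).clamp(min=0)   # negative part, so W.grad == grad_pos - grad_neg
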
6 votes · 0 answers
PyTorch Autograd: differentiated Tensors appear not to have been used in the graph
I'm trying to improve a CNN I made by implementing a weighted loss method described in this paper. To do this, I looked into this notebook which implements the pseudo-code of the method described in the paper.
When translating their code to my…

Nakul Upadhya · 494 · 4 · 16
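
A minimal sketch of the situation that usually produces this message, assuming torch.autograd.grad is asked for the gradient of a tensor that never influenced the loss; allow_unused=True turns the failure into a None gradient:

import torch

x = torch.randn(3, requires_grad=True)
unused = torch.randn(3, requires_grad=True)   # never participates in the loss
loss = (x ** 2).sum()

# torch.autograd.grad(loss, [x, unused]) raises
# "One of the differentiated Tensors appears to not have been used in the graph"
gx, gu = torch.autograd.grad(loss, [x, unused], allow_unused=True)
print(gu)                                     # None: no path from `unused` to `loss`
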
6 votes · 1 answer
Can autograd in PyTorch handle repeated use of a layer within the same module?
I have a layer (call it layer) in an nn.Module and use it two or more times during a single forward step. The output of this layer is later fed back into the same layer. Can PyTorch's autograd compute the gradients of this layer's weights correctly?
def…

ihdv · 1,927 · 2 · 13 · 29
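
A minimal sketch suggesting the answer is yes, with an illustrative module that applies one nn.Linear twice: autograd records both uses and sums their contributions into the shared weight's gradient:

import torch
import torch.nn as nn

class Twice(nn.Module):
    def __init__(self):
        super().__init__()
        self.layer = nn.Linear(4, 4)

    def forward(self, x):
        h = torch.relu(self.layer(x))   # first use
        return self.layer(h)            # second use of the same weights

net = Twice()
net(torch.randn(2, 4)).sum().backward()
print(net.layer.weight.grad.shape)      # one gradient, accumulated over both uses
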
6 votes · 1 answer
Activation gradient penalty
Here's a simple neural network, where I’m trying to penalize the norm of activation gradients:
class Net(nn.Module):
    def __init__(self):
        super(Net, self).__init__()
        self.conv1 = nn.Conv2d(3, 32, kernel_size=5)
        self.conv2…

MichaelSB · 3,131 · 3 · 26 · 40
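
A minimal sketch of the usual recipe for such a penalty, on a toy two-layer network rather than the question's CNN: take the gradient of the task loss with respect to an activation using create_graph=True, add its norm to the loss, and backpropagate through that second-order term:

import torch
import torch.nn as nn

fc1, fc2 = nn.Linear(8, 16), nn.Linear(16, 1)
x = torch.randn(4, 8)

h = torch.relu(fc1(x))                 # activation whose gradient is penalized
out = fc2(h)
task_loss = (out ** 2).mean()

# gradient of the task loss w.r.t. the activation, kept in the graph
(act_grad,) = torch.autograd.grad(task_loss, h, create_graph=True)
loss = task_loss + 0.1 * act_grad.norm()
loss.backward()                        # second-order terms reach fc1/fc2 gradients
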
6 votes · 1 answer
grad_outputs in torch.autograd.grad (CrossEntropyLoss)
I’m trying to get d(loss)/d(input). I know I have 2 options.
First option:
loss.backward()
dlossdx = x.grad.data
Second option:
# criterion = nn.CrossEntropyLoss(reduce=False)
# loss = criterion(y_hat, labels)
# No need to…

aerin · 20,607 · 28 · 102 · 140
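
A minimal sketch of the second option with an unreduced loss, using toy logits in place of the question's network output (reduction='none' is the current spelling of the deprecated reduce=False): when the loss is a vector, torch.autograd.grad needs grad_outputs to seed the vector-Jacobian product, and a vector of ones matches what summing the loss would give:

import torch
import torch.nn as nn

x = torch.randn(5, 3, requires_grad=True)          # logits standing in for the input of interest
labels = torch.tensor([0, 2, 1, 0, 2])
criterion = nn.CrossEntropyLoss(reduction='none')  # per-sample losses, no reduction
loss = criterion(x, labels)                        # shape (5,), not a scalar

(dlossdx,) = torch.autograd.grad(loss, x, grad_outputs=torch.ones_like(loss))
print(dlossdx.shape)                               # (5, 3): d(sum of losses)/dx
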
6 votes · 1 answer
PyTorch: Can autograd be used when the final tensor has more than a single value in it?
Can autograd be used when the final tensor has more than a single value in it?
I tried the following.
x = torch.tensor([4.0, 5.0], requires_grad=True)
y = x ** 2
print(y)
y.backward()
This throws an error:
RuntimeError: grad can be implicitly created…

sbetageri · 73 · 1 · 7
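
Continuing the excerpt's own example as a minimal sketch: autograd does handle a non-scalar y, but backward() then needs an explicit gradient argument giving the vector for the vector-Jacobian product:

import torch

x = torch.tensor([4.0, 5.0], requires_grad=True)
y = x ** 2
y.backward(torch.ones_like(y))   # seed the vector-Jacobian product with ones
print(x.grad)                    # tensor([ 8., 10.]): dy_i/dx_i = 2 * x_i
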
5 votes · 1 answer
I am looking for a comprehensive explanation of the `inputs` parameter of the `.backward()` method in PyTorch
I am having trouble understanding the usage of the inputs keyword in the .backward() call.
The documentation says the following:
inputs (sequence of Tensor) – Inputs w.r.t. which the gradient will be accumulated into .grad. All other Tensors will…

Shashwat · 439 · 7 · 13
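
A minimal sketch of the documented behaviour in recent PyTorch releases, with two toy leaf tensors: passing inputs restricts which .grad fields backward() populates, leaving the other leaves untouched:

import torch

a = torch.tensor(2.0, requires_grad=True)
b = torch.tensor(3.0, requires_grad=True)
loss = a * b

loss.backward(inputs=[a])   # accumulate gradients only into a
print(a.grad)               # tensor(3.)
print(b.grad)               # None: b was excluded via `inputs`
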
5 votes · 1 answer
Why is the grad unavailable for a tensor on the GPU?
a = torch.nn.Parameter(torch.ones(5, 5))
a = a.cuda()
print(a.requires_grad)
b = a
b = b - 2
print('a ', a)
print('b ', b)
loss = (b - 1).pow(2).sum()
loss.backward()
print(a.grad)
print(b.grad)
After executing this code, a.grad is None although…

dddd · 53 · 3
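
A minimal sketch of the underlying issue, assuming a CUDA device is available: a.cuda() rebinds the name to a new non-leaf tensor, so the gradient lands on the original CPU leaf instead; creating the parameter on the GPU directly avoids this:

import torch

# a = torch.nn.Parameter(torch.ones(5, 5)); a = a.cuda()  ->  a is now a non-leaf copy

a = torch.nn.Parameter(torch.ones(5, 5, device='cuda'))  # leaf created directly on the GPU
b = a - 2
loss = (b - 1).pow(2).sum()
loss.backward()
print(a.grad is None)   # False: the gradient is accumulated on the leaf `a`
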
5 votes · 1 answer
Meaning of grad_outputs in PyTorch's torch.autograd.grad
I am having trouble understanding the conceptual meaning of the grad_outputs option in torch.autograd.grad.
The documentation says:
grad_outputs should be a sequence of length matching output containing the “vector” in Jacobian-vector product,…

user118967 · 4,895 · 5 · 33 · 54
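
A minimal sketch of the vector-Jacobian reading of grad_outputs, using an elementwise toy function so the Jacobian is diagonal: the result is vᵀJ, i.e. each entry of grad_outputs weights the corresponding output's gradient:

import torch

x = torch.tensor([1.0, 2.0, 3.0], requires_grad=True)
y = x ** 2                                  # Jacobian is diag(2 * x)

v = torch.tensor([1.0, 0.0, 10.0])          # the "vector" in the vector-Jacobian product
(g,) = torch.autograd.grad(y, x, grad_outputs=v)
print(g)                                    # tensor([ 2., 0., 60.]) == v * 2 * x
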
5 votes · 1 answer
How to break PyTorch autograd with in-place ops
I'm trying to better understand the role of in-place operations in PyTorch autograd.
My understanding is that they are likely to cause problems since they may overwrite values needed during the backward step.
I'm trying to build an example where an…

Luca · 801 · 1 · 6 · 11
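
A minimal sketch of one example that does break, assuming the aim is simply to trigger the failure: sigmoid saves its output for the backward pass, so modifying that output in place invalidates the saved tensor and backward() raises at runtime:

import torch

x = torch.tensor([0.5, -1.0, 2.0], requires_grad=True)
y = torch.sigmoid(x)   # sigmoid's backward needs the saved output y
y.mul_(2)              # in-place op bumps y's version counter
try:
    y.sum().backward()
except RuntimeError as e:
    print(e)           # "... has been modified by an inplace operation"
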
5 votes · 0 answers
How to see the size of a computation graph in memory in PyTorch?
How can one trace the memory allocation for the autograd graph created by the forward pass on the CPU? For instance, trying to use tracemalloc on the CPU:
rnn=nn.RNNCell(100,100).to('cuda')
x=torch.ones((1000,100),device='cuda')
tracemalloc.start(25)
while…

Horse · 361 · 2 · 6
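
A minimal sketch of one rough way to gauge the graph's footprint, measured on the GPU since the excerpt's code runs there (CUDA availability and the 50-step loop are assumptions; for the CPU case, tracemalloc around the same pair of runs gives the analogous comparison):

import torch
import torch.nn as nn

rnn = nn.RNNCell(100, 100).to('cuda')
x = torch.ones((1000, 100), device='cuda')

def run(track_grad):
    start = torch.cuda.memory_allocated()
    with torch.set_grad_enabled(track_grad):
        h = torch.zeros(1000, 100, device='cuda')
        for _ in range(50):
            h = rnn(x, h)   # with grad enabled, each step keeps buffers for backward
    return torch.cuda.memory_allocated() - start   # measured while the graph is still alive

with_graph = run(True)
without_graph = run(False)
print(with_graph - without_graph)   # rough size of the saved autograd state, in bytes
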