
I am trying to implement simple gradient descent for linear regression with PyTorch, as shown in this example from the docs:

import torch
from torch.autograd import Variable

learning_rate = 0.01
y = 5
x = torch.tensor([3., 0., 1.])
w = torch.tensor([2., 3., 9.], requires_grad=True)
b = torch.tensor(1., requires_grad=True)

for z in range(100):
    y_pred = b + torch.sum(w * x)
    loss = (y_pred - y).pow(2)
    loss = Variable(loss, requires_grad = True)
    # loss.requires_grad = True
    loss.backward()
    
    with torch.no_grad():
        w = w - learning_rate * w.grad
        b = b - learning_rate * b.grad
        
        w.grad = None
        b.grad = None

When I run the code I get the error: RuntimeError: element 0 of tensors does not require grad and does not have a grad_fn

I have read here and here that it could be solved, but neither suggestion works for me:

  • loss = Variable(loss, requires_grad = True) results in TypeError: unsupported operand type(s) for *: 'float' and 'NoneType'

  • loss.requires_grad = True results in RuntimeError: you can only change requires_grad flags of leaf variables.

How can I fix this?


1 Answer


This error is caused by mixing torch's computation functions with Python built-ins (the same goes for numpy or any other non-torch library). Torch's autograd breaks because it cannot track gradients through operations it does not know about.

A good explanation can be read here.
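
To illustrate, here is a minimal sketch (not from the original answer) showing how a value that leaves torch loses its connection to the autograd graph:

import torch

x = torch.tensor([3., 0., 1.])
w = torch.tensor([2., 3., 9.], requires_grad=True)

# computed entirely with torch ops: autograd records a grad_fn
y_pred = torch.sum(w * x)
print(y_pred.grad_fn)  # <SumBackward0 object at ...>

# round-tripping through a plain Python float detaches the value
detached = torch.tensor(y_pred.item())
print(detached.grad_fn)  # None
# detached.backward() now raises:
# RuntimeError: element 0 of tensors does not require grad and does not have a grad_fn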


What follows is more of a workaround than a fully appropriate fix:

Calling .retain_grad() on w and b before backward() solved the issue for me:

import torch

learning_rate = 0.01
y = 5
x = torch.tensor([3., 0., 1.])
w = torch.tensor([2., 3., 9.], requires_grad=True)
b = torch.tensor(1., requires_grad=True)

for z in range(100):
    y_pred = b + torch.sum(w * x)
    loss = (y_pred - y).pow(2)

    # after the first update w and b are no longer leaf tensors;
    # retain_grad() tells autograd to keep populating their .grad anyway
    w.retain_grad()
    b.retain_grad()
    loss.backward()

    w = w - learning_rate * w.grad
    b = b - learning_rate * b.grad
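
For comparison, here is a sketch of the more conventional pattern (based on the standard PyTorch training loop, not part of the original answer): update the leaf tensors in place under torch.no_grad() and zero their gradients each step, so no hack is needed:

import torch

learning_rate = 0.01
y = 5
x = torch.tensor([3., 0., 1.])
w = torch.tensor([2., 3., 9.], requires_grad=True)
b = torch.tensor(1., requires_grad=True)

for z in range(100):
    y_pred = b + torch.sum(w * x)
    loss = (y_pred - y).pow(2)
    loss.backward()

    with torch.no_grad():
        # in-place updates keep w and b as leaf tensors
        w -= learning_rate * w.grad
        b -= learning_rate * b.grad

        # clear the accumulated gradients before the next backward()
        w.grad.zero_()
        b.grad.zero_()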
  • This didn't work for me, still getting the same error :/ – Kitwradr Nov 28 '22 at 07:43
  • @Kitwradr You could try to debug by checking whether `requires_grad` is set correctly on everything you want to propagate through, and also which function each tensor points to. – OuttaSpaceTime Nov 28 '22 at 09:34
  • How do I check to which function it is pointing? – Kitwradr Nov 28 '22 at 20:46
  • Inspect the tensors in the debugging console; I'm not sure right now what the attribute names are. Depending on the IDE you use, you might be able to inspect them directly. Otherwise check the documentation of the backward implementation; all the details should be there (see the sketch after this thread). Feel free to update the answer afterwards or post your own answer :) – OuttaSpaceTime Nov 29 '22 at 12:04
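
A minimal sketch of the kind of inspection discussed in the comments above (using only standard tensor attributes):

import torch

w = torch.tensor([2., 3., 9.], requires_grad=True)
loss = torch.sum(w * 2).pow(2)

print(loss.requires_grad)           # True
print(loss.grad_fn)                 # <PowBackward0 object at ...>
print(loss.grad_fn.next_functions)  # upstream nodes, e.g. ((<SumBackward0 ...>, 0),)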