
So I'm doing a manual calculation of LSTM backpropagation in Excel and want to compare it to my code, but I'm having trouble with the gradient of the sigmoid in PyTorch.

Here is the output:

tensor([[0.8762]], grad_fn=<SigmoidBackward>)
tensor([-0.1238])
epoch:   0 loss: 0.13214068

The first line is the sigmoid value and the second line is the gradient of the sigmoid. Why is the sigmoid gradient -0.1238 when the formula for the sigmoid gradient is σ(x)⋅(1−σ(x))? If I calculate the sigmoid gradient manually, I get 0.10845, but in the code the sigmoid gradient is -0.1238. Is the formula for the sigmoid gradient in PyTorch wrong?

1 Answer


I can't reproduce your error. For one thing, your value of 0.10845 is not correct: remember that the formula z * (1 - z) only gives the derivative when z is already logistic(x), i.e. the sigmoid output, in your implementation. In any case, the value I compute agrees with what PyTorch produces.

Here's the logistic function:

import numpy as np

def logistic(z, derivative=False):
    if not derivative:
        return 1 / (1 + np.exp(-z))
    else:
        # Derivative of the logistic function: sigma(z) * (1 - sigma(z))
        return logistic(z) * (1 - logistic(z))

logistic(0.8762, derivative=True)

This produces 0.20754992931590668.

Now with PyTorch:

import torch

t = torch.Tensor([0.8762])
t.requires_grad = True
torch.sigmoid(t).backward()
t.grad

This produces tensor([0.2075]).
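As a side note, your manual value of 0.10845 is what you get if you treat 0.8762 as the sigmoid *output* and apply z * (1 - z) to it directly. A minimal sketch (assuming that is how you computed it by hand):

```python
# Assuming 0.8762 is taken as the sigmoid output z = logistic(x),
# applying z * (1 - z) to it reproduces the hand-computed 0.10845.
z = 0.8762
grad = z * (1 - z)
print(grad)  # ~0.1085
```

So the two results differ because PyTorch treats 0.8762 as the *input* to sigmoid, while the shortcut formula expects the *output*. Neither formula is wrong; they just apply to different quantities.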

Matt Hall