Given a PyTorch model, I want to plot the gradients of the loss with respect to my activation functions (e.g. ReLU and Tanh), to produce something like the screenshot below. I need these gradients to check for the vanishing/exploding gradients problem. For layers with parameters I can read the gradients as shown below, but the same approach does not work for the activation functions.
# Given a model defined like this, how do I get the gradients flowing through torch.nn.ReLU()?
model = torch.nn.Sequential(torch.nn.Linear(1, 2), torch.nn.ReLU(), torch.nn.Linear(2, 1), torch.nn.Tanh())
# For the layers with parameters I can do (after a backward pass)
print(model[0].weight.grad)
# But I cannot do
print(model[1].grad)  # AttributeError: 'ReLU' object has no attribute 'grad'
How can I get and plot the gradients for the ReLU and Tanh layers? Thanks for your help.
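For reference, here is a minimal, self-contained sketch of what I am doing right now; the inputs x and targets y are random placeholders I made up only for this example:

import torch

model = torch.nn.Sequential(
    torch.nn.Linear(1, 2),
    torch.nn.ReLU(),
    torch.nn.Linear(2, 1),
    torch.nn.Tanh(),
)

x = torch.randn(8, 1)  # dummy inputs (placeholder data)
y = torch.randn(8, 1)  # dummy targets (placeholder data)

loss = torch.nn.functional.mse_loss(model(x), y)
loss.backward()

# Gradients of the Linear layers' parameters are available after backward():
print(model[0].weight.grad)
print(model[2].weight.grad)

# But the ReLU / Tanh modules have no parameters, so there is nothing comparable to read:
print(hasattr(model[1], "grad"))  # False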