I am trying to write a simple neural network using PyTorch. I am new to this library. I have come across two ways of implementing the same idea: a layer with a fixed activation function (e.g. tanh).
The first way, using activation modules:

l1 = nn.Sequential(nn.Linear(n_in, n_out), nn.Tanh())  # nn.Tanh() takes no arguments, so it is stacked after a linear layer
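For context, here is a minimal runnable sketch of this module style as I understand it (the sizes n_in, n_hidden, n_out and the random input are just placeholders of my own):

import torch
import torch.nn as nn

n_in, n_hidden, n_out = 4, 8, 2       # placeholder sizes
model = nn.Sequential(
    nn.Linear(n_in, n_hidden),        # affine layer with learnable weights and bias
    nn.Tanh(),                        # stateless activation module, has no parameters
    nn.Linear(n_hidden, n_out),
)
x = torch.randn(1, n_in)
y = model(x)                          # forward pass runs each module in order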
The second way:
l2 = nn.Linear(n_in, n_out)  # linear layer: applies only the affine map x @ W.T + b, with no activation

and then, in the forward pass, apply the activation functionally:
import torch.nn.functional as F
x = F.tanh(l2(x))  # x is the tensor being propagated from layer to layer
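For comparison, a minimal sketch of this functional style wrapped in a custom module (the class name Net and the sizes are my own placeholders; I use torch.tanh here because recent PyTorch versions warn that F.tanh is deprecated in its favor):

import torch
import torch.nn as nn

class Net(nn.Module):
    def __init__(self, n_in, n_out):
        super().__init__()
        self.l2 = nn.Linear(n_in, n_out)   # only the layer with parameters is registered

    def forward(self, x):
        return torch.tanh(self.l2(x))      # activation applied as a plain function, no module needed

net = Net(4, 2)
y = net(torch.randn(1, 4))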
What are the differences between these two approaches? Which one is better for which purposes?