
I am using the GRU class to implement an RNN. The RNN (GRU) is used after some CNN layers. Can someone please tell me what the input to the GRU is here? In particular, is the hidden size fixed?

self.gru = torch.nn.GRU(
    input_size=input_size,
    hidden_size=128,
    num_layers=1,
    batch_first=True,
    bidirectional=True)

According to my understanding, the input size is the number of features, and the hidden size for a GRU is always fixed at 128? Could someone please correct me or give feedback?

tolearntoseek

1 Answer

First, GRU is not a function but a class, and you are calling its constructor. You are creating an instance of the GRU class here, which is a layer (or a Module in PyTorch).

The input_size must match the out_channels of the previous CNN layer.

None of the parameters you see are fixed. Just put a different value there and it will be used instead, i.e. you can replace the 128 with any size you like.
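To see that 128 is in no way special, here is a small sketch (the shapes and sizes are arbitrary assumptions, not from your model) that builds GRUs with different hidden sizes and inspects the output shape:

```python
import torch

# hidden_size is a free choice; 128 is just one possible value.
for hidden in (64, 128, 300):
    gru = torch.nn.GRU(input_size=16, hidden_size=hidden,
                       num_layers=1, batch_first=True,
                       bidirectional=True)
    out, h = gru(torch.randn(2, 10, 16))  # (batch, seq_len, input_size)
    # With bidirectional=True the last dimension of out is 2 * hidden.
    print(out.shape)
```

Whatever value you pick only changes the size of the hidden state (and therefore of the output), nothing else about the layer's behavior.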

Even though it is called hidden_size, for a GRU this parameter also determines the number of output features. In other words, if you have another layer after the GRU, that layer's input_size (or in_features or in_channels or whatever it is called) must match the GRU's hidden_size, or 2 * hidden_size in your case, because with bidirectional=True the outputs of the forward and backward directions are concatenated.

Also, have a look at the documentation. It tells you exactly what the parameters you pass to the constructor do. It also tells you what the expected input is once you actually use your layer (via self.gru(...)) and what the output of that call will be.
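Putting it together, here is a minimal sketch of the CNN-then-GRU wiring (the conv/fc layers and all sizes are illustrative assumptions, not your actual model), showing how each layer's input size must line up with the previous layer's output:

```python
import torch

conv = torch.nn.Conv1d(in_channels=1, out_channels=32,
                       kernel_size=3, padding=1)
gru = torch.nn.GRU(input_size=32,    # must equal conv's out_channels
                   hidden_size=128,  # free to choose
                   num_layers=1,
                   batch_first=True,
                   bidirectional=True)
# The layer after a bidirectional GRU sees 2 * hidden_size features.
fc = torch.nn.Linear(in_features=2 * 128, out_features=10)

x = torch.randn(4, 1, 50)        # (batch, channels, length)
feats = conv(x)                  # (4, 32, 50)
feats = feats.permute(0, 2, 1)   # (4, 50, 32): batch_first GRU wants (B, T, F)
out, h = gru(feats)              # out: (4, 50, 256), h: (2, 4, 128)
logits = fc(out[:, -1, :])       # (4, 10), using the last time step
```

If any of these sizes disagree, PyTorch will raise a shape-mismatch error at the first forward pass, which is a quick way to check your wiring.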

sebrockm