
I am getting confused about the input shape to a GRU layer. I have a batch of 128 images and I extracted 9 features from each image, so my tensor shape is (1, 128, 9).

This is the GRU layer

gru=torch.nn.GRU(input_size=128,hidden_size=8,batch_first=True)

Question 1: Is the input_size=128 correctly defined?
Here is the code of the forward function:

def forward(self, features):
    features = features.permute(0, 2, 1)  # [1, 9, 128]
    x2, _ = self.gru(features)

Question 2: Is the code in the forward function correctly defined?

Thanks


1 Answer


No, input_size is not correctly defined. Here, input_size means the number of features in a single input vector of the sequence; the input to the GRU is a sequence of vectors, each one a 1-D tensor of length input_size. Since you extracted 9 features per image, input_size should be 9, not 128. For batched input, the GRU expects a batch of sequences of vectors, so the shape should be (batch_size, sequence_length, input_size) when batch_first=True, and (sequence_length, batch_size, input_size) when batch_first=False.

import torch

batch_size = 128 
input_size = 9  # features in the input
seq_len = 5 # sequence length - how many input vectors in one sequence

hidden_size = 20 # the number of features in the output of the GRU

gru=torch.nn.GRU(input_size=input_size,hidden_size=hidden_size,batch_first=True)

X = torch.rand( (batch_size, seq_len, input_size), dtype = torch.float32 )
print(f'{X.shape=}')

Y,_ = gru(X)
print(f'{Y.shape=}')

output

"""
X.shape=torch.Size([128, 5, 9])
Y.shape=torch.Size([128, 5, 20])
"""

Using batch_first=False

gru=torch.nn.GRU(input_size=input_size,hidden_size=hidden_size,batch_first=False)

X = torch.rand( (seq_len, batch_size, input_size), dtype = torch.float32 )
print(f'{X.shape=}')

Y,_ = gru(X)
print(f'{Y.shape=}')

output

"""
X.shape=torch.Size([5, 128, 9])
Y.shape=torch.Size([5, 128, 20])
"""