I am building a CNN in PyTorch using the pre-trained DenseNet121 model, replacing the classifier of the pre-trained model with my own classifier. I tried to do this in two ways. The first one works, but the second one gives the above-mentioned error while training. I need to use the second approach so that I can later add attention to the model. Why does the second one give an error when the two seem to be doing the same thing?
First code, which works correctly:
from collections import OrderedDict

import torch.nn as nn
from torchvision import models

model = models.densenet121(pretrained=True)

# Freeze the pre-trained feature extractor
for param in model.parameters():
    param.requires_grad = False

# Replace the default 1000-class classifier with a custom 10-class head
classifier = nn.Sequential(OrderedDict([
    ('fc1', nn.Linear(1024, 512)),
    ('relu', nn.ReLU()),
    ('fc2', nn.Linear(512, 10)),
    ('output', nn.LogSoftmax(dim=1))
]))
model.classifier = classifier
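As a sanity check (a sketch I added for context, assuming standard 224x224 ImageNet-sized inputs), the first model produces the expected (batch_size, 10) log-probabilities:

import torch

model.eval()
with torch.no_grad():
    dummy = torch.randn(32, 3, 224, 224)  # dummy batch, assumed input size
    out = model(dummy)
print(out.shape)  # torch.Size([32, 10])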
Second code, which gives the error while training:
net = models.densenet121(pretrained=True)

# Freeze the pre-trained feature extractor
for param in net.parameters():
    param.requires_grad = False

class AttnDenseNet121(nn.Module):
    def __init__(self, num_classes, normalize_attn=False, dropout=None):
        super(AttnDenseNet121, self).__init__()
        # Reuse the frozen DenseNet feature extractor
        self.features = net.features
        # Same custom head as in the first version
        self.classifier = nn.Sequential(OrderedDict([
            ('fc1', nn.Linear(1024, 512)),
            ('relu', nn.ReLU()),
            ('fc2', nn.Linear(512, 10)),
            ('output', nn.LogSoftmax(dim=1))
        ]))

    def forward(self, x):
        x = self.features(x)
        out = self.classifier(x)
        return out

model = AttnDenseNet121(num_classes=10, normalize_attn=True)
The training code is the same for both models, and the batch size is 32.
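The actual training loop is not shown here; it is of the usual form, roughly like the sketch below (names such as trainloader and the choice of optimizer and learning rate are placeholders, and NLLLoss is used to match the LogSoftmax output):

import torch
import torch.nn as nn
import torch.optim as optim

device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
model = model.to(device)

criterion = nn.NLLLoss()  # matches the LogSoftmax output of the classifier
optimizer = optim.Adam(model.classifier.parameters(), lr=0.003)

for epoch in range(5):
    for images, labels in trainloader:  # trainloader yields batches of 32
        images, labels = images.to(device), labels.to(device)
        optimizer.zero_grad()
        output = model(images)
        loss = criterion(output, labels)
        loss.backward()
        optimizer.step()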