
I am trying to do binary classification with transfer learning using timm.
In the process, I want to experiment with freezing/unfreezing different layers of different architectures, but so far I am only able to freeze/unfreeze entire models. Can anyone help by illustrating this with a couple of model architectures, for the sake of heterogeneity across architectures?
Below I illustrate freezing a couple of architectures in their entirety using timm - ConvNeXt and ResNet - but can anyone illustrate it with any different models, using only timm (as it is more comprehensive than the PyTorch model zoo)?

import timm

convnext = timm.create_model('convnext_tiny_in22k', pretrained=True, num_classes=2)
resnet = timm.create_model('resnet50d', pretrained=True, num_classes=2)

# So far, I can only freeze the models in their entirety:
for model in (convnext, resnet):
    for param in model.parameters():
        param.requires_grad = False

1 Answer


Here is how I do it:

import timm

model = timm.create_model('resnet50', pretrained=True, num_classes=2)
for name, param in model.named_parameters():
    # Freeze the stem: the input convolution and its batch norm
    if name in ('conv1.weight', 'bn1.weight', 'bn1.bias'):
        param.requires_grad = False
    # Freeze everything in the first residual stage
    if 'layer1' in name:
        param.requires_grad = False
    print(name, param.requires_grad)
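
The same pattern carries over to other timm architectures; only the parameter names change. Here is a minimal sketch for the ConvNeXt model from the question (I am assuming the 'stem' and 'stages.0' names used by timm's ConvNeXt implementation; they can differ between timm versions, so print the names first to confirm):

import timm

convnext = timm.create_model('convnext_tiny_in22k', pretrained=True, num_classes=2)
for name, param in convnext.named_parameters():
    # Freeze the stem and the first of the four stages; keep the rest trainable
    # ('stem' and 'stages.0' are assumed names - verify via named_parameters())
    if name.startswith(('stem', 'stages.0')):
        param.requires_grad = False
    print(name, param.requires_grad)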

In the above code, I froze the weights of the first residual stage (layer1, out of four). There is also a stem convolution (conv1, with its batch norm bn1) at the input, which is why I freeze it separately.

In general, when you print the model (or iterate over model.named_parameters()), you can see the names of the layers. Then you can set .requires_grad = True or .requires_grad = False for each parameter depending on your use case.
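
Once you have decided what to freeze, make sure the optimizer only receives the parameters that are still trainable. A minimal sketch (the choice of Adam and the learning rate are arbitrary placeholders):

import torch

# Collect only the parameters left trainable after freezing
trainable = [p for p in model.parameters() if p.requires_grad]
optimizer = torch.optim.Adam(trainable, lr=1e-4)  # assumed optimizer and lr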
