
I am learning PyTorch and trying to understand how the library works for semantic segmentation. What I've understood so far is that we can use a pre-trained model in PyTorch. I've found an article that used such a model in .eval() mode, but I have not been able to find any tutorial on training one on my own dataset. I have a very small dataset, so I need transfer learning to get results. My goal is to train only the FC layers with my own data. How is that achievable in PyTorch without complicating the code with OOP or many .py files? I have had a hard time figuring out such repos on GitHub, as I am not the most proficient person when it comes to OOP. I was using Keras for deep learning until recently, and there everything is easy and straightforward. Do I have the same options in PyTorch? I appreciate any guidance on this. I need to run a piece of code that does semantic segmentation, and I am really confused about many of the steps I need to take.

parastoo91

1 Answer


Assume you start with a pretrained model called model. All of this occurs before you pass the model any data.

First, find the layers you want to train by inspecting all of them with model.children(). Running this command shows every top-level block and layer:

list(model.children())
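As a sketch of what that inspection looks like (using a small made-up nn.Sequential in place of a real pretrained network, so every layer and size here is invented for illustration):

```python
import torch.nn as nn

# Toy stand-in for a pretrained model; a real backbone would come from
# torchvision or a saved checkpoint.
model = nn.Sequential(
    nn.Conv2d(3, 8, 3, padding=1),
    nn.ReLU(),
    nn.AdaptiveAvgPool2d(1),
    nn.Flatten(),
    nn.Linear(8, 4),
)

# children() yields the top-level blocks/layers in registration order.
for i, layer in enumerate(model.children()):
    print(i, layer)
```

For a real torchvision model the printout is much longer, but the idea is the same: the list tells you how many trailing children make up the head you want to retrain.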

Suppose you have now found the layers that you want to fine-tune (your FC layers, as you describe), and they are the last 5 children. The plan is: remove them, set requires_grad to False on everything that remains so those weights don't update during training, then attach fresh layers that will train. You can look at just those last 5 layers with a slice:

list(model.children())[-5:]

Grab a reference to those layers (these are the ones you will replace):

layer_list = list(model.children())[-5:]

Rebuild the model without those layers using nn.Sequential:

model_small = nn.Sequential(*list(model.children())[:-5])
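Concretely, with a toy stand-in model (the 7 layers and their sizes are illustrative, not from any real checkpoint), dropping the last 5 children looks like this:

```python
import torch.nn as nn

# Toy "pretrained" model with 7 top-level children.
model = nn.Sequential(
    nn.Conv2d(3, 8, 3, padding=1), nn.ReLU(),
    nn.AdaptiveAvgPool2d(1), nn.Flatten(),
    nn.Linear(8, 16), nn.ReLU(), nn.Linear(16, 4),
)

# Keep everything except the last 5 children.
model_small = nn.Sequential(*list(model.children())[:-5])
print(len(list(model_small.children())))  # 2
```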

Freeze the remaining parameters by setting requires_grad to False:

for param in model_small.parameters():
    param.requires_grad = False
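A quick way to confirm the freeze took effect (again on a toy two-layer stand-in, since the exact model doesn't matter here):

```python
import torch.nn as nn

# Toy truncated backbone.
model_small = nn.Sequential(nn.Conv2d(3, 8, 3, padding=1), nn.ReLU())

for param in model_small.parameters():
    param.requires_grad = False

# Every parameter in the truncated model should now be frozen.
print(all(not p.requires_grad for p in model_small.parameters()))  # True
```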

Now you have a model called model_small that has all of the layers except the ones you want to train. Reattach the layers you removed (or fresh replacements); newly constructed layers have requires_grad set to True by default, so when you train the model it will only update the weights of those layers. Note that this works because model_small is an nn.Sequential, which runs its registered submodules in order and appends attributes assigned afterwards to that sequence; for a model with a custom forward() you would have to wire the new layers in yourself.

# The constructor arguments below are required; the sizes are placeholders,
# so match them to your own architecture and number of classes.
model_small.avgpool_1 = nn.AdaptiveAvgPool2d(output_size=1)
model_small.flatten = nn.Flatten()  # needed before Linear on a 4D tensor
model_small.lin1 = nn.Linear(512, 256)
model_small.logits = nn.Linear(256, num_classes)
model_small.softmax = nn.Softmax(dim=1)

model = model_small.to(device)
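Putting the whole recipe together as a runnable sketch; the layer sizes, the 4-class head, and the 32x32 input are all invented for illustration, so swap in the shapes your actual backbone produces:

```python
import torch
import torch.nn as nn

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

# Toy stand-in for the truncated pretrained backbone.
model_small = nn.Sequential(nn.Conv2d(3, 8, 3, padding=1), nn.ReLU())
for param in model_small.parameters():
    param.requires_grad = False  # frozen backbone

# Reattach a fresh head; new layers default to requires_grad=True.
model_small.avgpool_1 = nn.AdaptiveAvgPool2d(output_size=1)
model_small.flatten = nn.Flatten()
model_small.lin1 = nn.Linear(8, 16)
model_small.logits = nn.Linear(16, 4)   # 4 classes, purely illustrative
model_small.softmax = nn.Softmax(dim=1)

model = model_small.to(device)

# Optimize only the parameters that still require gradients (the new head).
optimizer = torch.optim.SGD(
    (p for p in model.parameters() if p.requires_grad), lr=1e-3
)

x = torch.randn(2, 3, 32, 32, device=device)
out = model(x)
print(out.shape)  # torch.Size([2, 4])
```

From here you feed the model data exactly as in any PyTorch training loop: forward pass, loss, loss.backward(), optimizer.step(); only the reattached head's weights change.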
conv3d
  • Thanks for the answer but I don't get it fully. I need a bit of elaboration on it. When and how I can feed the data in this model and what reattaching means? – parastoo91 Dec 15 '19 at 16:06
  • @parastoo91 I updated my answer with some descriptions. See this tut for more info: https://pytorch.org/tutorials/beginner/finetuning_torchvision_models_tutorial.html – conv3d Dec 15 '19 at 20:31