I want to extract features using a pretrained CNN model(ResNet50, VGG, etc) and use the features with a CTC loss function.
I want to build it as a text recognition model.
Anyone on how can i achieve this ?
I want to extract features using a pretrained CNN model(ResNet50, VGG, etc) and use the features with a CTC loss function.
I want to build it as a text recognition model.
Anyone on how can i achieve this ?
I'm not sure if you are looking to finetune the pretrained models or to use the models for feature extraction. To do the latter freeze the petrained model weights (there are several ways to do this in PyTorch, the simplest being calling .eval() on the model), and feed the logits from the last layer of the model to your new output head. See the PyTorch tutorial here for a more in depth guide.