
I want to extract features using a pretrained CNN model (ResNet50, VGG, etc.) and use those features with a CTC loss function.

I want to build it as a text recognition model.

Does anyone know how I can achieve this?

1 Answer


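I'm not sure whether you are looking to fine-tune the pretrained models or to use them for feature extraction. To do the latter, freeze the pretrained model's weights and feed the output of the last layer you keep to your new output head. In PyTorch, freezing means setting requires_grad to False on the model's parameters; calling .eval() on the model does not freeze anything by itself, it only switches layers such as batch norm and dropout to inference mode, so you usually want both. For CTC, the head needs a sequence of per-timestep outputs, so take the convolutional feature map rather than the classification logits. See the PyTorch tutorial here for a more in-depth guide.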

  • I am using it for feature extraction, and I am using Keras. I freeze all the model layers and then add two dense layers, the second one with softmax. After this I compile the model using the CTC loss function as described here: https://keras.io/examples/image_ocr/ But I get this error: Dimension must be 2 but is 3 for 'ctc_3/transpose_2' (op: 'Transpose') with input shapes: [?,82], [3]. – Adesh Gautam Apr 15 '20 at 10:23