
I am fine-tuning a VGG16 model with the pretrained 'VGGFace' weights to work on Labelled Faces in the Wild (the LFW dataset). The problem is that I get very low accuracy after training for an epoch (around 0.0037%), i.e. the model isn't learning at all.

I think it has something to do with my architecture, which looks like this:

from keras.models import Model
from keras.layers import Flatten, Dense
from keras_vggface.vggface import VGGFace
import keras

# Convolutional base with pretrained VGGFace weights, without the original classifier head
vgg_x = VGGFace(model='vgg16', weights='vggface', input_shape=(224, 224, 3), include_top=False)
last_layer = vgg_x.get_layer('pool5').output
x = Flatten(name='flatten')(last_layer)
x = Dense(4096, activation='relu', name='fc6')(x)

# New classification head, randomly initialized
out = Dense(311, activation='softmax', name='fc8')(x)
custom_vgg_model = Model(vgg_x.input, out)

custom_vgg_model.compile(optimizer=keras.optimizers.Adam(),
                         loss=keras.losses.categorical_crossentropy,
                         metrics=['accuracy'])

from sklearn.model_selection import KFold

kfold = KFold(n_splits=15, random_state=42)

for train_index, test_index in kfold.split(X_train):
    X_cross_train = X_train[train_index]
    X_cross_test = X_train[test_index]
    Y_cross_train = y_train[train_index]
    Y_cross_test = y_train[test_index]
    custom_vgg_model.fit(x=X_cross_train, y=Y_cross_train, batch_size=32, epochs=10,
                         verbose=2, validation_data=(X_cross_test, Y_cross_test))

I expect the model to at least learn something, even if it doesn't reach great accuracy. What could be the problem? Is there something wrong with my architecture, or is it something else?

The preprocessing step shouldn't be wrong, but just in case, here it is:

import keras_vggface.utils

image_set_x = keras_vggface.utils.preprocess_input(image_set_x, version=1)
Amruth Lakkavaram

1 Answer


Try training with a smaller learning rate than the default (for instance, 1e-4). The randomly initialized weights in the classification layer produce large gradients, which in turn cause large weight updates in the lower layers and essentially destroy the pretrained weights in the convolutional base.
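As a minimal sketch, recompiling the question's custom_vgg_model with a lower learning rate (the 1e-4 value is illustrative):

import keras

# Adam's default learning rate is 1e-3; start an order of magnitude lower
custom_vgg_model.compile(optimizer=keras.optimizers.Adam(lr=1e-4),
                         loss=keras.losses.categorical_crossentropy,
                         metrics=['accuracy'])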

In addition, you can use the ReduceLROnPlateau callback to further decrease the learning rate when validation accuracy stops increasing.
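A sketch of that callback, reusing the variable names from the question's training loop (the 'val_acc' monitor name matches older Keras versions; the factor and patience values are illustrative):

from keras.callbacks import ReduceLROnPlateau

# Halve the learning rate after 3 epochs without improvement in validation accuracy
reduce_lr = ReduceLROnPlateau(monitor='val_acc', factor=0.5,
                              patience=3, min_lr=1e-6, verbose=1)

custom_vgg_model.fit(x=X_cross_train, y=Y_cross_train, batch_size=32, epochs=10,
                     verbose=2, validation_data=(X_cross_test, Y_cross_test),
                     callbacks=[reduce_lr])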

Another strategy to avoid large disruptive gradient updates is to first freeze the weights in the convolutional base, pretrain the classification layers, and then fine-tune the entire stack with a small learning rate. This approach is explained in detail in the Keras blog post on transfer learning: https://blog.keras.io/building-powerful-image-classification-models-using-very-little-data.html
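A minimal sketch of that two-stage approach, reusing vgg_x, custom_vgg_model, and the data variables from the question (learning rates and epoch counts are illustrative):

# Stage 1: freeze the convolutional base and train only the new head
for layer in vgg_x.layers:
    layer.trainable = False
custom_vgg_model.compile(optimizer=keras.optimizers.Adam(lr=1e-3),
                         loss=keras.losses.categorical_crossentropy,
                         metrics=['accuracy'])
custom_vgg_model.fit(x=X_cross_train, y=Y_cross_train, batch_size=32, epochs=5,
                     verbose=2, validation_data=(X_cross_test, Y_cross_test))

# Stage 2: unfreeze everything and fine-tune with a much smaller learning rate
for layer in vgg_x.layers:
    layer.trainable = True
custom_vgg_model.compile(optimizer=keras.optimizers.Adam(lr=1e-5),
                         loss=keras.losses.categorical_crossentropy,
                         metrics=['accuracy'])
custom_vgg_model.fit(x=X_cross_train, y=Y_cross_train, batch_size=32, epochs=10,
                     verbose=2, validation_data=(X_cross_test, Y_cross_test))

Note that changes to the trainable flags only take effect after the model is recompiled.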

sdcbr
  • Small doubt: we are freezing the weights of the convolutional base, right? Then how are the large gradients destroying the pretrained weights of the convolutional base? – Amruth Lakkavaram Jan 23 '19 at 09:45
  • How are you freezing them? I don't see that in your code. – sdcbr Jan 23 '19 at 09:46
  • vgg_x = VGGFace(model = 'vgg16', weights = 'vggface', input_shape = (224,224,3), include_top = False) for layer in vgg_x.layers[:-4]: layer.trainable = False last_layer = vgg_x.get_layer('pool5').output – Amruth Lakkavaram Jan 23 '19 at 11:12
  • I later unfroze them, hoping it would give a better result. When I use the default learning rate of Adam, I get an accuracy of 0.027%. – Amruth Lakkavaram Jan 23 '19 at 11:13
  • @AmruthLakkavaram Don't use Adam; use SGD with a very small learning rate first. Adam works pretty well for GANs, but outside of that I myself have had very bad luck with it. – Hossein Aug 14 '19 at 15:05
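For reference, a sketch of the last comment's suggestion, swapping Adam for SGD on the question's model (the learning rate and momentum values are illustrative):

from keras.optimizers import SGD

# Plain SGD with a small learning rate gives gentler, more predictable updates
custom_vgg_model.compile(optimizer=SGD(lr=1e-4, momentum=0.9),
                         loss=keras.losses.categorical_crossentropy,
                         metrics=['accuracy'])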