I am trying to compare a fine-tuned VGGFace model, which uses the pretrained VGGFace weights, with a completely retrained model. When I use the fine-tuned model, I get a decent accuracy score. However, when I retrain the entire model by unfreezing the weights, the accuracy drops to close to random.
I was wondering whether this is due to the small dataset. I know VGGFace was trained on millions of samples, while my dataset has only 1400 samples (700 per class, for a binary classification problem). But I wanted to be sure that I joined the VGGFace model with my custom layers correctly. Could it also be that the learning rate is too high?
The model is set up using the following code.
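# Imports assumed for this snippet (not shown in the original post).
from keras import backend as K
from keras import optimizers
from keras.callbacks import EarlyStopping
from keras.layers import Activation, Dense
from keras.models import Model
from keras_vggface.vggface import VGGFace

def Train_VGG_Model(train_layers=False):
    print('='*65)
    K.clear_session()
    # Pretrained VGGFace (VGG16) base, cut at the fc7 activation.
    vggface_model = VGGFace(model='vgg16')
    x = vggface_model.get_layer('fc7/relu').output
    # Custom binary-classification head.
    x = Dense(512, name='custom_fc8')(x)
    x = Activation('relu', name='custom_fc8/relu')(x)
    x = Dense(64, name='custom_fc9')(x)
    x = Activation('relu', name='custom_fc9/relu')(x)
    x = Dense(1, name='custom_fc10')(x)
    out = Activation('sigmoid', name='custom_fc10/sigmoid')(x)
    custom_model = Model(vggface_model.input, out,
                         name='Custom VGGFace Model')
    # Custom layers always train; pretrained layers train only if train_layers is True.
    for layer in custom_model.layers:
        if 'custom_' not in layer.name:
            layer.trainable = train_layers
        else:
            layer.trainable = True
        print('Layer name:', layer.name,
              '... Trainable:', layer.trainable)
    print('='*65)
    opt = optimizers.Adam(lr=1e-5)
    custom_model.compile(loss='binary_crossentropy',
                         metrics=['accuracy'],
                         optimizer=opt)
    custom_model.summary()
    return custom_model

# Assumed value: True retrains every layer (as in the output below), False trains only the custom layers.
train_layers = True
callbacks = [EarlyStopping(monitor='val_loss', patience=1, mode='auto')]
model = Train_VGG_Model(train_layers=train_layers)
model.fit(X_train, y_train, batch_size=32, epochs=100,
          callbacks=callbacks, validation_data=(X_valid, y_valid))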
Output:
Layer name: input_1 ... Trainable: True
Layer name: conv1_1 ... Trainable: True
Layer name: conv1_2 ... Trainable: True
Layer name: pool1 ... Trainable: True
Layer name: conv2_1 ... Trainable: True
Layer name: conv2_2 ... Trainable: True
Layer name: pool2 ... Trainable: True
Layer name: conv3_1 ... Trainable: True
Layer name: conv3_2 ... Trainable: True
Layer name: conv3_3 ... Trainable: True
Layer name: pool3 ... Trainable: True
Layer name: conv4_1 ... Trainable: True
Layer name: conv4_2 ... Trainable: True
Layer name: conv4_3 ... Trainable: True
Layer name: pool4 ... Trainable: True
Layer name: conv5_1 ... Trainable: True
Layer name: conv5_2 ... Trainable: True
Layer name: conv5_3 ... Trainable: True
Layer name: pool5 ... Trainable: True
Layer name: flatten ... Trainable: True
Layer name: fc6 ... Trainable: True
Layer name: fc6/relu ... Trainable: True
Layer name: fc7 ... Trainable: True
Layer name: fc7/relu ... Trainable: True
Layer name: custom_fc8 ... Trainable: True
Layer name: custom_fc8/relu ... Trainable: True
Layer name: custom_fc9 ... Trainable: True
Layer name: custom_fc9/relu ... Trainable: True
Layer name: custom_fc10 ... Trainable: True
Layer name: custom_fc10/sigmoid ... Trainable: True
=================================================================
Model: "Custom VGGFace Model"
_________________________________________________________________
Layer (type)                 Output Shape              Param #
=================================================================
input_1 (InputLayer)         (None, 224, 224, 3)       0
_________________________________________________________________
conv1_1 (Conv2D)             (None, 224, 224, 64)      1792
_________________________________________________________________
conv1_2 (Conv2D)             (None, 224, 224, 64)      36928
_________________________________________________________________
pool1 (MaxPooling2D)         (None, 112, 112, 64)      0
_________________________________________________________________
conv2_1 (Conv2D)             (None, 112, 112, 128)     73856
_________________________________________________________________
conv2_2 (Conv2D)             (None, 112, 112, 128)     147584
_________________________________________________________________
pool2 (MaxPooling2D)         (None, 56, 56, 128)       0
_________________________________________________________________
conv3_1 (Conv2D)             (None, 56, 56, 256)       295168
_________________________________________________________________
conv3_2 (Conv2D)             (None, 56, 56, 256)       590080
_________________________________________________________________
conv3_3 (Conv2D)             (None, 56, 56, 256)       590080
_________________________________________________________________
pool3 (MaxPooling2D)         (None, 28, 28, 256)       0
_________________________________________________________________
conv4_1 (Conv2D)             (None, 28, 28, 512)       1180160
_________________________________________________________________
conv4_2 (Conv2D)             (None, 28, 28, 512)       2359808
_________________________________________________________________
conv4_3 (Conv2D)             (None, 28, 28, 512)       2359808
_________________________________________________________________
pool4 (MaxPooling2D)         (None, 14, 14, 512)       0
_________________________________________________________________
conv5_1 (Conv2D)             (None, 14, 14, 512)       2359808
_________________________________________________________________
conv5_2 (Conv2D)             (None, 14, 14, 512)       2359808
_________________________________________________________________
conv5_3 (Conv2D)             (None, 14, 14, 512)       2359808
_________________________________________________________________
pool5 (MaxPooling2D)         (None, 7, 7, 512)         0
_________________________________________________________________
flatten (Flatten)            (None, 25088)             0
_________________________________________________________________
fc6 (Dense)                  (None, 4096)              102764544
_________________________________________________________________
fc6/relu (Activation)        (None, 4096)              0
_________________________________________________________________
fc7 (Dense)                  (None, 4096)              16781312
_________________________________________________________________
fc7/relu (Activation)        (None, 4096)              0
_________________________________________________________________
custom_fc8 (Dense)           (None, 512)               2097664
_________________________________________________________________
custom_fc8/relu (Activation) (None, 512)               0
_________________________________________________________________
custom_fc9 (Dense)           (None, 64)                32832
_________________________________________________________________
custom_fc9/relu (Activation) (None, 64)                0
_________________________________________________________________
custom_fc10 (Dense)          (None, 1)                 65
_________________________________________________________________
custom_fc10/sigmoid (Activat (None, 1)                 0
=================================================================
Total params: 136,391,105
Trainable params: 136,391,105
Non-trainable params: 0
_________________________________________________________________
Train on 784 samples, validate on 336 samples
Epoch 1/100
784/784 [==============================] - 235s 300ms/step - loss: 0.7987 - accuracy: 0.5051 - val_loss: 0.6932 - val_accuracy: 0.5149
Epoch 2/100
784/784 [==============================] - 233s 298ms/step - loss: 0.6935 - accuracy: 0.4605 - val_loss: 0.6932 - val_accuracy: 0.4792
Epoch 3/100
784/784 [==============================] - 236s 301ms/step - loss: 0.6932 - accuracy: 0.5089 - val_loss: 0.6932 - val_accuracy: 0.4792
280/280 [==============================] - 12s 45ms/step
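For reference, the value the loss settles at in the log (0.6932) is essentially ln(2) ≈ 0.6931, which is the binary cross-entropy a model gets when it outputs a probability of 0.5 for every sample regardless of the label. The sketch below checks that arithmetic in plain Python. The helper `mean_binary_crossentropy` and the toy labels are made up for illustration; they are not part of Keras or of the code above.

```python
import math

def mean_binary_crossentropy(y_true, y_pred):
    """Mean binary cross-entropy for 0/1 labels and predicted probabilities.

    Hypothetical helper written for this check, not the Keras loss function.
    """
    total = 0.0
    for y, p in zip(y_true, y_pred):
        total += -(y * math.log(p) + (1 - y) * math.log(1 - p))
    return total / len(y_true)

# Toy labels (illustrative only); any mix of 0s and 1s gives the same result
# when every prediction is exactly 0.5.
labels = [0, 1, 1, 0, 1, 0]
constant_half = [0.5] * len(labels)

loss = mean_binary_crossentropy(labels, constant_half)
print(round(loss, 4))        # 0.6931
print(round(math.log(2), 4)) # 0.6931
```

If that reading applies here, the flat loss and the roughly 50% accuracy would both be consistent with the fully unfrozen network predicting about 0.5 for every image, although the log alone does not confirm it.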
Thanks in advance and excuse me if my question doesn't make sense. I'm very new to this.