
I'm trying to solve a face anti-spoofing problem using a pre-trained model (e.g., VGG trained on ImageNet). After which layer should I retrieve the features? More specifically, is it enough to change the output of the last fully connected layer from 2622 to 2, since in face anti-spoofing we have two classes (real/fake)?

Also, is it actually efficient to use a pre-trained VGG-Face model (which was trained on ImageNet) for a face anti-spoofing problem? And is there any tutorial or GitHub code that could help me achieve this in Python?

Aj.h

1 Answer


Maybe too late to answer but better late than never.

It depends on how many samples your dataset has. Generally, a pre-trained model is suggested when you have a limited amount of data and/or want to avoid overfitting while still extracting most of the useful features from your samples for higher accuracy. If you are using Keras, try VGG16:

from keras.applications import VGG16

conv_base = VGG16(weights="imagenet",
                  include_top=False,
                  input_shape=(150, 150, 3))  # change the shape accordingly

It gives you the following layer stack (the output of conv_base.summary()):

Layer (type)                 Output Shape              Param #   
=================================================================
input_1 (InputLayer)         (None, 150, 150, 3)       0         
_________________________________________________________________
block1_conv1 (Conv2D)        (None, 150, 150, 64)      1792      
_________________________________________________________________
block1_conv2 (Conv2D)        (None, 150, 150, 64)      36928     
_________________________________________________________________
block1_pool (MaxPooling2D)   (None, 75, 75, 64)        0         
_________________________________________________________________
block2_conv1 (Conv2D)        (None, 75, 75, 128)       73856     
_________________________________________________________________
block2_conv2 (Conv2D)        (None, 75, 75, 128)       147584    
_________________________________________________________________
block2_pool (MaxPooling2D)   (None, 37, 37, 128)       0         
_________________________________________________________________
block3_conv1 (Conv2D)        (None, 37, 37, 256)       295168    
_________________________________________________________________
block3_conv2 (Conv2D)        (None, 37, 37, 256)       590080    
_________________________________________________________________
block3_conv3 (Conv2D)        (None, 37, 37, 256)       590080    
_________________________________________________________________
block3_pool (MaxPooling2D)   (None, 18, 18, 256)       0         
_________________________________________________________________
block4_conv1 (Conv2D)        (None, 18, 18, 512)       1180160   
_________________________________________________________________
block4_conv2 (Conv2D)        (None, 18, 18, 512)       2359808   
_________________________________________________________________
block4_conv3 (Conv2D)        (None, 18, 18, 512)       2359808   
_________________________________________________________________
block4_pool (MaxPooling2D)   (None, 9, 9, 512)         0         
_________________________________________________________________
block5_conv1 (Conv2D)        (None, 9, 9, 512)         2359808   
_________________________________________________________________
block5_conv2 (Conv2D)        (None, 9, 9, 512)         2359808   
_________________________________________________________________
block5_conv3 (Conv2D)        (None, 9, 9, 512)         2359808   
_________________________________________________________________
block5_pool (MaxPooling2D)   (None, 4, 4, 512)         0         
=================================================================
Total params: 14,714,688
Trainable params: 14,714,688
Non-trainable params: 0

To use this model you have two choices. The first is to run your images through this model once, extract the features, and save them to disk; then, in a second step, build your densely connected layers and feed them the saved features. This approach is much faster than the second one explained below, but its drawback is that you can't use data augmentation. This is how you extract the features using the predict method of conv_base:

features_batch = conv_base.predict(inputs_batch)
# Save the features in a tensor and feed them to the Dense Layer after all has been extracted
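A fuller, self-contained sketch of this first approach, with random placeholder data standing in for a real/fake face dataset (the shapes follow the (150, 150, 3) input above; the variable names and hyperparameters here are illustrative, not part of the original answer):

```python
import numpy as np
from keras.applications import VGG16
from keras import models, layers

# Convolutional base, same configuration as above
conv_base = VGG16(weights="imagenet", include_top=False,
                  input_shape=(150, 150, 3))

# Placeholder data -- replace with your real/fake face images and 0/1 labels
images = np.random.rand(8, 150, 150, 3).astype("float32")
labels = np.random.randint(0, 2, size=(8,))

# Extract the features once; no augmentation is possible with this approach
features = conv_base.predict(images, verbose=0)   # shape: (8, 4, 4, 512)
features = features.reshape(len(images), 4 * 4 * 512)

# Small densely connected classifier trained on the saved features
classifier = models.Sequential()
classifier.add(layers.Dense(256, activation="relu"))
classifier.add(layers.Dense(1, activation="sigmoid"))  # real vs. fake
classifier.compile(optimizer="rmsprop", loss="binary_crossentropy",
                   metrics=["accuracy"])
classifier.fit(features, labels, epochs=1, batch_size=4, verbose=0)
```

In a real pipeline you would run the extraction over your whole dataset in batches and save the feature tensors to disk before training the classifier.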

The second choice is to attach your densely connected model on top of the VGG model, freeze the conv_base layers, and feed your data to the network normally. This way you can use data augmentation, but only choose it if you have access to a powerful GPU or a cloud instance. Here is how to freeze conv_base and connect your Dense layers on top of VGG:

# code adapted from the "Deep Learning with Python" book
from keras import models
from keras import layers
conv_base.trainable = False
model = models.Sequential()
model.add(conv_base)
model.add(layers.Flatten())
model.add(layers.Dense(256, activation='relu'))
model.add(layers.Dense(1, activation='sigmoid'))
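Here is a minimal, self-contained sketch of this frozen-base setup trained on one random placeholder batch. In practice you would feed augmented real/fake batches (e.g. from ImageDataGenerator) instead; the single train_on_batch call is only there to show that just the Dense head gets updated:

```python
import numpy as np
from keras.applications import VGG16
from keras import models, layers

# Frozen VGG16 base with a small dense head on top, as above
conv_base = VGG16(weights="imagenet", include_top=False,
                  input_shape=(150, 150, 3))
conv_base.trainable = False

model = models.Sequential()
model.add(conv_base)
model.add(layers.Flatten())
model.add(layers.Dense(256, activation="relu"))
model.add(layers.Dense(1, activation="sigmoid"))
model.compile(optimizer="rmsprop", loss="binary_crossentropy",
              metrics=["accuracy"])

# Random placeholder batch -- substitute your (augmented) real/fake images
x = np.random.rand(4, 150, 150, 3).astype("float32")
y = np.random.randint(0, 2, size=(4,))
model.train_on_batch(x, y)  # only the two Dense layers are updated
```

Because conv_base is frozen, only the Dense kernel and bias tensors (four weight tensors in total) are trainable.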

You can even fine-tune the model by unfreezing some of the top layers of conv_base so they adapt to your data. Note that the loop below unfreezes block5_conv1 and every layer after it, keeping everything before it frozen:

conv_base.trainable = True
set_trainable = False
for layer in conv_base.layers:
    if layer.name == 'block5_conv1':
        set_trainable = True
    if set_trainable:
        layer.trainable = True
    else:
        layer.trainable = False
# your model like before
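One step the snippet above leaves implicit: after changing which layers are trainable, you must recompile the model, and for fine-tuning it is common practice to use a very low learning rate so the pre-trained weights are only nudged. A sketch of that step (the 1e-5 value and the RMSprop choice are illustrative, not from the original answer):

```python
from keras.applications import VGG16
from keras import models, layers, optimizers

conv_base = VGG16(weights="imagenet", include_top=False,
                  input_shape=(150, 150, 3))

# Unfreeze block5_conv1 and every layer after it, as in the loop above
conv_base.trainable = True
set_trainable = False
for layer in conv_base.layers:
    if layer.name == "block5_conv1":
        set_trainable = True
    layer.trainable = set_trainable

# Dense head on top, as before
model = models.Sequential()
model.add(conv_base)
model.add(layers.Flatten())
model.add(layers.Dense(256, activation="relu"))
model.add(layers.Dense(1, activation="sigmoid"))

# Recompile with a very low learning rate so fine-tuning makes only
# small adjustments to the pre-trained convolutional weights
model.compile(optimizer=optimizers.RMSprop(learning_rate=1e-5),
              loss="binary_crossentropy",
              metrics=["accuracy"])
```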

Hope it helps you get started.

Mohammad Siavashi