
I am developing a Siamese network for face recognition with Keras, on 224x224x3 images. The architecture of a Siamese network looks like this:

[Image: structure of a Siamese network]

For the CNN model, I am thinking of using the InceptionV3 model that comes pretrained in the keras.applications module.

#Assume all the other modules are imported correctly

from keras.applications.inception_v3 import InceptionV3

IMG_SHAPE=(224,224,3)

def return_siamese_net():

  left_input=Input(IMG_SHAPE)
  right_input=Input(IMG_SHAPE)

  model1=InceptionV3(include_top=False, weights="imagenet", input_tensor=left_input) #Left SubConvNet
  model2=InceptionV3(include_top=False, weights="imagenet", input_tensor=right_input) #Right SubConvNet

  #Do Something here

  distance_layer = #Do Something
  prediction = Dense(1, activation='sigmoid')(distance_layer) # Outputs 1 if the images match and 0 if they do not

  siamese_net = #Do Something  
  return siamese_net

model=return_siamese_net()
  

I get an error since the model is pretrained, and I am now stuck on implementing the distance layer for the twin network.

What should I add in between to make this Siamese Network work?

Sarath

1 Answer


A very important note before you use the distance layer: take into account that there is only one convolutional neural network.

The "shared weights" actually refer to this single network: the same weights are used when a pair of images is passed through it (depending on the loss function used) to compute the features and, subsequently, the embeddings of each input image.

You would have only one neural network, and the logic should look like this:

from keras import backend as K
from keras.layers import Input, Lambda, Dense
from keras.models import Model

def euclidean_distance(vectors):
    # Unpack the two feature vectors of the pair
    (features_A, features_B) = vectors
    # Sum of squared differences, kept as a (batch, 1) tensor
    sum_squared = K.sum(K.square(features_A - features_B), axis=1, keepdims=True)
    # K.epsilon() keeps the sqrt numerically stable at (near-)zero distances
    return K.sqrt(K.maximum(sum_squared, K.epsilon()))


image_A = Input(shape=...)
image_B = Input(shape=...)
# One extractor instance, applied to both inputs, so the weights are shared
feature_extractor = get_feature_extractor_model(shape=...)
features_A = feature_extractor(image_A)
features_B = feature_extractor(image_B)
distance = Lambda(euclidean_distance)([features_A, features_B])
outputs = Dense(1, activation="sigmoid")(distance)
siamese_model = Model(inputs=[image_A, image_B], outputs=outputs)
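Since the output is a single sigmoid unit (1 for a matching pair, 0 for a non-matching one), binary cross-entropy is the natural training loss here. A minimal sketch, where pairs_A, pairs_B, and labels are hypothetical arrays of paired images and their 0/1 match labels:

siamese_model.compile(loss="binary_crossentropy", optimizer="adam", metrics=["accuracy"])
# pairs_A[i] and pairs_B[i] form the i-th pair; labels[i] is 1 for a match, 0 otherwise
siamese_model.fit([pairs_A, pairs_B], labels, batch_size=32, epochs=10)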

Of course, the feature extractor model can be a pretrained network from Keras/TensorFlow, with the top classification layer removed or replaced.
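As an illustration, here is a minimal sketch of such an extractor built on the pretrained InceptionV3 from the question; the function name matches the snippet above, and the 128-dimensional embedding size is an arbitrary assumption:

from keras.applications.inception_v3 import InceptionV3
from keras.layers import GlobalAveragePooling2D, Dense
from keras.models import Model

def get_feature_extractor_model(shape=(224, 224, 3)):
    # Pretrained backbone, without its ImageNet classification head
    base = InceptionV3(include_top=False, weights="imagenet", input_shape=shape)
    # Pool the convolutional feature maps into one vector per image
    pooled = GlobalAveragePooling2D()(base.output)
    # Project to a fixed-size embedding
    embedding = Dense(128, activation="relu")(pooled)
    return Model(inputs=base.input, outputs=embedding)

Note that this model is instantiated once and then called on both inputs, which is exactly what shares the weights. Instantiating InceptionV3 twice, as in the question, builds two independent networks, and is also the likely source of the error you saw, since both copies contain layers with the same fixed names (e.g. mixed0).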

The main logic should be like the one above. If you want to use triplet loss instead, that would require three inputs (anchor, positive, negative), as sketched below, but for a start I would recommend sticking to the basics.
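For reference, a triplet setup changes only the input side. A minimal sketch, reusing the single feature_extractor from above (the triplet loss itself is left out, since its exact form depends on the margin and distance you choose):

anchor = Input(shape=(224, 224, 3))
positive = Input(shape=(224, 224, 3))
negative = Input(shape=(224, 224, 3))

# The same extractor (hence the same weights) embeds all three images
emb_anchor = feature_extractor(anchor)
emb_positive = feature_extractor(positive)
emb_negative = feature_extractor(negative)

triplet_model = Model(inputs=[anchor, positive, negative],
                      outputs=[emb_anchor, emb_positive, emb_negative])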

Also, it would be a good idea to consult this documentation:

  1. https://www.pyimagesearch.com/2020/11/30/siamese-networks-with-keras-tensorflow-and-deep-learning/
  2. https://towardsdatascience.com/one-shot-learning-with-siamese-networks-using-keras-17f34e75bb3d
Timbus Calin
  • Is there any way to let the neural network learn the similarity of images itself, instead of using similarity functions like cosine similarity, Euclidean distance, or L1 distance? – Sarath Jan 10 '21 at 07:33
  • Since these are part of one-shot learning, the distance functions are different from the typical cross-entropy setup. There is also a contrastive loss, but lately the most used is the cosine similarity distance. – Timbus Calin Jan 10 '21 at 08:01
  • I used your idea in my model and it worked, but another issue has come up now. If you may, can you please look into it? Thanks in advance: https://stackoverflow.com/questions/65644501/why-does-my-cnn-not-predict-labels-as-expected – Sarath Jan 10 '21 at 08:12
  • I answered your question there. – Timbus Calin Jan 10 '21 at 08:23