I am fairly new to deep learning and image classification. I want to extract features from an image using VGG16 and pass them as input to my vit-keras model. Here is my code:
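import tensorflow as tf
import tensorflow_addons as tfa
from tensorflow.keras.applications.vgg16 import VGG16
# Load VGG16 without its classifier head and freeze its weights
vgg_model = VGG16(include_top=False, weights='imagenet', input_shape=(IMAGE_SIZE, IMAGE_SIZE, 3))
for layer in vgg_model.layers:
    layer.trainable = False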
from vit_keras import vit
vit_model = vit.vit_b16(
    image_size=IMAGE_SIZE,
    activation='sigmoid',
    pretrained=True,
    include_top=False,
    pretrained_top=False,
    classes=2)
model = tf.keras.Sequential([
    vgg_model,
    vit_model,
    tf.keras.layers.Flatten(),
    tf.keras.layers.Dense(512, activation=tfa.activations.gelu),
    tf.keras.layers.Dense(256, activation=tfa.activations.gelu),
    tf.keras.layers.Dense(64, activation=tfa.activations.gelu),
    tf.keras.layers.BatchNormalization(),
    tf.keras.layers.Dense(1, activation='sigmoid')
],
    name='vision_transformer')
model.summary()
But I get the following error:
ValueError: Input 0 of layer embedding is incompatible with the layer: expected axis -1 of input shape to have value 3 but received input with shape (None, 8, 8, 512)
I assume this error occurs where the VGG16 output is fed into the vit-keras model. How can I fix it?
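For reference, here is a rough sketch of the shape arithmetic that I think is behind the error. It is plain Python and does not touch the real models. The helper `vgg16_feature_shape`, the constant `VIT_EXPECTED_CHANNELS`, and the value `IMAGE_SIZE = 256` are my own assumptions for illustration (256 is only inferred from the 8x8 in the error message).

```python
def vgg16_feature_shape(image_size: int) -> tuple:
    """Approximate (H, W, C) output of VGG16 with include_top=False.

    Hypothetical helper: it only mirrors the downsampling arithmetic,
    it does not build or call the Keras model.
    """
    side = image_size
    for _ in range(5):          # VGG16 has five 2x2 max-pooling stages (stride 2)
        side //= 2
    return (side, side, 512)    # the last convolutional block has 512 filters


# Assumption: the ViT patch embedding expects RGB input, as the error
# message says ("expected axis -1 ... to have value 3").
VIT_EXPECTED_CHANNELS = 3

IMAGE_SIZE = 256  # assumption, inferred from the 8x8 feature map in the error

features = vgg16_feature_shape(IMAGE_SIZE)
print("VGG16 output shape:", features)
print("Channels match ViT input:", features[-1] == VIT_EXPECTED_CHANNELS)
```

If this reasoning is right, the ViT receives a 512-channel feature map of size 8x8 instead of a 3-channel image of size IMAGE_SIZE x IMAGE_SIZE, which would explain the mismatch reported for the embedding layer.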