
I need to train an image classifier using the InceptionV3 model from Keras. The images pass through 5 Conv2D layers and 2 MaxPool2D layers before entering the pre-trained InceptionV3 model. However, my code gives me the following error:

ValueError: Depth of input (64) is not a multiple of input depth of filter (3) for 'inception_v3_4/conv2d_123/convolution' (op: 'Conv2D') with input shapes: [?,2,2,64], [3,3,3,32]

I reckon the output shape from the previous layers is not compatible with the input shape required by Inception, but I am not able to solve it. Is it even possible to solve this error? I am a beginner in machine learning, and any light on this matter will be greatly appreciated.

My code is as follows:

from tensorflow.keras.applications import inception_v3
from tensorflow.keras.layers import Input, Conv2D, MaxPool2D, GlobalAveragePooling2D, Dense
from tensorflow.keras.models import Model
from tensorflow.keras.optimizers import Adam

inception_model = inception_v3.InceptionV3(weights='imagenet', include_top=False)

# freeze the pre-trained Inception weights
for layer in inception_model.layers:
    layer.trainable = False

input_layer = Input(shape=(224,224,3)) # image resolution is 224x224 pixels
x = Conv2D(128, (7, 7), padding='same', activation='relu', strides=(2, 2))(input_layer)
x = Conv2D(128, (7, 7), padding='same', activation='relu', strides=(2, 2))(x)
x = Conv2D(64, (7, 7), padding='same', activation='relu', strides=(2, 2))(x)
x = MaxPool2D((3, 3), padding='same', strides=(2, 2))(x)
x = Conv2D(64, (7, 7), padding='same', activation='relu', strides=(2, 2))(x)
x = Conv2D(64, (7, 7), padding='same', activation='relu', strides=(2, 2))(x)
x = MaxPool2D((4, 4), padding='same', strides=(2, 2))(x)

x = inception_model(x) # error is raised on this line
x = GlobalAveragePooling2D()(x)

predictions = Dense(11, activation='softmax')(x) # I have 11 classes of image to classify

model = Model(inputs=input_layer, outputs=predictions)

model.compile(optimizer=Adam(lr=0.001), loss='categorical_crossentropy', metrics=['acc'])
model.summary()
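
If you build everything up to the last MaxPool2D and stop before the inception_model(x) call, you can print the two shapes that collide (a quick check using the variables defined above):

print(x.shape)                      # (None, 2, 2, 64) after the last MaxPool2D
print(inception_model.input_shape)  # (None, None, None, 3): the first conv expects 3 channels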
Nihar Kashyap
  • What is the purpose of the first few layers? They basically scale your input down to dimension `(None, 2, 2, 64)`; at the very least, the feature map should have no more than 3 channels. However, even in that case the input would be almost useless. Hence the question about the purpose of these layers – CAFEBABE May 20 '20 at 20:55

1 Answer


As @CAFEBABE said, it would be almost useless to do this because the feature map can have at most 3 channels, but if you still want to try it, you can do this:

x = Conv2D(3, (7, 7), padding='same', activation='relu', strides=(2, 2))(input_layer)

Another thing to remember: you used 5 Conv2D and 2 MaxPooling layers above, but you can't do that. The Inception model itself contains Conv2D and max-pooling layers, so if your own layers downsample too aggressively the spatial dimensions shrink below what Inception can handle and you get an error. I tried with 2 stride-2 Conv2D layers and got an error, so at most you can use 1; the sketch below shows why.
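
To make the arithmetic concrete: each stride-2 stage with 'same' padding halves the spatial size (rounding up), and the Keras InceptionV3 with imagenet weights requires inputs no smaller than 75x75, so only one halving of a 224x224 image stays above that floor (a quick sketch of the bookkeeping; the 75x75 minimum is from the Keras documentation):

import math

size = 224
for layers_used in range(1, 4):
    size = math.ceil(size / 2)  # effect of one stride-2 Conv2D/MaxPool2D with 'same' padding
    print(layers_used, 'stride-2 layer(s):', f'{size}x{size}',
          'OK' if size >= 75 else 'too small for InceptionV3')
# 1 stride-2 layer(s): 112x112 OK
# 2 stride-2 layer(s): 56x56 too small for InceptionV3
# 3 stride-2 layer(s): 28x28 too small for InceptionV3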

Also, when you instantiate the InceptionV3 model, specify the input shape.

import tensorflow as tf
from tensorflow.keras.layers import Input, Conv2D, GlobalAveragePooling2D, Dense
from tensorflow.keras.models import Model
from tensorflow.keras.optimizers import Adam

input_layer = Input(shape=(224,224,3)) # image resolution is 224x224 pixels
# 3 filters, so the feature map keeps the 3 channels the imagenet weights expect
x = Conv2D(3, (7, 7), padding='same', activation='relu', strides=(2, 2))(input_layer)

# pass x's shape without the batch dimension, i.e. (112, 112, 3), not x.shape[0]
inception_model = tf.keras.applications.InceptionV3(weights='imagenet', include_top=False, input_shape=tuple(x.shape[1:]))
for layer in inception_model.layers:
    layer.trainable = False

x = inception_model(x)
x = GlobalAveragePooling2D()(x)

predictions = Dense(11, activation='softmax')(x) # I have 11 classes of image to classify

model = Model(inputs=input_layer, outputs=predictions)
model.compile(optimizer=Adam(learning_rate=0.001), loss='categorical_crossentropy', metrics=['acc'])
model.summary()

This should work, but I doubt it will help the model. Anyway, try it; who knows what will happen.
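
Before real training, you can sanity-check the wiring with random data (a hedged check; dummy_images and dummy_labels are placeholders, not a real dataset):

import numpy as np

dummy_images = np.random.rand(2, 224, 224, 3).astype('float32')
dummy_labels = np.zeros((2, 11), dtype='float32')
dummy_labels[:, 0] = 1.0  # fake one-hot labels for the 11 classes

print(model.predict(dummy_images).shape)  # (2, 11)
model.fit(dummy_images, dummy_labels, epochs=1, verbose=0)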

Vardan Agarwal
  • @CAFEBABE I was trying to implement the architecture shown in this paper: [link](https://spj.sciencemag.org/plantphenomics/2019/9237136/). Am I doing it the wrong way? – Nihar Kashyap May 21 '20 at 03:20