I'm trying to implement YOLOv1 in Keras. I use VGG16 as the backbone, which accepts (224, 224, 3) inputs, while my input is (448, 448, 3) (the size used by the authors), so I added a Conv2D and a MaxPooling2D layer in front of it to downsample 448 to 224. However, when I use plot_model, the input and output shapes of the Conv and MaxPooling layers in the middle are all shown as a question mark (?).
Here is my code:
import tensorflow as tf
from tensorflow.keras import layers, Input, Model
from tensorflow.keras.applications import VGG16

base_model = VGG16(
    input_shape=backbone_img_shape,
    # input_shape=img_shape,
    include_top=False,
    weights='imagenet')
# pdb.set_trace()
img_input = Input(shape=img_shape)
arch = layers.Conv2D(64, 3, padding='same', activation='relu')(img_input)
arch = layers.MaxPooling2D()(arch)
for i, layer in enumerate(base_model.layers[2:]):
    arch = layer(arch)
# arch = base_model(img_input)
arch = layers.Flatten()(arch)
arch = layers.Dense(4096, activation='relu')(arch)
arch = layers.Dropout(0.5)(arch)
arch = layers.Dense((grid_num_per_axis**2)*(num_bbox_per_grid*5 + n_cls))(arch)
arch = layers.Reshape(
    (grid_num_per_axis, grid_num_per_axis, num_bbox_per_grid*5 + n_cls))(arch)
model = Model(inputs=img_input, outputs=arch)
model.summary()
# pdb.set_trace()
model.compile(
    # optimizer=SGD(lr=0.001, momentum=0.9),
    optimizer='adam',
    loss=yolov1_loss)
tf.keras.utils.plot_model(model, to_file='haha.png', show_shapes=True)
where backbone_img_shape = (224, 224, 3) and img_shape = (448, 448, 3). I think something is wrong because, for example, the max_pooling2d layer has output shape (?, 224, 224, 64), and the later layers should have something similar, but instead they only show (?). Please help me, thank you very much.
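In case it helps, here is a minimal, self-contained sketch of what I suspect (but have not confirmed) is happening: the VGG16 layers were already called once inside base_model, and calling them again on my new tensor gives each of them a second inbound node, so there is no single output shape for summary()/plot_model() to display. The per-node shapes can still be queried with get_output_shape_at (at least in the tf.keras version I'm using):

from tensorflow.keras import layers, Input

# Stand-in for my setup: one conv layer called on two different inputs,
# which mirrors reusing the VGG16 layers on the new 448x448 graph.
conv = layers.Conv2D(8, 3, padding='same')

a = Input(shape=(224, 224, 3))
b = Input(shape=(112, 112, 3))
ya = conv(a)   # first call  -> inbound node 0
yb = conv(b)   # second call -> inbound node 1

# With two inbound nodes the layer has no unique output shape,
# but the shape at each node can still be inspected:
print(conv.get_output_shape_at(0))  # (None, 224, 224, 8)
print(conv.get_output_shape_at(1))  # (None, 112, 112, 8)

This is only my guess at the mechanism; I'd still like to know whether the way I reuse the VGG16 layers is correct.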