
I'm trying to implement YOLOv1 in Keras. I use VGG16 as the backbone, which accepts (224, 224, 3), while the input is (448, 448, 3) (the size used by the authors), so I added a Conv2D and a MaxPooling2D layer to reduce 448 to 224. However, when I use plot_model, the input and output shapes of the Conv and MaxPooling layers in the middle are all shown as question marks (?).

[plot_model output screenshot: the later layers' shapes are shown as (?)]

Here is my code:

    import tensorflow as tf
    from tensorflow.keras import layers
    from tensorflow.keras.applications import VGG16
    from tensorflow.keras.layers import Input
    from tensorflow.keras.models import Model

    base_model = VGG16(
        input_shape=backbone_img_shape,
        # input_shape=img_shape,
        include_top=False,
        weights='imagenet')
    # pdb.set_trace()

    # Extra stem: Conv2D (padding='same') keeps 448x448, MaxPooling2D halves it
    # to the 224x224 that VGG16 expects
    img_input = Input(shape=img_shape)
    arch = layers.Conv2D(64, 3, padding='same', activation='relu')(img_input)
    arch = layers.MaxPooling2D()(arch)

    # Re-apply the pretrained VGG16 layers, skipping its InputLayer and first Conv2D
    for i, layer in enumerate(base_model.layers[2:]):
        arch = layer(arch)
    # arch = base_model(img_input)

    # YOLOv1 detection head
    arch = layers.Flatten()(arch)
    arch = layers.Dense(4096, activation='relu')(arch)
    arch = layers.Dropout(0.5)(arch)
    arch = layers.Dense((grid_num_per_axis**2) * (num_bbox_per_grid*5 + n_cls))(arch)
    arch = layers.Reshape(
        (grid_num_per_axis, grid_num_per_axis, num_bbox_per_grid*5 + n_cls))(arch)

    model = Model(inputs=img_input, outputs=arch)
    model.summary()
    # pdb.set_trace()
    model.compile(
        # optimizer=SGD(lr=0.001, momentum=0.9),
        optimizer='adam',
        loss=yolov1_loss)
    tf.keras.utils.plot_model(model, to_file='haha.png', show_shapes=True)

where backbone_img_shape=(224, 224, 3) and img_shape=(448, 448, 3). I think something is wrong because, for example, the max_pooling2d layer has an output shape of (?, 224, 224, 64), and the later layers should have something similar, but instead they only show (?). Please help me, thank you very much.
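
For reference, one check I could add (just a sketch, not in my original code) is printing the static shape of each intermediate tensor inside the loop, using tf.keras.backend.int_shape:

    from tensorflow.keras import backend as K

    # Same loop as above, but also printing each reused layer's name and the
    # static shape of the tensor it produces. I would expect something like
    # (None, 224, 224, 64) right after the new MaxPooling2D layer.
    for i, layer in enumerate(base_model.layers[2:]):
        arch = layer(arch)
        print(layer.name, K.int_shape(arch))

If these prints show the expected shapes, then I would guess the (?) in the plot is only a display issue rather than a real shape problem, but I'm not sure.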
