2020.06.09
There are 700 images for training, each of them extract 64 rois and make a mini-batch, when batch-size is set to 2, it cast 350 steps to complete training, but for RCNN, each target is extracted as a single image resized to 224*224, there will be 64*700=44800 images, each of which contains more information and features than a 7*7 pooled feature map and I guess that's why it seems under-fitting though RCNN could be train well on same data.
==========================================================================
Use fully balanced data, and acc drops to 0.53 (training data)
[0.5233287 0.4766713] not plane
[0.5281736 0.4718264] not plane
[0.53316545 0.46683457] not plane
[0.5287853 0.4712147] not plane
[0.52475226 0.47524777] not plane
[0.5293444 0.4706556] not plane
[0.52849627 0.47150376] not plane
[0.52786124 0.4721388 ] not plane
[0.52475226 0.47524777] not plane
[0.53224194 0.4677581 ] not plane
[0.5313732 0.4686268] not plane
[0.528143 0.47185704] not plane
[0.5233287 0.4766713] not plane
[0.5233839 0.4766161] not plane
[0.525427 0.47457302] not plane
[0.51949245 0.48050752] not plane
[0.52733606 0.47266394] not plane
[0.5268566 0.47314337] not plane
[0.52158654 0.47841352] not plane
[0.5412768 0.45872322] not plane
[0.5277719 0.47222808] not plane
[0.5223139 0.4776861] not plane
[0.5289101 0.47108996] not plane
[0.5207478 0.47925228] not plane
[0.52475226 0.47524777] not plane
[0.53407675 0.46592325] not plane
[0.53204036 0.4679596 ] not plane
[0.52786124 0.4721388 ] not plane
[0.52574503 0.47425497] not plane
[0.5271339 0.47286615] not plane
[0.5224281 0.4775719] not plane
[0.5233839 0.4766161] not plane
[0.5196227 0.48037735] not plane
[0.52554363 0.47445634] not plane
[0.52554363 0.47445634] not plane
[0.5446083 0.45539168] not plane
[0.53676397 0.46323603] not plane
[0.53944343 0.46055657] not plane
[0.520972 0.479028] not plane
[0.5492453 0.45075467] not plane
[0.52860624 0.47139376] not plane
[0.5273249 0.4726751] not plane
[0.52752113 0.4724789 ] not plane
[0.52902967 0.47097033] not plane
[0.5307333 0.46926668] not plane
[0.5322479 0.46775213] not plane
[0.53944343 0.46055657] not plane
[0.5499064 0.4500937] not plane
[0.5403881 0.4596119] not plane
[0.5203569 0.47964308] not plane
[0.52871954 0.47128052] not plane
[0.53245085 0.46754912] not plane
[0.5324656 0.4675344] not plane
[0.519246 0.48075405] not plane
[0.5299878 0.47001216] not plane
[0.527601 0.47239903] not plane
[0.5228142 0.4771858] not plane
[0.53725046 0.46274957] not plane
I think this network was just guessing but not learning...
==========================================================================
2020.06.08
I follow this structure used in many repos in GitHub, but the acc wont improve:
def build_model():
pooled_square_size = 7
num_rois = 32
roi_input = Input(shape=(num_rois, 4), name="input_2")
model_cnn = tf.keras.applications.VGG16(
include_top=True,
weights='imagenet'
)
x = model_cnn.layers[17].output
x = RoiPoolingConv(pooled_square_size, roi_input.shape[1])([x, roi_input])
x = TimeDistributed(Flatten())(x)
x = TimeDistributed(Dense(4096, activation='selu'))(x)
x = TimeDistributed(Dropout(0.5))(x)
x = TimeDistributed(Dense(4096, activation='selu'))(x)
x = TimeDistributed(Dropout(0.5))(x)
x = TimeDistributed(Dense(2, activation='softmax', kernel_initializer='zero'))(x)
model_final = Model(inputs=[model_cnn.input, roi_input], outputs=x)
opt = Adam(lr=0.0001)
model_final.compile(
loss=tf.keras.losses.CategoricalCrossentropy(),
optimizer=opt,
metrics=["accuracy"]
)
model_final.save("TrainedModels" + slash + "FastRCNN.h5")
Training Logs:
100/100 [==============================] - ETA: 0s - loss: 0.5556 - accuracy: 0.7681
Epoch 00001: saving model to TrainedModels\FastRCNN.h5
100/100 [==============================] - 41s 412ms/step - loss: 0.5556 - accuracy: 0.7681
Epoch 2/100
100/100 [==============================] - ETA: 0s - loss: 0.5223 - accuracy: 0.7910
Epoch 00002: saving model to TrainedModels\FastRCNN.h5
100/100 [==============================] - 41s 414ms/step - loss: 0.5223 - accuracy: 0.7910
Epoch 3/100
100/100 [==============================] - ETA: 0s - loss: 0.5340 - accuracy: 0.7797
Epoch 00003: saving model to TrainedModels\FastRCNN.h5
100/100 [==============================] - 42s 416ms/step - loss: 0.5340 - accuracy: 0.7797
Epoch 4/100
100/100 [==============================] - ETA: 0s - loss: 0.5309 - accuracy: 0.7825
Epoch 00004: saving model to TrainedModels\FastRCNN.h5
100/100 [==============================] - 43s 427ms/step - loss: 0.5309 - accuracy: 0.7825
Epoch 5/100
100/100 [==============================] - ETA: 0s - loss: 0.5257 - accuracy: 0.7840
Epoch 00005: saving model to TrainedModels\FastRCNN.h5
100/100 [==============================] - 43s 434ms/step - loss: 0.5257 - accuracy: 0.7840
Epoch 6/100
100/100 [==============================] - ETA: 0s - loss: 0.5181 - accuracy: 0.7928
Epoch 00006: saving model to TrainedModels\FastRCNN.h5
100/100 [==============================] - 42s 423ms/step - loss: 0.5181 - accuracy: 0.7928
Epoch 7/100
100/100 [==============================] - ETA: 0s - loss: 0.5483 - accuracy: 0.7712
Epoch 00007: saving model to TrainedModels\FastRCNN.h5
100/100 [==============================] - 42s 418ms/step - loss: 0.5483 - accuracy: 0.7712
Epoch 8/100
100/100 [==============================] - ETA: 0s - loss: 0.5282 - accuracy: 0.7832
Epoch 00008: saving model to TrainedModels\FastRCNN.h5
100/100 [==============================] - 43s 429ms/step - loss: 0.5282 - accuracy: 0.7832
Epoch 9/100
100/100 [==============================] - ETA: 0s - loss: 0.5385 - accuracy: 0.7765
Epoch 00009: saving model to TrainedModels\FastRCNN.h5
Reference:
==========================================================================
I wrote a bi-classification model for detect airplane in photos based on Fast-RCNN, train datasets are generated by Selective Search, when I use datasets with a Negative/Positive ratio about 1, the model could only have about 0.6 acc on train datasets, when I make N/P ratio higher and closer to its origin ratio generated by Selective Search, the train acc can reach 0.9 but it performs badly when used to predict test datasets. During training, the train acc always be the same after epoch completes, when I use TensorBoard, I see weights of layers do not change after epochs: TensorBoard Histogram of Weights
This is the basic structure of of my model, the feature extraction was VGG16 and output a 28*28 feature map to ROI Pooling layer, I try to change activation from ReLu to SeLu, but it didn't work: Model Structure
These are the input image and its feature maps (28*28*512) before and after (32*14*14*512) ROI Pooling layer:
One Typical Feature Map of ROI After ROI_P
Another Typical Feature Map of ROI After ROI_P
and I used this code to generate this model:
def build_model():
num_rois = 32
roi_input = Input(shape=(num_rois, 4), name="input_2")
model_cnn = tf.keras.applications.VGG16(
include_top=True,
weights='imagenet'
)
x = model_cnn.layers[13].output
x = RoiPoolingConv(pooled_square_size, roi_input.shape[1])([x, roi_input])
for layer in model_cnn.layers[15:]:
x = TimeDistributed(layer)(x)
x = TimeDistributed(Dense(512, activation='sigmoid'))(x)
x = TimeDistributed(Dense(2, activation='softmax'))(x)
model_final = Model(inputs=[model_cnn.input, roi_input], outputs=x)
opt = Adam(lr=0.0001)
model_final.compile(
loss=tf.keras.losses.BinaryCrossentropy(),
optimizer=opt,
metrics=["accuracy"]
)
model_final.save("TrainedModels" + slash + "FastRCNN.h5")
The full code can be seen here: Github Repo
I've tried add BatchNormalization, adjust LR, or simply add more layers, but the model do not improve a bit at all, I eagerly look forward to somebody who can tell me the key flaw in this model so I can have further test on it, Thank you!
I highly suspect this VGG16 has something weird:
This is an input image:
This is its corresponding output feature map