I am trying to train a simple MobileNetV3Small from keras.applications, as shown below:
# Imports assumed here; they were not part of the original snippet.
import os
import tensorflow as tf
from tensorflow import keras
from tensorflow.keras.preprocessing.image import ImageDataGenerator

# Backbone without the classification head, trained from scratch.
base_model = keras.applications.MobileNetV3Small(
    input_shape=INPUT_SHAPE,
    alpha=.125,
    include_top=False,
    classes=1,
    dropout_rate=0.2,
    weights=None)

# Binary classification head on top of the backbone.
x = keras.layers.Flatten()(base_model.output)
preds = keras.layers.Dense(1, activation="sigmoid")(x)
model = keras.Model(inputs=base_model.input, outputs=preds)

model.compile(loss="binary_crossentropy",
              optimizer='RMSprop',
              metrics=["binary_accuracy"])

# Training data with augmentation.
train_datagen = ImageDataGenerator(
    rescale=1.0 / 255,
    rotation_range=40,
    horizontal_flip=True,
    vertical_flip=True,
)
train_generator = train_datagen.flow_from_directory(
    os.path.join(DATA_ROOT, 'train'),
    target_size=(56, 56),
    batch_size=128,
    class_mode="binary",
)

# Validation data, rescaled only.
validation_datagen = ImageDataGenerator(rescale=1.0 / 255)
validation_generator = validation_datagen.flow_from_directory(
    os.path.join(DATA_ROOT, 'val'),
    target_size=(56, 56),
    batch_size=128,
    class_mode="binary",
)

# Keep only the weights with the best validation accuracy.
model_checkpoint_callback = keras.callbacks.ModelCheckpoint(
    filepath=SAVE_DIR,
    save_weights_only=True,
    monitor='val_binary_accuracy',
    mode='max',
    save_best_only=True)
es_callback = keras.callbacks.EarlyStopping(patience=10)

model.fit(train_generator,
          epochs=100,
          validation_data=validation_generator,
          callbacks=[model_checkpoint_callback, es_callback],
          shuffle=True)
During training I get a validation accuracy of around 0.94. But when I call model.evaluate on the exact same validation data, the accuracy drops to 0.48. When I call model.predict with any data, it outputs a constant value of 0.51...
There is nothing wrong with the learning rate, optimizer, or metrics. What could be wrong here?
EDIT:
After training, when I run
pred_results = model.evaluate(validation_generator)
print(pred_results)
it gives me this output for a network trained for 1 epoch:
6/6 [==============================] - 1s 100ms/step - loss: 0.6935 - binary_accuracy: 0.8461
However, when I save and load the model with either model.save() or tf.keras.models.save_model(), the output becomes something like this:
6/6 [==============================] - 2s 100ms/step - loss: 0.6935 - binary_accuracy: 0.5028
[0.6935192346572876, 0.5027709603309631]
and the output of model.predict(validation_generator) is:
[[0.5080832] [0.5080832] [0.5080832] [0.5080832] . . . [0.5080832] [0.5080832]]
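One thing I want to rule out is the difference between how a BatchNormalization layer normalizes in training mode (batch statistics) and in inference mode (stored moving statistics). Below is a minimal NumPy sketch of that difference. The batch_norm helper, the made-up activations, and the "stale" moving statistics are all illustrative assumptions, not values taken from my model, and this does not prove that this is what happens here.

```python
import numpy as np

def batch_norm(x, moving_mean, moving_var, training, eps=1e-3):
    """Per-feature normalization (scale and offset omitted for brevity)."""
    if training:
        # Training mode: use the statistics of the current batch.
        mean, var = x.mean(axis=0), x.var(axis=0)
    else:
        # Inference mode: use the stored moving statistics.
        mean, var = moving_mean, moving_var
    return (x - mean) / np.sqrt(var + eps)

rng = np.random.default_rng(0)
# Made-up activations whose true mean (5) is far from the initial moving mean (0).
x = rng.normal(loc=5.0, scale=2.0, size=(128, 4))

# Hypothetical moving statistics that never moved away from their initial values.
stale_mean = np.zeros(4)
stale_var = np.ones(4)

train_out = batch_norm(x, stale_mean, stale_var, training=True)
infer_out = batch_norm(x, stale_mean, stale_var, training=False)

print("training-mode output mean: ", train_out.mean(axis=0))
print("inference-mode output mean:", infer_out.mean(axis=0))
```

In this toy setup the training-mode output is centered at zero, while the inference-mode output stays far from zero because the stored statistics do not describe the data.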
What I've tried so far:
- Used tf.keras.utils.image_dataset_from_directory() instead of ImageDataGenerator.
- Fixed the TensorFlow and NumPy seeds globally.
- Found a similar problem in another SO post, and decreased the momentum parameter of the MobileNet BatchNormalization layers one by one:
for layer in model.layers[0].layers:
    if type(layer) is tf.keras.layers.BatchNormalization:
        layer.momentum = 0.9
The first two attempts had no effect. After applying the third, I no longer get the same prediction for every input. However, evaluate() and predict() still yield different accuracy values.
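To illustrate why the momentum value could matter, here is a minimal sketch of the exponential moving average that a BatchNormalization layer keeps for its inference statistics (moving = momentum * moving + (1 - momentum) * batch_statistic). The helper name, the step count of 100, the constant batch statistic of 1.0, and the momentum values 0.999 and 0.9 are illustrative assumptions, not numbers read from my model.

```python
def moving_stat_after(steps, momentum, batch_stat=1.0, initial=0.0):
    """Repeatedly apply moving = momentum * moving + (1 - momentum) * batch_stat."""
    moving = initial
    for _ in range(steps):
        moving = momentum * moving + (1.0 - momentum) * batch_stat
    return moving

# Compare a very high momentum with the lowered value of 0.9.
for momentum in (0.999, 0.9):
    value = moving_stat_after(steps=100, momentum=momentum)
    print(f"momentum={momentum}: moving statistic after 100 steps = {value:.4f}")
```

With a constant batch statistic of 1.0, the high-momentum average covers only about a tenth of the distance after 100 steps, while the 0.9 average has essentially converged. That would be consistent with lowering the momentum changing the predictions after a short training run.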