
I wrote Keras code to train GoogleNet. However, the accuracy reported by fit() reaches 100%, while evaluate() on the same training dataset reports only 25% accuracy, which is a huge discrepancy. Also, unlike the accuracy from fit(), the accuracy from evaluate() does not improve with more training; it stays at roughly 25%.

Does anyone have any idea what is wrong here?

# Training dataset and labels are given. Load the trained GoogleNet (InceptionV3) model
from keras.models import load_model
model = load_model('FT_InceptionV3.h5')

# Training phase
model.fit(x=X_train,
          y=y_train,
          batch_size=5,
          epochs=20,
          validation_split=0,
          # callbacks=[tensorboard]
          )

# Testing phase
train_loss, train_acc = model.evaluate(X_train, y_train, verbose=1)
print("Train loss =", train_loss, "Train accuracy =", train_acc)

Training Result

Testing Result

Hiro
  • @Justice_Lords To see whether the model is overfitting, he/she should evaluate on the test or dev set, and that's not the case here; he/she is checking the accuracy on the training set itself. I don't think this will fix it, but you can try predicting on the train set and using `accuracy_score` from scikit-learn, for example, just to see whether the evaluate function has a bug or something (a sketch of this check is given after these comments). – OSainz Mar 11 '19 at 08:05
  • This might help (e.g. setting proper seeds etc.): https://machinelearningmastery.com/reproducible-results-neural-networks-keras/ – Matt Mar 11 '19 at 08:12
  • ConvNets refusing to learn is in most cases due either to improper scaling/preprocessing of your input images or to the wrong learning_rate for the optimizer (e.g. LR too big or too small). Hope that helps. – Matt Mar 11 '19 at 08:14
  • @Justice_Lords Even if the model were overfitting, shouldn't the accuracies from fit() and evaluate() be the same when the same dataset is used? – Hiro Mar 11 '19 at 08:42
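As suggested in the comment above, a quick way to rule out a bug in evaluate() is to score the model's own predictions on the training set with scikit-learn. A minimal sketch, assuming y_train is one-hot encoded (if it holds integer class ids instead, drop the argmax on y_train):

import numpy as np
from sklearn.metrics import accuracy_score

pred_probs = model.predict(X_train)              # per-class probabilities
pred_labels = np.argmax(pred_probs, axis=1)      # predicted class ids
true_labels = np.argmax(y_train, axis=1)         # true class ids from one-hot labels

print("accuracy_score on the training set:", accuracy_score(true_labels, pred_labels))
# If this roughly matches model.evaluate(), evaluate() itself is not the problem.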

1 Answer


After some digging into Keras issues, I found this.

The reason is that when you use fit, the weights are updated at the end of every batch of training data. The loss (and accuracy) reported by fit is therefore not computed with the final model; it is the mean over all of the slightly different models used on each batch.

On the other hand, when you use evaluate, the same (final) model is used on the whole dataset. That model's metrics never actually appear in the numbers reported by fit, because even the loss computed on the last batch of training is used to update the model's weights one more time.

To sum everything up, fit and evaluate have two completely different behaviours.
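Below is a minimal sketch (with a small made-up dense model and random data, not the asker's InceptionV3 setup) of how to see this difference yourself: the accuracy printed by fit() for an epoch is averaged over batches computed while the weights are still changing, whereas evaluate() makes a single pass over the same data with the finished model.

import numpy as np
from keras.models import Sequential
from keras.layers import Dense

# Toy data: 200 samples, 10 features, binary label
X = np.random.rand(200, 10)
y = (X.sum(axis=1) > 5).astype("float32")

model = Sequential([
    Dense(16, activation="relu", input_shape=(10,)),
    Dense(1, activation="sigmoid"),
])
model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])

# fit() reports, per epoch, the metric averaged over all batches,
# and the weights change after every batch.
history = model.fit(X, y, batch_size=5, epochs=3, verbose=0)
acc_key = "accuracy" if "accuracy" in history.history else "acc"  # key name depends on Keras version
print("fit() accuracy (last epoch):", history.history[acc_key][-1])

# evaluate() makes a single pass over the data with the final, fixed weights.
_, eval_acc = model.evaluate(X, y, verbose=0)
print("evaluate() accuracy:", eval_acc)

The two printed numbers will generally differ, and the gap is largest early in training, when the weights move a lot from batch to batch.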

References:

  1. Keras issues thread
  2. Keras official documentation
Justice_Lords