
I developed a simple CNN model for the MNIST dataset and got 98% validation accuracy. But after saving the model through Keras as model.h5 and evaluating the saved model's inference in another Jupyter session, the model's performance is poor and the predictions are random.

What needs to be done to get the same accuracy after saving and loading the model in a different Jupyter notebook session?

Aqib

1 Answer


(Consider sharing your code/results so the community can help you better).

I'm assuming you're using TensorFlow/Keras, so calling model.save('my_model.h5') after your model.fit(...) should save the model, including the trained parameters (but not the internal optimizer state, i.e. gradients, etc., which shouldn't affect the prediction capabilities of the model).
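To illustrate, here is a minimal sketch of the save/reload round trip, assuming TensorFlow/Keras is available; the tiny architecture and random inputs below are placeholders, not your actual model. If the round trip works, the reloaded model's predictions should match the original's exactly:

```python
# Sketch: saving a Keras model and reloading it (as you would in a new
# Jupyter session) to verify the weights survive the round trip.
import numpy as np
import tensorflow as tf

# Placeholder model standing in for your MNIST CNN.
model = tf.keras.Sequential([
    tf.keras.layers.Input(shape=(28, 28, 1)),
    tf.keras.layers.Flatten(),
    tf.keras.layers.Dense(10, activation="softmax"),
])
model.compile(optimizer="adam", loss="sparse_categorical_crossentropy")

x = np.random.rand(4, 28, 28, 1).astype("float32")
before = model.predict(x, verbose=0)

model.save("my_model.h5")  # HDF5 file: architecture + trained weights

# --- conceptually, this part runs in the second notebook session ---
reloaded = tf.keras.models.load_model("my_model.h5")
after = reloaded.predict(x, verbose=0)

# Predictions should be numerically identical after reloading.
match = np.allclose(before, after, atol=1e-5)
```

One common pitfall this sketch highlights: the second session must apply exactly the same input preprocessing (e.g. the same pixel scaling) as the training session, since the saved file contains only the model, not your preprocessing code.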

A number of things could cause a generalization gap like that, but...

  • Case 1: having a high training/validation accuracy and a low test (prediction) accuracy typically means your model overfit on the given training data.

    I suggest adding some regularization to your training phase (dropout layers, cutout augmentation, L1/L2, etc.), using fewer epochs or early stopping, or applying cross-validation/data reshuffling to rule out overfitting.

  • Case 2: low intrinsic dataset variance, but unless you're using a subset of MNIST, this is unlikely. Make sure you are properly splitting your training/validation/test sets.
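A minimal sketch combining two of the suggestions above (a dropout layer and early stopping) together with an explicit train/validation/test split, assuming TensorFlow/Keras; the tiny random arrays stand in for MNIST and the layer sizes are illustrative:

```python
# Sketch: dropout + early stopping with a proper train/val/test split.
import numpy as np
import tensorflow as tf

# Placeholder data standing in for MNIST; split before any training.
x = np.random.rand(60, 28, 28, 1).astype("float32")
y = np.random.randint(0, 10, size=60)
x_train, x_val, x_test = x[:40], x[40:50], x[50:]
y_train, y_val, y_test = y[:40], y[40:50], y[50:]

model = tf.keras.Sequential([
    tf.keras.layers.Input(shape=(28, 28, 1)),
    tf.keras.layers.Flatten(),
    tf.keras.layers.Dense(32, activation="relu"),
    tf.keras.layers.Dropout(0.5),  # regularization: randomly drop units
    tf.keras.layers.Dense(10, activation="softmax"),
])
model.compile(optimizer="adam",
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])

# Stop when validation loss stops improving; keep the best weights.
early_stop = tf.keras.callbacks.EarlyStopping(
    monitor="val_loss", patience=2, restore_best_weights=True)

history = model.fit(x_train, y_train,
                    validation_data=(x_val, y_val),
                    epochs=5, verbose=0, callbacks=[early_stop])

# Evaluate on the held-out test set, never seen during training.
test_loss, test_acc = model.evaluate(x_test, y_test, verbose=0)
```

With real MNIST data, a persistent gap between training accuracy and `test_acc` here would confirm the overfitting diagnosis in Case 1.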

Again, it could be any number of issues, but these are the most common causes of poor model generalization. Post your code (specifying the architecture, optimizer, hyperparameters, data preprocessing, and test data used) so the answers can be more relevant to your problem.

ThunderStruct