How to train new handwritten digits into an existing Keras model?

Question

I have an existing model that was trained to recognize handwritten digits. Now I have a new sample digit that I want to train into that model as well. Is there any way to do this?

  import os
  import cv2 as cv
  import numpy as np
  import matplotlib.pyplot as plt
  import tensorflow as tf

  ### train model ###
  mnist = tf.keras.datasets.mnist
  (x_train, y_train), (x_test, y_test) = mnist.load_data()
  x_train = x_train/255
  x_test = x_test/255

  model = tf.keras.models.Sequential()
  model.add(tf.keras.layers.Flatten(input_shape=(28,28)))
  model.add(tf.keras.layers.Dense(units=128,activation=tf.nn.relu))
  model.add(tf.keras.layers.Dense(units=10,activation=tf.nn.softmax))

  model.compile(optimizer="adam", loss="sparse_categorical_crossentropy", metrics=["accuracy"])

  model.fit(x_train, y_train, epochs=5)
  model.save('traineddata.model')

My new sample is number 1 like this:

Are you looking to build your own data set? If so, what have you tried? — Innat, May 14 '21 at 10:03
I still don't have an idea for the approach, so I need some hints about the procedure... — Hung Dang, May 17 '21 at 00:26

furas · Accepted Answer · 2021-05-24T11:06:19.097

You could load the image, convert it to grayscale, resize it to (28, 28),
wrap it in a training array that holds one example, and pass that to fit()

x_example = cv2.imread('image.png')
x_example = cv2.cvtColor(x_example, cv2.COLOR_BGR2GRAY)
x_example = cv2.resize(x_example, (28, 28))

y_example = 1

x_data = np.array( [ x_example ] )  # it has to be array with shape (1, 28, 28) instead of (28, 28)
y_data = np.array( [ y_example ] )  # it has to be array with shape (1,) instead of a scalar

model.fit(x_data, y_data, epochs=5)

However, it doesn't give a good prediction with epochs=5.
With epochs=10 it gives the correct prediction for this image, but I didn't check whether it still gives correct predictions for other images.
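
One thing that may contribute to the extra epochs: the model was trained on `x_train/255`, so it expects pixel values in [0, 1], and MNIST digits are light strokes on a dark background, while the snippet above feeds raw 0-255 values. Below is a minimal sketch of matching preprocessing. It assumes the new image is already a grayscale array and that a mostly bright image means dark ink on light paper; the helper name `to_mnist_like` and the 0.5 threshold are my own choices, not part of the tested code.

```python
import numpy as np

def to_mnist_like(gray):
    """Scale a grayscale uint8 image to [0, 1] and make it light-on-dark like MNIST."""
    img = np.asarray(gray, dtype=np.float32) / 255.0  # same scaling as x_train/255
    # Assumption: if most pixels are bright, the digit is dark ink on a light
    # background, so invert it to get a light digit on a dark background.
    if img.mean() > 0.5:
        img = 1.0 - img
    return img

# Stand-in for the 28x28 array returned by cv2.resize(...): white page, dark stroke
fake = np.full((28, 28), 255, dtype=np.uint8)
fake[4:24, 13:15] = 0
x_example = to_mnist_like(fake)
print(x_example.shape, x_example.min(), x_example.max())
```

The result can be used in place of the raw `x_example` in both `fit()` and `predict()`.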

Maybe it would be better to add the image to x_train, y_train and retrain on everything again.

x_data = np.append(x_train, [x_example], axis=0)
y_data = np.append(y_train, y_example)

model.fit(x_data, y_data, epochs=5)

It can be like in real life: when you learn only a new element, you remember it better than the older elements. When you learn all elements again together with the new one, you refresh all the information and remember all of them.
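
A possible middle ground between the two approaches, sketched under my own assumptions (I did not test it with the model above): train on the new image repeated a few times together with a random subset of the old data, so the old digits still get refreshed without a full pass over all 60000 images. `make_mixed_set`, `n_old` and `n_repeat` are hypothetical names and values.

```python
import numpy as np

def make_mixed_set(x_train, y_train, x_new, y_new, n_old=1000, n_repeat=10, seed=0):
    """Return n_old randomly chosen old samples followed by n_repeat copies of the new one."""
    rng = np.random.default_rng(seed)
    n_old = min(n_old, len(x_train))
    idx = rng.choice(len(x_train), size=n_old, replace=False)  # random old samples
    # Stack n_repeat copies of the new image along a new first axis
    x_rep = np.repeat(np.asarray(x_new)[np.newaxis, ...], n_repeat, axis=0)
    y_rep = np.full(n_repeat, y_new, dtype=y_train.dtype)
    x_mix = np.concatenate([x_train[idx], x_rep], axis=0)
    y_mix = np.concatenate([y_train[idx], y_rep], axis=0)
    return x_mix, y_mix

# Small random stand-ins for the MNIST arrays, only to show the shapes
x_old = np.random.rand(200, 28, 28)
y_old = np.random.randint(0, 10, size=200).astype(np.uint8)
x_mix, y_mix = make_mixed_set(x_old, y_old, np.zeros((28, 28)), 1, n_old=50, n_repeat=5)
print(x_mix.shape, y_mix.shape)  # (55, 28, 28) (55,)
# model.fit(x_mix, y_mix, epochs=5)  # fit() shuffles array data by default
```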

Minimal working code which I used for testing.

import warnings

warnings.filterwarnings('ignore')  # hide/suppress warnings

import os
import cv2
import numpy as np
import matplotlib.pyplot as plt
import tensorflow as tf

### train model ###

def build():
    print('-'*50)
    print('# Building model ')
    print('-'*50)
    
    model = tf.keras.models.Sequential()
    model.add(tf.keras.layers.Flatten(input_shape=(28, 28)))
    model.add(tf.keras.layers.Dense(units=128, activation=tf.nn.relu))
    model.add(tf.keras.layers.Dense(units=10, activation=tf.nn.softmax))
    
    model.compile(optimizer="adam", loss="sparse_categorical_crossentropy", metrics=["accuracy"])

    return model
    
def train(model, x_train, y_train, epochs=5):
    print('-'*50)
    print('# Training model')
    print('-'*50)
    
    model.fit(x_train, y_train, epochs=epochs)

def save(model):
    print('-'*50)
    print('# Saving model')
    print('-'*50)

    model.save('traineddata.model')

def load():
    print('-'*50)
    print('# Loading model')
    print('-'*50)
    
    return tf.keras.models.load_model('traineddata.model')

def test_one(model, x_example, y_example):
    print('-'*50)
    print('# Testing one element')
    print('-'*50)
    
    # create array with one or more images
    x_data = np.array( [ x_example ] )
    y_data = np.array( [ y_example ] )

    print('x_data shape:', x_data.shape)
    print('y_data shape:', y_data.shape)
    print(' expected:', y_data)

    # get list with one or more predictions
    y_results = model.predict(x_data)

    print('predicted:', y_results.argmax(axis=1))

def retrain_one(model, x_example, y_example, epochs=5):
    print('-'*50)
    print('# Retraining one element')
    print('-'*50)

    # create array with one or more images
    x_data = np.array( [ x_example ] )
    y_data = np.array( [ y_example ] )

    print('x_data shape:', x_data.shape)
    print('y_data shape:', y_data.shape)
    print('y_data:', y_data)

    model.fit(x_data, y_data, epochs=epochs)

def retrain_all(model, x_train, y_train, x_example, y_example, epochs=5):
    print('-'*50)
    print('# Retraining all elements')
    print('-'*50)

    # create array with all images
    x_data = np.append(x_train, [x_example], axis=0)
    y_data = np.append(y_train, y_example)

    print('x_data shape:', x_data.shape)
    print('y_data shape:', y_data.shape)
    print('y_data:', y_data)

    model.fit(x_data, y_data, epochs=epochs)

# --- main ---

# - load train/test images -

print('>>> Loading train/test data ...')
(x_train, y_train), (x_test, y_test) = tf.keras.datasets.mnist.load_data()

x_train = x_train/255
x_test = x_test/255

# - train + save or load -

if not os.path.exists('traineddata.model'):
    print('>>> Building model ...')
    model = build()

    print('>>> Training model ...')
    train(model, x_train, y_train)

    print('>>> Saving model ...')
    save(model)
else:
    print('>>> Loading model ...')
    model = load()

#print(' - test on single example - ')
#index = 0
#test_one(model, x_train[index], y_train[index])

print(' - image - ')

x_example = cv2.imread('image.png')
x_example = cv2.cvtColor(x_example, cv2.COLOR_BGR2GRAY)
x_example = cv2.resize(x_example, (28, 28))

y_example = 1

print('>>> Predicting without training')

test_one(model, x_example, y_example)

print('>>> Predicting with training one element (epochs=10)')

retrain_one(model, x_example, y_example, epochs=10)  # epochs=5 epochs=7
test_one(model, x_example, y_example)

print('>>> Predicting with retraining all elements')

retrain_all(model, x_train, y_train, x_example, y_example)
test_one(model, x_example, y_example)

#print('>>> Saving new model')
#model.save('traineddata.model')

What accuracy do you get when training the model with `retraining all elements`? — Hung Dang, May 24 '21 at 00:22
And when I train the model, I see a printout like this: `Epoch 1/10 1876/1876 [==============================] - 27s 9ms/step - loss: 0.4177 - accuracy: 0.8673 - val_loss: 0.0401 - val_accuracy: 0.9859`. Can you explain what the number 1876 means? — Hung Dang, May 24 '21 at 00:26
`1876` is number of examples/rows in training data. If you use data `x_train, y_train` then see `len(x_train)` or `len(y_train)` and you should get `1876` — furas, May 24 '21 at 11:07
If I `print(len(x_train))`, it shows `60009`, so I don't know why it's always `1876` :| — Hung Dang, May 25 '21 at 01:56
Check directly before `fit(x_train)`. If you use a different variable, as in `fit(other)`, then check `len(other)`. When I run the code I always see `60000/60000`. Maybe you use different values in `fit()` — furas, May 25 '21 at 12:17
Here is my ````fit````: ````model.fit(X_train, y_train, validation_data=(X_test, y_test), epochs=20)````, with ````X_train/=255```` and ````X_test/=255```` — Hung Dang, May 26 '21 at 00:23
If you divide `60000/1875` then you get exactly `32`. In the [documentation](https://www.tensorflow.org/api_docs/python/tf/keras/Model#fit) I found that the default value for `batch_size` is `32`. It seems it shows the number of batches, where each batch has size `32`. If you use `fit(..., batch_size=1)` then you get `60000`. Older versions showed `60000` with the default settings. — furas, May 26 '21 at 01:39
Normally, when I use only the MNIST dataset, the number is `1875`. Then I added about 8 digit images, from 10 to 17, so why is it only `1876`? — Hung Dang, May 26 '21 at 01:44
`(60000+10)/32` rounded up gives `1876` (the last, partial batch still counts as one step). In the documentation `batch_size` is the `Number of samples per gradient update.` It defines how often the weights in the neural network are updated. — furas, May 26 '21 at 01:46
I tried with `batch_size=1`, and it shows `60011` :D — Hung Dang, May 26 '21 at 01:47
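
To make the arithmetic from this thread concrete: the number in the progress bar is the count of batches per epoch, which is the number of samples divided by `batch_size`, rounded up. A small sketch (`batches_per_epoch` is only an illustrative helper name, not a Keras function):

```python
import math

def batches_per_epoch(n_samples, batch_size=32):
    """Number of progress-bar steps for one epoch over n_samples array rows."""
    # The last batch may be smaller than batch_size, but it still counts as a step.
    return math.ceil(n_samples / batch_size)

print(batches_per_epoch(60000))     # 1875
print(batches_per_epoch(60010))     # 1876
print(batches_per_epoch(60011, 1))  # 60011
```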
