
I'm trying to use a Keras neural network in TensorFlow to recognize handwritten digits, but I don't know why predict() returns the same result for every input image.

Here is the code:

  ### Train dataset ###
  mnist = tf.keras.datasets.mnist
  (x_train, y_train), (x_test, y_test) = mnist.load_data()
  x_train = x_train/255
  x_test = x_test/255

  model = tf.keras.models.Sequential()
  model.add(tf.keras.layers.Flatten(input_shape=(28,28)))
  model.add(tf.keras.layers.Dense(units=128,activation=tf.nn.relu))
  model.add(tf.keras.layers.Dense(units=10,activation=tf.nn.softmax))

  model.compile(optimizer="adam", loss="sparse_categorical_crossentropy", metrics=["accuracy"])

  model.fit(x_train, y_train, epochs=5)

The result looks like this:

Epoch 1/5
1875/1875 [==============================] - 2s 672us/step - loss: 0.2620 - accuracy: 0.9248
Epoch 2/5
1875/1875 [==============================] - 1s 567us/step - loss: 0.1148 - accuracy: 0.9658
Epoch 3/5
1875/1875 [==============================] - 1s 559us/step - loss: 0.0784 - accuracy: 0.9764
Epoch 4/5
1875/1875 [==============================] - 1s 564us/step - loss: 0.0596 - accuracy: 0.9817
Epoch 5/5
1875/1875 [==============================] - 1s 567us/step - loss: 0.0462 - accuracy: 0.9859

The code I use to test the model on an image is below:

  img = cv.imread('path/to/1.png')
  img = cv.cvtColor(img, cv.COLOR_BGR2GRAY)
  img = cv.resize(img,(28,28))
  img = np.array([img])
    
  if cv.countNonZero((255-image)) == 0:
     print('')
  img = np.invert(img)
    
  plt.imshow(img[0])
  plt.show()
    
  prediction = model.predict(img)
  result = np.argmax(prediction)
  print(prediction)
  print(f'Result: {result}')

The result is:

Input with number 1

plt.show() output: [image: PlT show 1]

[[0. 0. 0. 1. 0. 0. 0. 0. 0. 0.]]
Result: 3

Input with number 2

plt.show() output: [image: PlT show 2]

[[0. 0. 0. 1. 0. 0. 0. 0. 0. 0.]]
Result: 3
Hung Dang
  • You are inverting your image when predicting (`img = np.invert(img)`), whereas in the original dataset the digits have the higher values (white digit (1) on black background (0)). The same goes for `img/255`. Apply the same processing to your test images when predicting as you do to your training images. – SajanGohil May 14 '21 at 07:47
  • I see that if I remove the second `img = np.invert(img)`, it returns correct predictions with high confidence, but if I remove all of the `img = np.invert(img)` calls, the predictions are all wrong – Hung Dang May 14 '21 at 07:50
  • What second `np.invert(img)` are you talking about? There is only one. `cv.countNonZero((255-image))` doesn't invert the actual image, if that's what you mean – SajanGohil May 14 '21 at 07:53
  • I checked my code again and found the second `np.invert(img)`; the `cv.countNonZero((255-image))` is only used to check whether the image is a blank white background. :D – Hung Dang May 14 '21 at 07:55
  • @HungDang can you give some feedback on the given answer (updated part)? – Innat May 17 '21 at 07:36
  • As I said, I just had to remove the second `np.invert(img)`, and it works fine – Hung Dang May 17 '21 at 08:10

1 Answer


Normalize your data at inference time the same way you did for the training set:

img = np.array([img]) / 255

Check this answer (Inference) for more details.
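To illustrate why unnormalized input can produce the one-hot output shown in the question, here is a minimal sketch. It assumes the logits grow roughly in proportion to the input scale (which holds for a ReLU network if you ignore the biases); the `logits` values and the `softmax` helper are made up for illustration and do not come from the trained model.

```python
import numpy as np

def softmax(z):
    # Numerically stable softmax over a 1-D array of logits.
    z = z - np.max(z)
    e = np.exp(z)
    return e / e.sum()

# Hypothetical logits for the 10 digit classes (not taken from the real model).
logits = np.array([0.2, 0.1, 0.4, 1.0, 0.3, 0.0, 0.5, 0.6, 0.1, 0.2])

p_normalized = softmax(logits)          # input scaled to [0, 1]
p_unscaled = softmax(255.0 * logits)    # input left in [0, 255]

print(np.round(p_normalized, 3))  # probabilities spread over the classes
print(np.round(p_unscaled, 3))    # essentially all mass on one class
```

With the larger scale, softmax saturates and reports near-total confidence in a single class, so the printed prediction alone says little about how well the input matches the training data.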


Based on your 3rd comment, here are some details.

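import cv2
import numpy as np
import tensorflow as tf
import matplotlib.pyplot as plt

def input_prepare(img):
    img = cv2.resize(img, (28, 28))   # match the 28x28 MNIST input size
    img = cv2.bitwise_not(img)        # invert: white digit on black background, as in MNIST

    img = tf.cast(tf.divide(img, 255), tf.float64)   # scale to [0, 1], as done for training
    img = tf.expand_dims(img, axis=0)                # add the batch dimension
    return img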

img = cv2.imread('/content/1.png')
orig = img.copy() # save for plotting later on 

img = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY) # gray scaling 
img = input_prepare(img)

plt.imshow(tf.reshape(img, shape=[28, 28]))
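Before calling predict, it can also help to check numerically that the prepared input resembles an MNIST sample. The helper below is only a sketch: `check_mnist_like` is a hypothetical name, the 0.5 brightness threshold is an assumption, and it expects something convertible to a NumPy array of shape (N, 28, 28).

```python
import numpy as np

def check_mnist_like(batch):
    """Return a list of likely preprocessing problems (hypothetical helper)."""
    batch = np.asarray(batch, dtype=np.float64)
    problems = []
    if batch.shape[1:] != (28, 28):
        problems.append(f"expected shape (N, 28, 28), got {batch.shape}")
    lo, hi = batch.min(), batch.max()
    if lo < 0.0 or hi > 1.0:
        problems.append("values outside [0, 1]; divide by 255 as in training")
    # MNIST digits are bright strokes on a dark background, so the mean is low.
    scale = 255.0 if hi > 1.0 else 1.0
    if batch.mean() / scale > 0.5:
        problems.append("mostly bright pixels; background is probably white, so invert")
    return problems

# Synthetic example: a dark vertical stroke on a white background, still in 0..255.
white_bg = np.full((1, 28, 28), 255, dtype=np.uint8)
white_bg[0, 4:24, 13:15] = 0
print(check_mnist_like(white_bg))

# After a single inversion and scaling, the same image passes both checks.
fixed = np.invert(white_bg) / 255
print(check_mnist_like(fixed))
```

A check like this would also catch an accidental second inversion, since the image would end up mostly bright again.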


plt.imshow(cv2.cvtColor(orig, cv2.COLOR_BGR2RGB))
plt.title(np.argmax(model.predict(img)))
plt.show()


It works as expected. However, resizing can break up the strokes of a digit so that it loses spatial information. The model tolerates mild damage, but if the strokes become too fragmented, it predicts wrongly. Here is an example input where that happens, and the model predicts the wrong digit for it:

plt.imshow(cv2.cvtColor(orig, cv2.COLOR_BGR2RGB))
plt.title(np.argmax(model.predict(img)))
plt.show()


To fix this, we can apply cv2.erode after resizing to thicken the dark strokes (it adds some pixels to them), for example:

def input_prepare(img):            
    img = cv2.resize(img, (28, 28))   
    img = cv2.erode(img, np.ones((2, 2)))  # thicken the dark strokes before inverting
    img = cv2.bitwise_not(img)   

    img = tf.cast(tf.divide(img, 255) , tf.float64)              
    img = tf.expand_dims(img, axis=0)   
    return img 


This is perhaps not the best approach, but the model now handles the image better.

plt.imshow(cv2.cvtColor(orig, cv2.COLOR_BGR2RGB))
plt.title(np.argmax(model.predict(img)))
plt.show()
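To see why erosion thickens a dark digit on a white background, here is a NumPy-only sketch of grayscale erosion as a sliding minimum. `erode_min` is a hypothetical helper written for illustration; its window placement may differ by a pixel from what cv2.erode does with a 2x2 kernel, and it assumes a uint8 image.

```python
import numpy as np

def erode_min(img, k=2):
    """Grayscale erosion: each output pixel is the minimum over a k x k window."""
    h, w = img.shape
    # Replicate the border so every pixel has a full window.
    padded = np.pad(img, ((0, k - 1), (0, k - 1)), mode="edge")
    out = np.full_like(img, 255)
    for dy in range(k):
        for dx in range(k):
            out = np.minimum(out, padded[dy:dy + h, dx:dx + w])
    return out

# A one-pixel-wide dark stroke on a white background.
img = np.full((8, 8), 255, dtype=np.uint8)
img[1:7, 4] = 0

thick = erode_min(img)
print((img == 0).sum(), (thick == 0).sum())  # dark pixel count before and after
```

Because the minimum lets dark pixels spread into their neighbours, thin strokes that were broken by resizing can reconnect. On an already inverted image (white digit on black), the same operation would thin the digit instead, which is why the erosion comes before the inversion in input_prepare.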


Innat
  • So the `img = np.invert(img)` is not helpful, right? – Hung Dang May 14 '21 at 07:46
  • Normally in `mnist`, the background is black and the foreground (the digit itself) is white. You just need to ensure the same view at test (inference) time. – Innat May 14 '21 at 08:19
  • The image containing the digits has a white background and a black foreground, so I use `np.invert(img)` to swap the background and foreground. I also checked my code and found that I accidentally called `np.invert(img)` twice, so the final image to predict had a white background. Then I removed the second call and it works fine, but number 1 is still not correct... :( – Hung Dang May 14 '21 at 08:23
  • Have you normalized the data at inference time as I mentioned above? Also, can you post your original image too? – Innat May 14 '21 at 08:27
  • Follow the answer link that I mentioned in this answer; it will surely solve the problem. – Innat May 14 '21 at 08:28
  • I tried, but nothing changed; I've already posted the original image in the question – Hung Dang May 14 '21 at 08:32
  • It works fine. Now I'm trying to train it with numbers greater than 9, like 10 to 17, but it looks like predict does not return the true value – Hung Dang May 17 '21 at 08:11
  • Do you have a dummy dataset regarding digits above 10? – Innat May 17 '21 at 10:25