
We are working on a term project using the MNIST training set. Although our classifier works well on the MNIST test set (>94% accuracy), its performance is significantly lower on our own prepared dataset. Details of our prepared dataset are as follows:

  • We created 28x28 images in a paint program.
  • The background of our images is black and the digits are drawn in white (same as MNIST).
  • When we compare an MNIST image and one of our prepared images side by side, they look practically identical.

Regarding the pixel values, we tried different combinations:

  • We map all pixel values continuously from [0, 255] to [0, 1].
  • We map all pixel values from [0, 255] to binary {0, 1}, where only digit pixels equal 1.
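For reference, the two mappings above can be sketched with NumPy (the input array here is a made-up example, not our actual data):

```python
import numpy as np

# img: a 28x28 uint8 array with values in [0, 255] (dummy example input)
img = np.zeros((28, 28), dtype=np.uint8)
img[10:18, 12:16] = 255  # a fake "stroke" for illustration

# 1) continuous scaling to [0, 1]
continuous = img.astype(np.float32) / 255.0

# 2) binary mapping: digit (non-zero) pixels become 1, background stays 0
binary = (img > 0).astype(np.float32)
```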

The best score on our prepared dataset is approximately 70%, whereas performance on the MNIST test set is consistently high (>94%). Moreover, the classifier makes some very strange mistakes, e.g. predicting 3 when the true digit is 0.

Is anyone familiar with MNIST? I think the problem is related to the pixel values, but I can't figure out why it happens. When I use imshow, both images look exactly the same.

Grzegorz Adam Kowalski
Baskaya
  • I have the same problem at the moment. How do you evaluate the accuracy? Only the highest value, or must the value be over 0.5 or something like that? You might want to center the digits; I had big problems with digits that are shifted by two pixels. A bit of code and some images would be nice. – Wikunia Apr 26 '15 at 10:19
  • You might need to center the images using "center_of_mass". – Wikunia Apr 26 '15 at 12:06
  • Interesting question! Did you in the meantime get better performance, e.g. by changing the preprocessing? – Ruthger Righart Nov 22 '16 at 14:24
  • Make sure that you don't use the test set as a training set. This is a common mistake that results in overfitting (the network fails to generalize). – 6infinity8 Aug 21 '17 at 08:46
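Following the centering suggestion in the comments: MNIST digits are themselves positioned by center of mass within the 28x28 frame, so hand-drawn digits that are off-center can hurt accuracy even when they look fine in imshow. A minimal sketch of such centering, using `scipy.ndimage` (the input blob below is a made-up example):

```python
import numpy as np
from scipy import ndimage

def center_digit(img):
    """Shift a 28x28 grayscale digit so its center of mass lands on the
    image center, roughly matching MNIST's own preprocessing."""
    img = np.asarray(img, dtype=np.float32)
    cy, cx = ndimage.center_of_mass(img)          # intensity-weighted centroid
    ty, tx = (np.asarray(img.shape) - 1) / 2.0    # geometric center (13.5, 13.5)
    # order=1 bilinear interpolation; cval=0 pads with black background
    return ndimage.shift(img, (ty - cy, tx - cx), order=1, cval=0.0)

# example: an off-center square blob
img = np.zeros((28, 28), dtype=np.float32)
img[2:6, 3:7] = 1.0
centered = center_digit(img)
```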

0 Answers