I am trying to create an iOS app that can recognize the numbers 0 to 9 in an image taken with the device's camera. I started out by detecting the number, which in my case will always be inside a blue circle. I managed to get pretty accurate circle detection with OpenCV. At this point the app takes the image, scans it for blue circles, crops it to the region containing the circle, converts it to black and white, and increases the contrast so that there is only pure black (the background) and pure white (the number). The result is a pretty clean image of just the number. The last step would be to recognize that image with a simple image classifier.
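Roughly, the preprocessing looks like the sketch below (shown in Python for readability; the HSV range for "blue", the Hough parameters, and the function name are placeholders, not my exact values):

```python
import cv2
import numpy as np

def extract_number_patch(bgr_image):
    # Mask out everything that is not roughly blue
    hsv = cv2.cvtColor(bgr_image, cv2.COLOR_BGR2HSV)
    blue_mask = cv2.inRange(hsv, (100, 80, 40), (130, 255, 255))

    # Detect the blue circle on the mask
    circles = cv2.HoughCircles(blue_mask, cv2.HOUGH_GRADIENT, dp=1.5,
                               minDist=100, param1=100, param2=30,
                               minRadius=20, maxRadius=0)
    if circles is None:
        return None
    x, y, r = np.round(circles[0, 0]).astype(int)

    # Crop to the circle and push the contrast to pure black / pure white
    patch = bgr_image[max(y - r, 0):y + r, max(x - r, 0):x + r]
    gray = cv2.cvtColor(patch, cv2.COLOR_BGR2GRAY)
    _, binary = cv2.threshold(gray, 0, 255,
                              cv2.THRESH_BINARY | cv2.THRESH_OTSU)
    return binary
```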
So I tried to recreate such "white number on black background" images for a dataset. I used images of the numbers in the same font as in reality, applied random contrast, random brightness, and random scale, added a blue circle, and passed the result to the same OpenCV function, which returned an image that I saved to my hard drive. The dataset I created had over 10,000 images per number (so over 100,000 in total). I then used CreateML to train an image classifier on that dataset. However, the accuracy in the actual app, with real photos of such numbers, is quite bad.
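Condensed, the dataset generation looked something like this (again a Python sketch; the parameter ranges and paths are placeholders, and `extract_number_patch` is the preprocessing shown above):

```python
import random
import cv2

def make_training_sample(digit_template_bgr, out_path):
    img = digit_template_bgr.copy()

    # Random contrast (alpha) and brightness (beta)
    alpha = random.uniform(0.6, 1.4)
    beta = random.uniform(-40, 40)
    img = cv2.convertScaleAbs(img, alpha=alpha, beta=beta)

    # Random scale
    scale = random.uniform(0.7, 1.3)
    img = cv2.resize(img, None, fx=scale, fy=scale)

    # Draw the blue circle around the digit, as it appears in reality
    h, w = img.shape[:2]
    cv2.circle(img, (w // 2, h // 2), min(h, w) // 2 - 2, (255, 0, 0), 4)

    # Run the same OpenCV preprocessing the app uses and save the result
    patch = extract_number_patch(img)
    if patch is not None:
        cv2.imwrite(out_path, patch)
```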
So I tried a different approach. The idea was to vary everything in the images except the number itself, so that the model learns what the samples of each digit have in common. I did this by adding random white and black pixels to the images and by rotating and scaling them. At the end I applied the same black-and-white filter from OpenCV and saved the images to my hard drive. This model performs even worse than the one above.
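The second augmentation was roughly the following (a sketch with placeholder noise levels and angle/scale ranges; the black-and-white filter at the end is the same Otsu threshold as above):

```python
import random
import cv2
import numpy as np

def augment_second_approach(gray_digit):
    img = gray_digit.copy()
    h, w = img.shape[:2]

    # Sprinkle random white and black pixels over the image
    noise = np.random.rand(h, w)
    img[noise < 0.02] = 255
    img[noise > 0.98] = 0

    # Random rotation and scale around the image center
    angle = random.uniform(-15, 15)
    scale = random.uniform(0.8, 1.2)
    matrix = cv2.getRotationMatrix2D((w / 2, h / 2), angle, scale)
    img = cv2.warpAffine(img, matrix, (w, h))

    # Same black-and-white filter as in the app
    _, binary = cv2.threshold(img, 0, 255,
                              cv2.THRESH_BINARY | cv2.THRESH_OTSU)
    return binary
```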
You can find a sample image of both datasets here: https://1drv.ms/f/s!Ao1FRfDXc7vKklCxq3n7NC6APImP
So here are my questions:
1) Shouldn't it be fairly easy to create a machine learning model that can recognize these numbers with high accuracy?
2) What should my dataset look like in this case to maximize the model's accuracy?
3) How many images per number would you recommend for training?