I am trying to learn image processing, and I have set myself a goal in that regard: a number recognition system.

So I picked up An Introduction to Digital Image Processing with Matlab by Alasdair McAndrew and learned about things such as edge detection, thresholding, dilation and erosion, and the hit-or-miss transform.

Now the problem is that I am having a hard time seeing how these tools are going to help me reach that goal.

I also have a number of other books on image processing, and while every one of them teaches how to erode or dilate, none of them tells me what to do if I want to recognize an object, number, or character in an image.

It has been really frustrating, because searching the web turns up only very general and broad answers to this question.

Can anybody tell me how to use these techniques to recognize a number in an image that I drew myself in Paint?

If not, can you at least suggest a book or even a field? After looking at a number of books, I am getting the impression that I am looking in the wrong direction.

Win Coder

1 Answer

There are too many approaches to OCR to recommend anything specific. However, a number of free OCR systems are currently available; you can download some of them and look at how they approach character recognition. The open-source projects that I have seen are the following:

gocr http://jocr.sourceforge.net/

clara-ocr http://www.claraocr.org/

cuneiform https://launchpad.net/cuneiform-linux (open-sourced commercial OCR engine)

tesseract http://code.google.com/p/tesseract-ocr/ (open-sourced commercial OCR engine)

The usual approach in advanced OCR engines is to combine several character recognition methods and then use some kind of voting mechanism to select the best match for each character.

Usually, all engines start by clustering the image to split the text into individual characters. Then multiple algorithms are run in an attempt to recognize each character. For example, the cuneiform OCR engine uses (a) feature detection (such as the number of strokes in the character), which is where dilation and similar operations are useful; (b) downsampling the character image to 15x15 and then applying a neural-network-like recognizer; and (c) multiple ad hoc rules for specific characters.
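
To make the segmentation and downsampling steps concrete, here is a minimal sketch in plain Python. It assumes the image has already been thresholded to 0/1 pixels, uses 4-connected component labeling as one possible way to split the text into characters, and resizes each crop to a fixed grid by nearest-neighbour sampling. The function names `connected_components` and `crop_and_resize` are made up for this illustration; this is not how cuneiform or any other engine actually implements it.

```python
from collections import deque


def connected_components(img):
    """Find 4-connected blobs of foreground (1) pixels.

    Returns one bounding box (top, left, bottom, right) per blob,
    sorted left to right.
    """
    rows, cols = len(img), len(img[0])
    seen = [[False] * cols for _ in range(rows)]
    boxes = []
    for r in range(rows):
        for c in range(cols):
            if not img[r][c] or seen[r][c]:
                continue
            # Breadth-first flood fill from an unvisited foreground pixel.
            queue = deque([(r, c)])
            seen[r][c] = True
            top, left, bottom, right = r, c, r, c
            while queue:
                y, x = queue.popleft()
                top, bottom = min(top, y), max(bottom, y)
                left, right = min(left, x), max(right, x)
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    inside = 0 <= ny < rows and 0 <= nx < cols
                    if inside and img[ny][nx] and not seen[ny][nx]:
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            boxes.append((top, left, bottom, right))
    # Left-to-right order matches reading order for a single line of text.
    return sorted(boxes, key=lambda box: box[1])


def crop_and_resize(img, box, size=15):
    """Crop a bounding box and resample it to size x size (nearest neighbour)."""
    top, left, bottom, right = box
    height, width = bottom - top + 1, right - left + 1
    return [
        [img[top + (i * height) // size][left + (j * width) // size]
         for j in range(size)]
        for i in range(size)
    ]


# Toy binary image (hypothetical data): a vertical bar next to a small ring.
image = [
    [0, 1, 0, 0, 1, 1, 1],
    [0, 1, 0, 0, 1, 0, 1],
    [0, 1, 0, 0, 1, 1, 1],
]
boxes = connected_components(image)
characters = [crop_and_resize(image, box) for box in boxes]
print(boxes)
```

Note that 4-connectivity treats diagonally touching pixels as separate blobs, and characters made of several pieces (such as "i") come out as more than one component, so a real system would need extra merging rules.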

I think starting with a neural network or some other classifier (e.g. a linear classifier or a support vector machine) is the best way to get quick results.

So in your place I would start with a simple character segmentation algorithm and train a simple neural network, linear, or SVM classifier on a database of digit images. Large databases of images of handwritten digits are available from NIST.
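
As a rough illustration of the classifier step, the sketch below trains a nearest-centroid classifier, one of the simplest linear classifiers, which stands in here for the neural network or SVM. The 5x3 glyphs are made-up toy data, one sample per digit; with real data you would feed in many labeled samples per digit from a database such as the NIST one. The names `train_centroids`, `classify`, and `flatten` are hypothetical.

```python
def flatten(glyph_rows):
    """Turn rows of '#'/'.' characters into a flat list of 1/0 pixels."""
    return [1 if ch == "#" else 0 for row in glyph_rows for ch in row]


def train_centroids(samples):
    """samples: iterable of (label, pixel_vector). Returns {label: mean vector}."""
    sums, counts = {}, {}
    for label, pixels in samples:
        if label not in sums:
            sums[label] = [0.0] * len(pixels)
            counts[label] = 0
        sums[label] = [s + p for s, p in zip(sums[label], pixels)]
        counts[label] += 1
    return {
        label: [s / counts[label] for s in total]
        for label, total in sums.items()
    }


def classify(centroids, pixels):
    """Return the label whose centroid is closest in squared Euclidean distance."""
    def squared_distance(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))

    return min(centroids, key=lambda label: squared_distance(centroids[label], pixels))


# Toy training set (hypothetical): one clean 5x3 glyph per digit.
GLYPHS = {
    "0": ["###", "#.#", "#.#", "#.#", "###"],
    "1": [".#.", "##.", ".#.", ".#.", "###"],
    "7": ["###", "..#", ".#.", ".#.", ".#."],
}
centroids = train_centroids((label, flatten(rows)) for label, rows in GLYPHS.items())

# A "7" with its bottom pixel missing, as a stand-in for a noisy input.
noisy_seven = ["###", "..#", ".#.", ".#.", "..."]
print(classify(centroids, flatten(noisy_seven)))  # prints 7
```

In practice the pixel vectors would come from segmented characters resized to a common grid, so that every sample has the same length.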

begemotv2718
  • The book I am following, An Introduction to Digital Image Processing with Matlab by Alasdair McAndrew, doesn't have any information on, for example, how to split text into individual characters or how to find the number of strokes in a character. Would you recommend a book where I can find this kind of information? – Win Coder Mar 20 '13 at 14:42
  • The algorithm I had in mind is connected component labeling. It seems to be explained in Shapiro, L., and Stockman, G. (2002), Computer Vision, Prentice Hall, which is available online: http://www.cse.msu.edu/~stockman/Book/2002/Chapters/ch3.pdf – begemotv2718 Mar 20 '13 at 17:41