
I am trying to run OCR on vehicles such as trains or trucks to identify the numbers and characters written on them. (Please note this is not license plate recognition OCR.)

I took this image. The idea is to be able to extract the text BNSF 721 734 written on it.

[image: original photo of the vehicle]

For pre-processing, I first converted this image to grayscale and then binarized it, which looks something like this:

[image: binarized version of the image]

I wrote some code using Tesseract (via the tesserocr Python bindings):

myimg = "image.png"
image = Image.open(myimg)
with PyTessBaseAPI() as api:
    api.SetImage(image)
    api.Recognize()
    words = api.GetUTF8Text()
    print words
    print api.AllWordConfidences()

This code gave me a blank output with a confidence value of 95, which means that Tesseract was 95% confident that no text exists in this image.

Then I used the SetRectangle API in Tesseract to restrict the OCR to a particular window within the image instead of running it on the entire image.

myimg = "image.png"
image = Image.open(myimg)
with PyTessBaseAPI() as api:
    api.SetImage(image)
    api.SetRectangle(665,445,75,40)
    api.Recognize()
    words = api.GetUTF8Text()
    print words
    print api.AllWordConfidences()
    print "----"

The coordinates 665, 445, 75 and 40 correspond to a rectangle that contains the text BNSF 721 734 in the image: 665 is the left offset, 445 the top offset, 75 the width and 40 the height (SetRectangle takes its arguments in left, top, width, height order).

The output I got was this:

an s
m,m

My question is: how do I improve the results? I played around with the values passed to SetRectangle and the results varied a bit, but all of them were equally bad.

Is there a way to improve this?

If you are interested in how I binarized the images, I used OpenCV:

import cv2

img = cv2.imread("image.png")
# cv2.imread returns a BGR image, so convert with COLOR_BGR2GRAY
grayscale_img = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
# Otsu's method picks the threshold automatically
(thresh, im_bw) = cv2.threshold(grayscale_img, 128, 255, cv2.THRESH_BINARY | cv2.THRESH_OTSU)
# Plain fixed-threshold binarization as an alternative
thresh = 127
binarized_img = cv2.threshold(grayscale_img, thresh, 255, cv2.THRESH_BINARY)[1]
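As a side note, the binarized result is a NumPy array, so it can be handed to tesserocr directly by wrapping it in a PIL Image instead of saving it to disk and reopening it. A minimal sketch of that glue (it assumes binarized_img from the block above):

from PIL import Image
from tesserocr import PyTessBaseAPI

# Glue sketch: convert the OpenCV result (a uint8 NumPy array)
# into a PIL Image, which SetImage accepts.
pil_img = Image.fromarray(binarized_img)
with PyTessBaseAPI() as api:
    api.SetImage(pil_img)
    api.Recognize()
    print(api.GetUTF8Text())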
Piyush
  • Try extracting MSER regions using OpenCV. Feed this to Tesseract (see the sketch after these comments). – Jeru Luke Feb 12 '17 at 07:44
  • @JeruLuke: Will try this option out. I need to read up about MSER regions because I am not sure how they work currently. My question is would it help improve the accuracy of the OCR or would it just help me in extracting the rectangle around the text automatically? Thanks – Piyush Feb 12 '17 at 23:29
  • Try using the Stroke Width Transform to first identify the location of text in the image. It's specifically designed to find text. As a rule, avoid binarizing too early. – Rethunk Feb 13 '17 at 03:38
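To make the MSER suggestion concrete, here is a minimal sketch of extracting MSER regions with OpenCV, assuming a recent OpenCV (3+) where detectRegions returns both the point sets and their bounding boxes; file names are placeholders:

import cv2

# Minimal MSER sketch; file names are placeholders.
img = cv2.imread("image.png")
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)

mser = cv2.MSER_create()
regions, bboxes = mser.detectRegions(gray)  # candidate text-like regions

# Draw the region hulls on a copy of the image for inspection.
vis = img.copy()
hulls = [cv2.convexHull(p.reshape(-1, 1, 2)) for p in regions]
cv2.polylines(vis, hulls, True, (0, 255, 0))
cv2.imwrite("mser_regions.png", vis)

Each bounding box in bboxes can then be cropped, or passed to SetRectangle as in the question, and OCR'd individually.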

1 Answer


I suggest finding the contours in your cropped rectangle and filtering them with some parameters that match your characters, for example keeping only contours whose area falls between some minimum and maximum thresholds. Then draw the surviving contours one by one onto an empty bitmap and perform OCR on that.

I know it seems like a lot of work, but it gives you better and more robust results. Good luck!
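A minimal sketch of that idea, assuming OpenCV 4 (in OpenCV 3, findContours returns an extra value) and treating the area thresholds as placeholders to tune per image:

import cv2
import numpy as np

# Filter contours by area, redraw the survivors on an empty bitmap,
# then run OCR on that bitmap. Thresholds are placeholders to tune.
binarized = cv2.imread("binarized.png", cv2.IMREAD_GRAYSCALE)
contours, _ = cv2.findContours(binarized, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)

min_area, max_area = 50, 2000  # rough bounds on character size
clean = np.zeros_like(binarized)
for c in contours:
    if min_area < cv2.contourArea(c) < max_area:
        cv2.drawContours(clean, [c], -1, 255, thickness=cv2.FILLED)

cv2.imwrite("clean.png", clean)  # feed this image to Tesseract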

Monika Bozhinova