
I would like to apply OCR to some pictures of 7 segment displays on a wall. My strategy is the following:

  1. Convert the image to grayscale
  2. Blur the image to reduce false edges
  3. Threshold the image to a binary image
  4. Apply Canny edge detection
  5. Set a region of interest (ROI) based on a pattern given by the silhouette of the number
  6. Scale the ROI and template match the region

How can I set a ROI so that my program doesn't have to search for the template through the whole image? I would like to set the ROI based on the number of edges found, or something more useful if someone can suggest it.

I was looking into cascade classification and Haar features, but I don't know how to apply them to my problem.

Here is an image after being pre-processed and edge detected:

Original image:


locorecto

2 Answers


If this is representative of the number of edges you'll have to deal with, you could try a nice naive strategy like sliding a ROI-finder window across the binary image, which just sums the pixel values and doesn't fire unless that sum is above a threshold. That should optimise out all the blank surfaces.
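A rough sketch of that window scan, assuming Python/NumPy and a 0/255 binary image; the window size, step and threshold are all values you'd tune. An integral image keeps each window sum O(1):

```python
import numpy as np

def roi_candidates(binary, win=(40, 40), step=10, thresh=2000):
    """Slide a win-sized window over a 0/255 binary image and return the
    top-left (x, y) corners whose pixel sum exceeds thresh."""
    h, w = binary.shape
    # integral image: ii[i, j] = sum of binary[:i, :j]
    ii = np.pad(binary.astype(np.int64), ((1, 0), (1, 0))).cumsum(0).cumsum(1)
    hits = []
    for y in range(0, h - win[0] + 1, step):
        for x in range(0, w - win[1] + 1, step):
            s = (ii[y + win[0], x + win[1]] - ii[y, x + win[1]]
                 - ii[y + win[0], x] + ii[y, x])
            if s > thresh:
                hits.append((x, y))
    return hits
```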

Edit: OK, some less naive approaches. If you have some a priori knowledge, like knowing the photo is well aligned (not badly rotated or skewed), you could do some passes with a low-high-low-high grate tuned to capture the edges on either side of a segment, using different scales in both the x and y dimensions. A good hit in both directions gives clues not only about the ROI but also about what scale of template to begin with (grates that are too large or too small won't hit both edges at once).
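One way to read the grate idea as code, assuming the "grate" is a 1-D kernel of alternating -1/+1 bands correlated against a row or column profile of the binary image (the kernel shape and the scales you try are assumptions to tune):

```python
import numpy as np

def grate_response(profile, scale):
    """Correlate a 1-D intensity profile with a low-high-low-high 'grate'
    at the given scale; a strong response suggests two bright bands
    (e.g. the edges either side of a segment) spaced about `scale` apart."""
    kernel = np.concatenate([-np.ones(scale), np.ones(scale),
                             -np.ones(scale), np.ones(scale)])
    return np.correlate(profile, kernel, mode="valid")
```

You would run this over a range of scales and keep the scale/position pairs with the strongest responses.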

You could do blob detection and then apply your templates to each blob in turn (falling back on merging blobs if the template-matching score is below a threshold, in case a digit is accidentally partitioned). The size of the blob might again give you some hint as to the scale of template to apply.

dabhaid
  • This was one of my original plans. I would like to see if there is something out there less naive that could help me. – locorecto Feb 21 '12 at 20:53

First of all, given that the original image shows a LED display, the illuminated region has a higher intensity than the rest, so I'd perform, say, a YUV colour transformation on the original image and then work with the intensity plane (Y).

Next, if you know that the image is well aligned (i.e. not rotated), I would suggest applying separate horizontal and vertical edge detectors rather than a generic edge detector (you are not interested in diagonal lines), e.g.:

    # first derivative in x: responds to vertical edges
    sobelx = cv2.Sobel(img, cv2.CV_64F, 1, 0, ksize=5)
    # first derivative in y: responds to horizontal edges
    sobely = cv2.Sobel(img, cv2.CV_64F, 0, 1, ksize=5)

Otherwise you might use contour detection to find the bounds of the digits (though you may need to perform a dilation first to close the gaps between LED segments).

Next I would construct horizontal and vertical histograms of the output from these edge or contour detections. These would help you to identify 'busy' regions of the image which contain many edges.
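Those histograms are just row and column sums of the edge map, e.g.:

```python
import numpy as np

def edge_projections(edges):
    """Column and row sums of an edge map; peaks mark 'busy' x and y
    ranges likely to contain digits."""
    horiz = edges.sum(axis=0)  # per-column totals: busy x ranges
    vert = edges.sum(axis=1)   # per-row totals: busy y ranges
    return horiz, vert
```

Thresholding these 1-D profiles gives you candidate x and y spans whose intersections form the ROIs.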

Finally, I'd threshold the Y plane and explore each of the ROIs with my template.

Dave Durbin