I am working on a text recognition project that needs to detect and recognise text in an image. Each image (320 px × 320 px) contains two short lines of text. The first line is an abbreviated country code and the second line is the dialling code. The whole image can be rotated by an arbitrary angle. Below are some examples.

image one

image two

image three

Because the text is very short, methods such as the Hough transform (which detects long lines), the Fourier transform and projection profiles do not perform well. I am using contour detection to find the angle of the text block. However, this does not work well if the text block is triangular. Moreover, if the text block is rectangular, the text can end up upside down or lying on its left or right side after de-skewing. Can somebody suggest an approach?

import os

import cv2

file = r"/home/hank/Desktop/af_36.jpg"
image = cv2.imread(os.path.normpath(file))
gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)

# Smooth, then binarise with Otsu; dark text on a light background becomes white on black.
blur = cv2.GaussianBlur(gray, (3, 3), 0)
_, thresh = cv2.threshold(blur, 0, 255, cv2.THRESH_BINARY_INV + cv2.THRESH_OTSU)

# Dilate slightly to thicken the character strokes.
kernel = cv2.getStructuringElement(cv2.MORPH_ELLIPSE, (3, 3))
dilation = cv2.dilate(thresh, kernel, iterations=1)

# Two return values as in OpenCV 4.x (OpenCV 3.x returns three).
contours, hierarchy = cv2.findContours(dilation, cv2.RETR_TREE, cv2.CHAIN_APPROX_SIMPLE)

# Drop contours that have a parent but no child (typically holes inside characters).
contours = [cnt for i, cnt in enumerate(contours)
            if not (hierarchy[0][i][3] >= 0 and hierarchy[0][i][2] == -1)]

# Average the minAreaRect angle over the remaining contours.
angles = [cv2.minAreaRect(cnt)[2] for cnt in contours]

angle = sum(angles) / len(angles)

print(angle)
nathancy
Hank

1 Answer

How about not detecting the text itself, and instead detecting the gap between the two lines of text (upper and lower)?

(1) The easiest way.

Threshold the image to find the text (pixels with text = 1, pixels without text = 0). Then find the centre point of the thresholded bounding box. That centre point (x, y) should lie in the empty space between the two lines.

Rotate a line of fixed length about that centre point, choosing its width so that it just fits between the top and bottom text. The rotation at which the line covers the most background pixels (meaning the line is not overlapping the text) should be the angle of the text.

(image: illustration of the line rotated about the centre point)

(2) Use the classic face-detection routine: apply a Haar-like pattern with template matching at N rotation angles.

Loop over every x, y and angle.

Then gradually refine.

E.g. this is the angle-0 version of the Haar feature. Align it with the image by template matching. Then align the rotated pattern and stack its template-matching result on top of the result for the previous angle. Concatenate all the template-matching results and run a min-max search to find the highest response.

(image: the angle-0 Haar-like pattern)
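A rough sketch of idea (2), under several assumptions of mine: the Haar-like pattern is taken to be two positive stripes (the text lines) around a negative stripe (the gap), the stripe sizes are guesses, and the matching is done with a plain FFT cross-correlation in NumPy rather than a library call such as `cv2.matchTemplate`. The function names are hypothetical. The pattern is symmetric, so the result is again only defined modulo 180 degrees.

```python
import numpy as np


def haar_like_pattern(shape, angle_deg, half_width=60, gap=10, band=20):
    """Assumed pattern: +1 on the two text bands, -1 on the gap between them, 0 elsewhere."""
    h, w = shape
    yy, xx = np.mgrid[0:h, 0:w]
    a = np.deg2rad(angle_deg)
    u = (xx - w // 2) * np.cos(a) + (yy - h // 2) * np.sin(a)
    v = -(xx - w // 2) * np.sin(a) + (yy - h // 2) * np.cos(a)
    inside = np.abs(u) <= half_width
    pattern = np.zeros(shape, dtype=float)
    pattern[inside & (np.abs(v) < gap)] = -1.0
    pattern[inside & (np.abs(v) >= gap) & (np.abs(v) <= gap + band)] = 1.0
    return pattern


def best_match_score(mask_fft, pattern):
    """Highest (circular) cross-correlation response over all x, y shifts."""
    corr = np.fft.ifft2(mask_fft * np.conj(np.fft.fft2(pattern))).real
    return corr.max()


def estimate_angle_haar(mask, coarse_step=15.0, fine_step=1.0):
    """Coarse search over all angles, then a finer search around the best one."""
    mask_fft = np.fft.fft2(mask.astype(float))

    def search(angles):
        scores = [best_match_score(mask_fft, haar_like_pattern(mask.shape, a))
                  for a in angles]
        return float(angles[int(np.argmax(scores))])

    coarse = search(np.arange(0.0, 180.0, coarse_step))
    fine = search(np.arange(coarse - coarse_step,
                            coarse + coarse_step + fine_step, fine_step))
    return fine % 180.0


# Stand-in for a thresholded image: the positive stripes of a pattern rotated by 40 degrees.
mask = haar_like_pattern((320, 320), 40.0) > 0
print(estimate_angle_haar(mask))
```

On real text the response will be much less clean than on solid bars, so the stripe sizes would need tuning to the actual line height and gap.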

Dr Yuan Shenghai
  • Thanks for the information. (1) Sometimes there might be no space between the two lines. (2) Moreover, the text will come in different fonts. Can template matching still solve this problem? – Hank May 25 '19 at 11:59
  • The font is not an issue. If there is no space, then you need more types of Haar pattern. – Dr Yuan Shenghai May 25 '19 at 16:07