
I followed some code to do simple text recognition (from How to detect separate figures in an image?). However, it keeps adding additional contours inside my letters, such as the loops in an "e". The code used is:

My attempt to fix this was to test for overlapping contours from a previous iteration. But after labelling each contour, I noticed that they are created from the lowest y point to the highest, as seen in the [output][1].

What is the easiest way to remove the inner contours? I have seen numerous threads referring to RETR_EXTERNAL (which I am already using) and to contour hierarchies, but I don't see how they apply to this code.

Jeru Luke

1 Answer


Have you checked what your contours look like? You get multiple rectangles because your letters, the Canny edges, or the contours become disconnected, so there are multiple components per letter in cnts. I would suggest the following:

  1. Display the Canny edges and check whether they are connected for each letter.

    cv2.imshow("canny", canny)
    cv2.waitKey(0)  # keep the window open until a key is pressed

  2. Depending on what you see there, you can change your blur kernel size to connect disconnected letter components,

    blurred = cv2.GaussianBlur(gray, (9, 9), 0)

  3. or adjust your edge detection parameters (a lower minVal or a bigger aperture).

    canny = cv2.Canny(blurred, 80, 255, apertureSize=3)

  4. You can also apply morphological operations to connect the disconnected letter components, e.g., dilation or closing.

    kernel = np.ones((5,5),np.uint8)
    closing = cv2.morphologyEx(canny, cv2.MORPH_CLOSE, kernel)

ilke444