
I want to extract the text on labels from images. The images are colour photographs taken in a real-life environment (sample image attached).

I have tried multiple solutions:

  1. I'm able to read text from flat images using Tesseract, but it fails when the text is at an angle.
  2. I tried a lot of image pre-processing (converting to grayscale and binary) but was not able to extract the required text.
  3. Since the above steps failed, I was not able to de-skew the text either:
    import cv2
    import numpy as np

    image = cv2.imread("p18-73.png", 0)  # read as grayscale
    thresh = cv2.adaptiveThreshold(image, 255, cv2.ADAPTIVE_THRESH_GAUSSIAN_C,
                                   cv2.THRESH_BINARY, 11, 2)
    coords = np.column_stack(np.where(thresh > 0))  # foreground pixel coords
    angle = cv2.minAreaRect(coords)[-1]             # rotated bounding-box angle

The pre-processing code above is not working. What is the best way to approach this image?

– Zhubei Federer
  • Use the [google vision](https://cloud.google.com/vision/) API if you can afford it; there's also a free trial. It's a cheap and simple solution. – Toni Sredanović May 28 '19 at 11:03

1 Answer


Did you check the result of cv2.adaptiveThreshold()? It looks like this:

Adaptive Threshold Result

I think this is not what you want. Try a global threshold with cv2.threshold() instead, and adjust the threshold value:

ret, thresh = cv2.threshold(image, 240, 255, cv2.THRESH_BINARY)

Global Threshold Result

Also, you can apply cv2.morphologyEx() to remove the noise:

kernel = np.ones((2,2),np.uint8)
thresh = cv2.morphologyEx(thresh, cv2.MORPH_OPEN, kernel)
– cel16