This is my very first attempt at using Python. I normally use .NET, but to identify shapes in documents have turned to Python and OpenCV for image processing.
I am using OpenCV TemplateMatching (cv2.matchTemplate) to discover Regions of Interest (ROI) in my documents.
This works well. The template matches the ROI's and rectangles are placed, identifying the matches.
The ROI's in my images contain text which I also need to OCR and extract. I am trying to do this with Tesseract, but I think I am approaching it wrongly, based upon my results.
My process is this:
- Run cv2.matchTemplate
- Loop through matched ROI's
- Add rectangle info. to image
- Pass rectangle info. to Tesseract
- Add text returned from tesseract to image
- Write the final image
In the image below, you can see the matched regions (which are fine), but you can see that the text in the ROI doesn't match the text from tesseract (bottom right of ROI).
Please could someone take a look and advise where I am going wrong?
import cv2
import numpy as np
import pytesseract
import imutils
img_rgb = cv2.imread('images/pd2.png')
img_gray = cv2.cvtColor(img_rgb, cv2.COLOR_BGR2GRAY)
template = cv2.imread('images/matchMe.png', 0)
w, h = template.shape[::-1]
res = cv2.matchTemplate(img_gray, template, cv2.TM_CCOEFF_NORMED)
threshold = 0.45
loc = np.where(res >= threshold)
for pt in zip(*loc[::-1]):
cv2.rectangle(img_rgb, pt, (pt[0] + w, pt[1] + h), (0, 0, 255), 2)
roi = img_rgb[pt, (pt[0] + w, pt[1] + h)]
config = "-l eng --oem 1 --psm 7"
text = pytesseract.image_to_string(roi, config=config)
print(text)
cv2.putText(img_rgb, text, (pt[0] + w, pt[1] + h),
cv2.FONT_HERSHEY_SIMPLEX, 1.2, (0, 0, 255), 3)
cv2.imwrite('images/results.png', img_rgb)