
I'm using pytesseract.image_to_data() on this image: [image]

Code to draw the bounding boxes:

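import cv2
import pytesseract
from pytesseract import Output

img = cv2.imread('Page_2.jpg')

# image_to_data returns one entry per detected element
# (page, block, paragraph, line, word), with its box and text.
d = pytesseract.image_to_data(img, output_type=Output.DICT)
n_boxes = len(d['level'])
for i in range(n_boxes):
    (x, y, w, h) = (d['left'][i], d['top'][i], d['width'][i], d['height'][i])
    # Draw a green rectangle, 2 px thick, around the element.
    cv2.rectangle(img, (x, y), (x + w, y + h), (0, 255, 0), 2)

# Show the annotated image until a key is pressed.
cv2.imshow('img', img)
cv2.waitKey(0)
cv2.destroyAllWindows()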

I'm getting a bounding box around each word, like this:

[image]

Is there any way to get a meaningful phrase such as 'invoice number' in a single bounding box, like this:

[image]
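For illustration, here is a minimal sketch of one possible way to get such phrase-level boxes. It is not a built-in pytesseract feature. It assumes that words belonging to one phrase sit on the same text line and are separated by only a small horizontal gap. The helper name merge_word_boxes and the max_gap threshold (in pixels) are hypothetical and would need tuning per document. The function only reads the keys that image_to_data(..., output_type=Output.DICT) returns (level, page_num, block_num, par_num, line_num, left, top, width, height, text).

```python
def merge_word_boxes(d, max_gap=15):
    """Merge word boxes on the same text line into phrase boxes.

    d: dict shaped like pytesseract.image_to_data(..., output_type=Output.DICT).
    max_gap: hypothetical threshold; words closer than this many pixels
             horizontally are treated as one phrase.
    Returns a list of (x, y, w, h, text) tuples.
    """
    # Group word-level entries (level 5) by the line they belong to.
    lines = {}
    for i in range(len(d['text'])):
        text = str(d['text'][i]).strip()
        if d['level'][i] != 5 or not text:
            continue  # skip page/block/paragraph/line rows and empty words
        key = (d['page_num'][i], d['block_num'][i],
               d['par_num'][i], d['line_num'][i])
        lines.setdefault(key, []).append(
            (d['left'][i], d['top'][i], d['width'][i], d['height'][i], text))

    phrases = []
    for words in lines.values():
        words.sort(key=lambda word: word[0])  # left to right
        x, y, w, h, text = words[0]
        for nx, ny, nw, nh, ntext in words[1:]:
            if nx - (x + w) <= max_gap:
                # Close enough: grow the current box to cover the next word.
                right = max(x + w, nx + nw)
                bottom = max(y + h, ny + nh)
                y = min(y, ny)
                w = right - x
                h = bottom - y
                text = text + ' ' + ntext
            else:
                # Gap too large: finish this phrase and start a new one.
                phrases.append((x, y, w, h, text))
                x, y, w, h, text = nx, ny, nw, nh, ntext
        phrases.append((x, y, w, h, text))
    return phrases
```

With the code from the question, this could be used by calling merge_word_boxes(d) and passing each returned (x, y, w, h) to cv2.rectangle instead of looping over every entry. A gap threshold alone cannot tell which words form a meaningful field name, so labels such as 'invoice number' may still need matching against a list of expected phrases.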

Jeru Luke
aditya

0 Answers