I can pull data from an image using pytesseract and obtain the bounding box for the text that it recognises. I would like to be able to plot the bounding boxes on the original image to help a manual checker to confirm that OCR has been carried out correctly (in conjunction with the 'conf' level reported by pytesseract). Can anyone recommend an approach to do this?

I'm quite new to coding, so I don't really know where to start, or even whether this is possible with basic Python.

I'm using the following to get the OCR results and the bounding-box information:

pytesseract.image_to_data(Image.open('test.jpg'))

I would like to end up with a modified version of the original image that has the bounding boxes plotted on it.

Dan Peel
  • You can use OpenCV to draw bounding boxes with `cv2.rectangle()`. One approach is to find contours, iterate through them, then draw a bounding box for each. Adding an image may help – nathancy Sep 09 '19 at 22:46
  • Thanks. The line was just an extract from a script; 'test.jpg' is the file that I'm working with. I had been struggling to install cv2, but I've got it working now and your suggested approach does just what I need. Thanks again. – Dan Peel Sep 10 '19 at 19:13
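
For anyone landing here later, a minimal sketch of drawing the boxes that pytesseract already reports. It is not the code from the comments: it swaps OpenCV for Pillow's `ImageDraw`, since the question already opens the image with PIL. It assumes `image_to_data` is called with `output_type=pytesseract.Output.DICT`; the names `draw_boxes`, `annotate_file`, `min_conf` and the output filename are made up for illustration, and the threshold of 60 is an arbitrary choice.

```python
from PIL import Image, ImageDraw


def draw_boxes(image, data, min_conf=60.0):
    """Return a copy of `image` with a rectangle around each recognised word.

    `data` is assumed to be the dict produced by
    pytesseract.image_to_data(..., output_type=pytesseract.Output.DICT).
    """
    annotated = image.convert("RGB")  # convert() returns a new image
    draw = ImageDraw.Draw(annotated)
    for i, text in enumerate(data["text"]):
        # conf can be an int, float or string depending on the version;
        # -1 marks rows that are not words (page, block, line, ...).
        conf = float(data["conf"][i])
        if conf < 0 or not text.strip():
            continue
        left, top = data["left"][i], data["top"][i]
        right = left + data["width"][i]
        bottom = top + data["height"][i]
        # Red boxes flag low-confidence words for the manual checker.
        colour = "green" if conf >= min_conf else "red"
        draw.rectangle([left, top, right, bottom], outline=colour, width=2)
    return annotated


def annotate_file(src, dst, min_conf=60.0):
    """Hypothetical helper: run OCR on `src` and save the annotated copy."""
    import pytesseract  # imported here so draw_boxes works without Tesseract

    image = Image.open(src)
    data = pytesseract.image_to_data(image, output_type=pytesseract.Output.DICT)
    draw_boxes(image, data, min_conf).save(dst)
```

Calling `annotate_file('test.jpg', 'test_boxes.jpg')` should leave the original file untouched and write a copy with the boxes drawn on it. With OpenCV the drawing step would be `cv2.rectangle()` on a NumPy array instead.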

0 Answers