
How can we use OCR to extract numbers from the following format? [image]

I tried EasyOCR and Tesseract, and both fail when the numbers are enclosed in boxes like these. Even when the numbers are typed (not handwritten), the boxes are still a problem, because both tools generally perform well when the boxes are absent.

What would be a good way to extract the numbers from these boxes, given that the boxes are sometimes contiguous and connected to each other?

Has any significant work been done on this? I would expect data extraction from documents to be a common problem.

Thanks

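Code:

import easyocr

def DL_OCR(path):
    # Load the English recognition model (slow; reuse the reader if calling repeatedly).
    reader = easyocr.Reader(['en'])
    # readtext returns a list of (bounding_box, text, confidence) tuples.
    result = reader.readtext(path)
    # Keep only the recognized text, joined with spaces.
    return " ".join(text for _, text, _ in result)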
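
One preprocessing idea that might help is to erase the box borders before running OCR, on the assumption that borders are long straight runs of ink while digit strokes are short. The sketch below is only an illustration of that idea in plain Python: `remove_long_runs` and `min_len` are hypothetical names, the input is assumed to be an already binarized grid (1 = ink, 0 = background), and loading and thresholding the image are left out.

```python
def remove_long_runs(grid, min_len):
    """Return a copy of a binary grid (1 = ink) with long straight runs erased.

    Any horizontal or vertical run of ink at least `min_len` pixels long is
    assumed to be a box border and set to 0. Shorter runs are kept.
    """
    height = len(grid)
    width = len(grid[0]) if height else 0
    erase = [[False] * width for _ in range(height)]

    # Mark long horizontal runs, row by row.
    for y in range(height):
        x = 0
        while x < width:
            if grid[y][x]:
                start = x
                while x < width and grid[y][x]:
                    x += 1
                if x - start >= min_len:
                    for i in range(start, x):
                        erase[y][i] = True
            else:
                x += 1

    # Mark long vertical runs, column by column.
    for x in range(width):
        y = 0
        while y < height:
            if grid[y][x]:
                start = y
                while y < height and grid[y][x]:
                    y += 1
                if y - start >= min_len:
                    for i in range(start, y):
                        erase[i][x] = True
            else:
                y += 1

    # Runs are measured on the original grid, so corners are erased too.
    return [
        [0 if erase[y][x] else grid[y][x] for x in range(width)]
        for y in range(height)
    ]


# Toy example: a 7x7 box outline with a short vertical stroke inside it.
box = [
    [1, 1, 1, 1, 1, 1, 1],
    [1, 0, 0, 0, 0, 0, 1],
    [1, 0, 0, 1, 0, 0, 1],
    [1, 0, 0, 1, 0, 0, 1],
    [1, 0, 0, 1, 0, 0, 1],
    [1, 0, 0, 0, 0, 0, 1],
    [1, 1, 1, 1, 1, 1, 1],
]
cleaned = remove_long_runs(box, min_len=5)
```

In the toy example only the three-pixel stroke should survive. On real scans `min_len` would need to be longer than the tallest digit stroke, and a digit that touches a border would lose the pixels it shares with that border, so this is a starting point rather than a full solution.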
Sadaf Shafi
