How can we use OCR to extract numbers from the following format?
I tried EasyOCR and Tesseract, and both fail when the numbers sit inside boxes like these. Even when the numbers are typed (not handwritten), the boxes are still a problem; both engines generally perform well when the boxes are absent.
What would be a good way to extract the numbers from these boxes, given that the boxes are sometimes contiguous and connected to each other?
Has any significant work been done on this? I would think data extraction from documents is a common problem.
Thanks
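Code:
import easyocr

def DL_OCR(path):
    # Load the English recognition model.
    reader = easyocr.Reader(['en'])
    # Each result is a (bounding box, text, confidence) tuple.
    result = reader.readtext(path)
    string = ""
    for x in result:
        # Append the recognized text, separated by spaces.
        string += x[1] + " "
    return string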
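
One preprocessing idea that is often suggested for boxed fields is to erase the box lines before handing the image to the OCR engine. The sketch below is only an illustration of that idea, not something tested on the forms in question: it assumes the image has already been binarized into rows of 0 (background) and 1 (ink), and it clears any horizontal or vertical run of ink that is at least `min_run` pixels long. The function name `remove_box_lines` and the parameter `min_run` are hypothetical, and `min_run` would need to be longer than the longest straight stroke in a digit, otherwise strokes such as the stem of a "1" would be erased too.

```python
def remove_box_lines(binary, min_run=5):
    """Clear horizontal and vertical ink runs of at least min_run pixels.

    binary is a list of rows, each a list of 0 (background) or 1 (ink).
    Returns a new image; the input is not modified.
    """
    height = len(binary)
    width = len(binary[0]) if height else 0
    to_clear = set()

    # Horizontal runs: scan each row from left to right.
    for y in range(height):
        x = 0
        while x < width:
            if binary[y][x]:
                start = x
                while x < width and binary[y][x]:
                    x += 1
                if x - start >= min_run:
                    to_clear.update((y, i) for i in range(start, x))
            else:
                x += 1

    # Vertical runs: scan each column from top to bottom.
    for x in range(width):
        y = 0
        while y < height:
            if binary[y][x]:
                start = y
                while y < height and binary[y][x]:
                    y += 1
                if y - start >= min_run:
                    to_clear.update((i, x) for i in range(start, y))
            else:
                y += 1

    # Rebuild the image with the long runs set to background.
    return [
        [0 if (y, x) in to_clear else binary[y][x] for x in range(width)]
        for y in range(height)
    ]
```

On a real scan the same effect is usually achieved with morphological operations from an image library, since a pure-Python loop is slow on full pages. Digits that touch the box border may lose pixels where they overlap the line, so the cleaned image may still need some repair before OCR.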