I have multiple transaction receipts and am trying to extract the invoice amount from each one. The problem is that the OCR I am using fails to capture certain amounts from the documents. I use pdf2image to convert the PDF documents to images, then Pillow and pytesseract to run OCR on those images. I then convert the text into OCR HTML files and extract data using keywords and locations. However, some information is still not extracted from the PDFs. How can I solve this?
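For reference, the pipeline described above can be sketched roughly as follows. This is a minimal sketch, not my exact code: the helper names `find_amounts` and `ocr_pdf`, the keyword `"invoice amount"`, the amount pattern, the 300 DPI setting, the grayscale conversion, and the `--psm 6` page segmentation mode are all assumptions chosen for illustration. Raising the DPI and trying a different page segmentation mode are common things to experiment with when Tesseract skips text, but whether they help depends on the receipts.

```python
import re

# Assumed amount format: digits with optional thousands separators and two
# decimals (e.g. 1,234.56 or 99.00), optionally preceded by a currency symbol.
AMOUNT_RE = re.compile(r"[$€£]?\s*(\d{1,3}(?:,\d{3})*\.\d{2}|\d+\.\d{2})")


def find_amounts(text, keyword="invoice amount"):
    """Return amounts found on lines containing the keyword (case-insensitive)."""
    amounts = []
    for line in text.splitlines():
        if keyword.lower() in line.lower():
            amounts.extend(m.group(1) for m in AMOUNT_RE.finditer(line))
    return amounts


def ocr_pdf(pdf_path, dpi=300):
    """Render each PDF page to an image and return the OCR text per page."""
    # Imported here so find_amounts can be used without the OCR stack installed.
    from pdf2image import convert_from_path
    import pytesseract

    pages = convert_from_path(pdf_path, dpi=dpi)  # list of PIL images
    texts = []
    for page in pages:
        gray = page.convert("L")  # grayscale; assumed to help on noisy scans
        # --psm 6 treats the page as a single uniform block of text (assumption).
        texts.append(pytesseract.image_to_string(gray, config="--psm 6"))
    return texts
```

Usage would look like `for text in ocr_pdf("receipt.pdf"): print(find_amounts(text))`, where `receipt.pdf` is a placeholder file name.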

  • Hi @Developer, if you are not able to achieve this with Python, you can try Microsoft OneNote. Just paste the image onto a blank page, right-click it, and choose "Copy the text of this image". That's it! You can paste thousands of images. – dbz Jun 11 '19 at 07:35
  • Hi @dbz. Thanks, but I must use Python. – Developer Jun 11 '19 at 08:57

0 Answers