I am trying to perform OCR with Tesseract. Converting a PDF to text with the Tesseract Java library works as expected. My requirements have now grown a bit: I need to extract metadata from a template-based form (a passport is a good example, with fixed positions for first name, date of birth, etc.). The input can be either a PDF or an image that follows the same template.
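To make the requirement concrete, here is a rough sketch of what I have in mind: each field on the template gets a fixed rectangle, and OCR runs on each rectangle separately. The class `TemplateZones`, the field names, and the coordinates are all hypothetical. The OCR step is passed in as a plain function so the sketch runs without any OCR library. With Tess4J, I assume a call such as `doOCR(BufferedImage, Rectangle)` could take its place. PDF pages would have to be rendered to images first, and the scan would have to be aligned and scaled to the template for fixed coordinates to work.

```java
import java.awt.Rectangle;
import java.awt.image.BufferedImage;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.Function;

// Hypothetical helper: none of these names come from Tesseract or Tess4J.
class TemplateZones {

    // Crops each named zone out of the page and passes the crop to an OCR function.
    // Zones are clipped to the page bounds; a zone entirely off the page yields "".
    static Map<String, String> extract(BufferedImage page,
                                       Map<String, Rectangle> zones,
                                       Function<BufferedImage, String> ocr) {
        Map<String, String> fields = new LinkedHashMap<>();
        Rectangle bounds = new Rectangle(0, 0, page.getWidth(), page.getHeight());
        for (Map.Entry<String, Rectangle> entry : zones.entrySet()) {
            Rectangle zone = entry.getValue().intersection(bounds);
            if (zone.isEmpty()) {
                fields.put(entry.getKey(), "");
                continue;
            }
            BufferedImage crop = page.getSubimage(zone.x, zone.y, zone.width, zone.height);
            fields.put(entry.getKey(), ocr.apply(crop).trim());
        }
        return fields;
    }

    public static void main(String[] args) {
        // Blank stand-in for a scanned page; a real page would be loaded with ImageIO.
        BufferedImage page = new BufferedImage(600, 400, BufferedImage.TYPE_INT_RGB);

        // Assumed template layout: pixel coordinates are made up for illustration.
        Map<String, Rectangle> zones = new LinkedHashMap<>();
        zones.put("firstName", new Rectangle(150, 40, 300, 30));
        zones.put("dateOfBirth", new Rectangle(150, 90, 200, 30));

        // Placeholder OCR that only reports the crop size instead of reading text.
        Map<String, String> fields =
                extract(page, zones, img -> img.getWidth() + "x" + img.getHeight());
        fields.forEach((name, value) -> System.out.println(name + " = " + value));
    }
}
```

What I do not know is whether this per-zone approach is the recommended way to do it with Tesseract, or whether there is better built-in support for templates.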
I am having a hard time finding any example or article that shows how to achieve this with Tesseract, or where to get further help.
So my basic questions are:
- Is this possible using Tesseract?
- Are there any examples or articles about how to achieve this using Tesseract?
- Is there any other software or library that is recommended for this?
Thanks for reading this.