Text Recognition through AWS Rekognition Fails to Detect Majority of Text

Question

I am using AWS Rekognition to detect text in a PDF that has been converted to a JPEG. The image is a regular letter-size page with text at roughly 10-12 pt. However, the font changes several times throughout the image.

Are the missed detections and low confidence levels caused by the frequent font changes, or by the small font size?

Essentially, I'd like to know what kind of image and text I need in order to get the best results from a text-detection algorithm.

Mausam Sharma · Accepted Answer · 2020-06-29T03:38:27.767


The DetectText API can detect up to 50 words in an image, and to be detected, text must be within +/- 30 degrees of the horizontal axis (these are the documented limits at the time of writing and may have changed since).

You are trying to extract a page full of text, and that's the problem :)
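One way to check whether you are running into the word limit is to count the WORD detections that come back. The following is a minimal sketch, assuming boto3 is installed and AWS credentials are configured; the function names, the region default, and the 80.0 confidence threshold are illustrative assumptions, not anything from the Rekognition documentation.

```python
def summarize_detections(response, min_confidence=80.0):
    """Count WORD detections and list the ones below a confidence threshold.

    `response` is the dict returned by Rekognition's detect_text call.
    The 80.0 default threshold is an arbitrary value chosen for illustration.
    """
    words = [d for d in response.get("TextDetections", [])
             if d.get("Type") == "WORD"]
    low = [d["DetectedText"] for d in words
           if d.get("Confidence", 0.0) < min_confidence]
    return {"word_count": len(words), "low_confidence": low}


def detect_text_in_file(image_path, region_name="us-east-1"):
    """Send a local image to Rekognition DetectText (hypothetical helper)."""
    import boto3  # imported here so the summary helper works without boto3

    client = boto3.client("rekognition", region_name=region_name)
    with open(image_path, "rb") as f:
        return client.detect_text(Image={"Bytes": f.read()})
```

If `summarize_detections(detect_text_in_file("page.jpg"))["word_count"]` comes back at or near the limit for a page that clearly contains more words, the cap is the likely cause and not the font.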

AWS now provides Amazon Textract, a service specifically intended for OCR on images and documents.
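A rough sketch of calling Textract on the same JPEG with boto3 might look like this. It assumes boto3 is installed and credentials are configured; the function names, the file path, and the region default are placeholders I made up for the example.

```python
def extract_lines(response):
    """Return the text of every LINE block in a Textract response dict."""
    return [block["Text"] for block in response.get("Blocks", [])
            if block.get("BlockType") == "LINE"]


def detect_lines(image_path, region_name="us-east-1"):
    """Run Textract DetectDocumentText on a local image (hypothetical helper)."""
    import boto3  # imported here so extract_lines works without boto3

    client = boto3.client("textract", region_name=region_name)
    with open(image_path, "rb") as f:
        response = client.detect_document_text(Document={"Bytes": f.read()})
    return extract_lines(response)
```

Calling `detect_lines("page.jpg")` would return the page's text line by line; each block in the raw response also carries a `Confidence` value if you want to filter on it.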

edited Jun 29 '20 at 03:38

answered May 17 '18 at 13:20

