How to improve the output of Tesseract OCR

Asked Jul 27 '23 at 10:02

Active Jul 27 '23 at 11:28

Viewed 71 times

I am trying to extract text from an Aadhaar card (ID) using Tesseract OCR, but I am getting incomplete results.

For example, it does not detect 'Government of India', which is printed at the top of the card, and in some cases it does not detect the name and gender.

I have tried to recover the complete text by applying image preprocessing, specifically adaptive thresholding:

cv2.adaptiveThreshold(gray_image, 255, cv2.ADAPTIVE_THRESH_MEAN_C, cv2.THRESH_BINARY, 9, 20)

cv2.adaptiveThreshold(gray_image, 255, cv2.ADAPTIVE_THRESH_GAUSSIAN_C, cv2.THRESH_BINARY, 11, 2)

After applying thresholding, I got almost all of the text, but also some incorrect text that is not on the ID card.

There is another problem: the block size and the constant value have to be chosen by the developer, and the same values will not work for different images.
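To make the role of the two parameters concrete, the following is a simplified pure-Python sketch of mean adaptive thresholding. It is not OpenCV's implementation and is not bit-exact with it; the function name `adaptive_mean_threshold` and the toy pixel values are hypothetical. Each pixel is kept white when it is brighter than the mean of its neighbourhood minus the constant, so a small constant lets slight background variation show up as "ink".

```python
def adaptive_mean_threshold(gray, block_size, c, max_value=255):
    """Simplified mean adaptive threshold on a list-of-lists grayscale image.

    A pixel becomes max_value when it is greater than
    (mean of its block_size x block_size neighbourhood) - c, else 0.
    Borders are handled by clamping coordinates (edge replication).
    """
    if block_size % 2 == 0 or block_size < 3:
        raise ValueError("block_size must be odd and >= 3")
    height, width = len(gray), len(gray[0])
    radius = block_size // 2
    out = [[0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            total = 0
            for dy in range(-radius, radius + 1):
                yy = min(max(y + dy, 0), height - 1)
                for dx in range(-radius, radius + 1):
                    xx = min(max(x + dx, 0), width - 1)
                    total += gray[yy][xx]
            local_mean = total / (block_size * block_size)
            out[y][x] = max_value if gray[y][x] > local_mean - c else 0
    return out


# Toy one-row "scan line" (made-up values): bright paper (200), two ink
# pixels (60), and one slightly darker paper pixel (185) as background texture.
row = [[200, 200, 60, 60, 200, 200, 185, 200, 200]]

with_c_20 = adaptive_mean_threshold(row, block_size=3, c=20)
with_c_2 = adaptive_mean_threshold(row, block_size=3, c=2)

print(with_c_20[0])  # [255, 255, 0, 0, 255, 255, 255, 255, 255]
print(with_c_2[0])   # [255, 255, 0, 0, 255, 255, 0, 255, 255]
```

With the constant set to 20, only the two ink pixels turn black. With the constant set to 2, the slightly darker paper pixel is also marked as ink, which is one plausible source of the spurious text described above.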

Please let me know if you have any ideas about how to solve this problem.

edited Jul 27 '23 at 11:28


Groot
