I am testing pytesseract and using it to extract digits from images like the one below.
The image is of fairly decent quality (200 dpi). However, when I run pytesseract, it returns 456-/8-0000, where the digit 7 is misrecognized as '/'. While '/' obviously bears some resemblance to the digit 7, I am still surprised by the error given the quality of the image.
I tried both
pytesseract.image_to_string(img)
and
pytesseract.image_to_string(img, lang='eng', config='--psm 13 --oem 2 -c tessedit_char_whitelist=0123456789-')
Both calls yielded the same result, even though '/' is not in the whitelist passed to the second one.
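For reference, here is a minimal sketch of how the second call could be wrapped so that the page segmentation mode, engine mode, and whitelist are easy to vary. The helper names `build_config` and `ocr_digits` are hypothetical (they are not part of pytesseract), and the sketch assumes pytesseract, Pillow, and the Tesseract binary are installed.

```python
def build_config(psm=13, oem=2, whitelist="0123456789-"):
    """Assemble the Tesseract config string used in the second call above."""
    return f"--psm {psm} --oem {oem} -c tessedit_char_whitelist={whitelist}"


def ocr_digits(image_path, **config_kwargs):
    """Run pytesseract on one image file and return the stripped text.

    Hypothetical helper: the OCR dependencies are imported lazily so that
    build_config can be used on its own.
    """
    import pytesseract
    from PIL import Image

    img = Image.open(image_path)
    text = pytesseract.image_to_string(
        img, lang="eng", config=build_config(**config_kwargs)
    )
    return text.strip()
```

With the defaults, `build_config()` reproduces the config string from the second call; something like `ocr_digits("digits.png", psm=7)` (the file name is a placeholder) would rerun the same image with a different segmentation mode.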
Any pointers on how to improve the recognition accuracy would be great. Thanks!