I am working in Python using PyTesseract and OpenCV.
I have a photo containing a mix of numbers and letters. It shows a date in the format DDMMMYY, e.g. 01JAN22. Tesseract has trouble telling 0 from O, along with a few other letter/number mix-ups.
Is there a way to blacklist / whitelist characters for specific positions in a string? I know I can blacklist / whitelist characters for the whole image_to_string call using config="-c tessedit_char_blacklist=".
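As a minimal sketch of that whole-image approach: the helper names `build_whitelist_config` and `read_date` are made up for illustration, and `--psm 7` is an assumption that the photo holds a single line of text. It restricts every position to the same set of characters, which is exactly the limitation I am trying to get around.

```python
import string


def build_whitelist_config(allowed_chars: str, psm: int = 7) -> str:
    """Build a Tesseract config string that restricts output to allowed_chars."""
    return f"--psm {psm} -c tessedit_char_whitelist={allowed_chars}"


# Digits plus uppercase letters cover every character a DDMMMYY date can contain.
DATE_CONFIG = build_whitelist_config(string.digits + string.ascii_uppercase)


def read_date(image) -> str:
    """Run OCR on an image (e.g. an array loaded with OpenCV) using the whitelist."""
    # Imported here so the config helper above works without Tesseract installed.
    import pytesseract

    return pytesseract.image_to_string(image, config=DATE_CONFIG).strip()
```

This still lets Tesseract pick O where a 0 belongs, because both characters are allowed everywhere in the image.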
For example: for char[0], whitelist 0-3 (as it's a date, the first character will be 0, 1, 2 or 3).
The image below is an example of what I am working with. Currently Tesseract returns OSJUNZ2, which is very close to the correct 05JUN22.
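To illustrate the kind of per-position rule I have in mind, here is a rough sketch that applies it as post-processing on the returned string rather than inside Tesseract. The function name `fix_date_string` and the look-alike tables are hypothetical and only cover the mix-ups I have guessed at; it assumes positions 0-1 and 5-6 are digits and positions 2-4 are letters.

```python
# Hypothetical look-alike tables; extend them with whatever mix-ups show up in practice.
TO_DIGIT = {"O": "0", "Q": "0", "D": "0", "I": "1", "L": "1",
            "Z": "2", "S": "5", "G": "6", "B": "8"}
TO_LETTER = {"0": "O", "1": "I", "2": "Z", "5": "S", "6": "G", "8": "B"}


def fix_date_string(raw: str) -> str:
    """Swap look-alike characters based on their position in a DDMMMYY string."""
    text = raw.strip().upper()
    if len(text) != 7:
        # Unexpected length: leave the text alone rather than guess.
        return text
    fixed = []
    for i, ch in enumerate(text):
        if 2 <= i <= 4:
            # Month positions (MMM) must be letters.
            fixed.append(TO_LETTER.get(ch, ch))
        else:
            # Day (DD) and year (YY) positions must be digits.
            fixed.append(TO_DIGIT.get(ch, ch))
    return "".join(fixed)


print(fix_date_string("OSJUNZ2"))  # prints 05JUN22
```

This does not check that the first digit is actually in the range 0-3, and it only helps when the misread character has an entry in the tables, so I would still prefer a way to constrain Tesseract itself.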
Thanks for your help