Can we improve Tesseract's character recognition accuracy with a regular expression? For example, can we tell Tesseract that the text will have this kind of structure:
4characters2Digits[4Digits]3char3Digits2char
// Our string in the image is "abcd12[2222]aBc000AB"
// Our regular expression can be like this
String reg = "[a-zA-Z]{4}\\d{2}\\[\\d{4}\\][a-zA-Z]{3}\\d{3}[a-zA-Z]{2}";
I think that with this kind of hint Tesseract would recognize the characters more accurately.
We can also set:
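To make the expected structure concrete, here is a minimal sketch in plain Java that checks a string against the pattern above. It does not involve Tesseract at all and only shows what a "valid" result would look like; the class name `OcrPatternCheck` and the method `looksValid` are hypothetical names used for illustration, and the backslashes are doubled because Java string literals require it.

```java
import java.util.regex.Pattern;

// Hypothetical helper: verifies that a recognized string has the expected structure.
class OcrPatternCheck {
    // 4 letters, 2 digits, "[", 4 digits, "]", 3 letters, 3 digits, 2 letters
    static final Pattern EXPECTED = Pattern.compile(
            "[a-zA-Z]{4}\\d{2}\\[\\d{4}\\][a-zA-Z]{3}\\d{3}[a-zA-Z]{2}");

    // Returns true only if the whole (trimmed) text matches the pattern.
    static boolean looksValid(String ocrText) {
        return EXPECTED.matcher(ocrText.trim()).matches();
    }

    public static void main(String[] args) {
        System.out.println(looksValid("abcd12[2222]aBc000AB")); // true
        System.out.println(looksValid("abcd1Z[2222]aBc000AB")); // false: 'Z' where a digit is expected
    }
}
```

A check like this can only accept or reject a result after recognition; what I am asking is whether Tesseract itself can use such a pattern during recognition.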
tesseract.setTessVariable("tessedit_char_whitelist", "0123456789[]abc...Z");
Note: I am using Java with Tess4J.
Thank you!