Tesseract can't recognize exclamation marks

Question

I use Tesseract 5 and pytesseract.

My picture is:

I tried different pre-processing methods: scaling, resizing, binarization, blur, dilation, etc.

At the same time, it works fine for "!?#@abc!!".

I would be glad of any advice.

Add a more "learned" data dictionary: https://stackoverflow.com/questions/9568165/custom-dictionary-for-tesseract – apollosoftware.org, Jan 08 '22 at 20:56

score 0 · Answer 1 · answered Jan 08 '22 at 21:32

Wow, that's a lot of punctuation!

Models of English language text will predict that "bang bang bang bang, followed by bang" occurs with very low probability. This impacts recognition accuracy.

For such specialized inputs, you will want to train a new language model: https://stackoverflow.com/a/13556952/8431111
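
Before retraining, one inexpensive experiment is to weaken the dictionary's influence through configuration. The sketch below is only an illustration, not a confirmed fix: the helper names `build_tesseract_config` and `read_punctuation` are hypothetical, and it assumes the Tesseract variables `load_system_dawg`, `load_freq_dawg` and `tessedit_char_whitelist` have a useful effect with your engine mode and model, which may not be the case.

```python
def build_tesseract_config(psm=7, whitelist=None, use_dictionaries=False):
    """Build a Tesseract config string (hypothetical helper).

    psm=7 asks Tesseract to treat the image as a single text line.
    """
    parts = [f"--psm {psm}"]
    if not use_dictionaries:
        # Assumption: switching off the word lists reduces the bias
        # against unusual runs such as "!!!!!".
        parts += ["-c load_system_dawg=0", "-c load_freq_dawg=0"]
    if whitelist:
        # Restrict output to the given characters (avoid quotes and spaces here).
        parts.append(f"-c tessedit_char_whitelist={whitelist}")
    return " ".join(parts)


def read_punctuation(image_path, **config_options):
    """Run OCR on one image with the config built above (hypothetical helper)."""
    # Imported lazily so that building the config needs no third-party packages.
    import pytesseract
    from PIL import Image

    config = build_tesseract_config(**config_options)
    with Image.open(image_path) as image:
        return pytesseract.image_to_string(image, lang="eng", config=config).strip()
```

For example, `read_punctuation("bangs.png", whitelist="!?")` would run Tesseract on a hypothetical file `bangs.png` with the dictionaries disabled. If the output does not improve, training as described above remains the more reliable route.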

J_H


Thank you! I updated eng.traineddata, but it looks like this solution doesn't work in my case. Could you please help me with training on images using the tesstrain project? I don't understand how to update the eng model with my images and text files. – Pablo Jan 23 '22 at 19:29
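
As a rough starting point for the tesstrain question: as far as I know, tesstrain reads its training data from a folder named `data/MODEL_NAME-ground-truth`, where each line image (`.png` or `.tif`) sits next to a single-line transcription with the same base name and the extension `.gt.txt`. The sketch below only prepares that folder. The function name `prepare_ground_truth` and the model name `bang` are hypothetical, and the layout should be checked against the tesstrain README for your version.

```python
from pathlib import Path
import shutil


def prepare_ground_truth(samples, model_name, data_root="data"):
    """Copy line images and write matching .gt.txt files (hypothetical helper).

    samples: iterable of (image_path, text) pairs, one text line per image.
    Returns the list of transcription files that were written.
    """
    gt_dir = Path(data_root) / f"{model_name}-ground-truth"
    gt_dir.mkdir(parents=True, exist_ok=True)

    written = []
    for image_path, text in samples:
        image_path = Path(image_path)
        if image_path.suffix.lower() not in {".png", ".tif"}:
            raise ValueError(f"unsupported image type: {image_path.name}")
        if "\n" in text:
            raise ValueError("each transcription must be a single line")

        # The image and its transcription share the same base name.
        shutil.copyfile(image_path, gt_dir / image_path.name)
        gt_file = gt_dir / (image_path.stem + ".gt.txt")
        gt_file.write_text(text + "\n", encoding="utf-8")
        written.append(gt_file)
    return written
```

After something like `prepare_ground_truth([("line1.png", "Hello!!!!!")], "bang")`, fine-tuning from the English model would then be started from the tesstrain checkout, along the lines of `make training MODEL_NAME=bang START_MODEL=eng TESSDATA=/path/to/tessdata`. Treat the exact variable names as something to confirm in the README.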
