I use Tesseract 5 and pytesseract.

My picture is:

1

I tried various pre-processing methods: scaling, resizing, binarization, blurring, dilation, etc.
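For context, a pipeline of this kind might look roughly like the sketch below. It assumes OpenCV (`cv2`) for the image operations. The function names, the 2x scale factor, the kernel sizes, and the `--oem 1 --psm 7` settings are illustrative assumptions, not values taken from the original attempt.

```python
def build_config(psm=7, oem=1):
    """Build a Tesseract config string.

    --psm 7 treats the image as a single text line and --oem 1 selects the
    LSTM engine. Both values are assumptions chosen for illustration.
    """
    return f"--oem {oem} --psm {psm}"


def preprocess_and_ocr(path, scale=2.0):
    """Grayscale, upscale, blur, binarize, and dilate an image, then OCR it."""
    # Imported here so build_config() stays usable without these packages.
    import cv2
    import pytesseract

    image = cv2.imread(path)
    if image is None:
        raise FileNotFoundError(path)

    gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
    # Upscale, since very small glyphs are often recognized poorly.
    gray = cv2.resize(gray, None, fx=scale, fy=scale,
                      interpolation=cv2.INTER_CUBIC)
    # Light blur to suppress noise before thresholding.
    gray = cv2.GaussianBlur(gray, (3, 3), 0)
    # Otsu's method picks the binarization threshold automatically.
    _, binary = cv2.threshold(gray, 0, 255,
                              cv2.THRESH_BINARY + cv2.THRESH_OTSU)
    # cv2.dilate grows bright regions, so on dark-on-white text it thins
    # the strokes; use cv2.erode instead to thicken them.
    kernel = cv2.getStructuringElement(cv2.MORPH_RECT, (2, 2))
    binary = cv2.dilate(binary, kernel, iterations=1)

    return pytesseract.image_to_string(binary, config=build_config())
```

Calling `preprocess_and_ocr("picture.png")` (a hypothetical file name) returns the recognized text as a string.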

At the same time, it works fine for "!?#@abc!!".

I would be glad of any advice.

eshirvana
Pablo

1 Answer

Wow, that's a lot of punctuation!

Models of English language text will predict that "bang bang bang bang, followed by bang" occurs with very low probability. This impacts recognition accuracy.

For such specialized inputs, you will want to train a new language model: https://stackoverflow.com/a/13556952/8431111
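If you go the fine-tuning route with the tesstrain project, the first step is usually to pair each line image with a transcription file. The sketch below shows one way to lay that out. It assumes tesstrain's convention of a `data/<MODEL_NAME>-ground-truth/` directory holding each image next to a `.gt.txt` file with the same base name; the model name `punct` and the helper `write_ground_truth` are hypothetical names used only for illustration.

```python
from pathlib import Path
import shutil


def write_ground_truth(samples, model_name="punct", data_dir="data"):
    """Copy line images and write a matching .gt.txt file for each one.

    samples: iterable of (image_path, text) pairs, one text line per image.
    model_name: hypothetical model name; tesstrain is assumed to read from
        <data_dir>/<model_name>-ground-truth/.
    Returns the list of transcription files that were written.
    """
    gt_dir = Path(data_dir) / f"{model_name}-ground-truth"
    gt_dir.mkdir(parents=True, exist_ok=True)

    written = []
    for image_path, text in samples:
        image_path = Path(image_path)
        target = gt_dir / image_path.name
        # Skip the copy when the image already lives in the target directory.
        if image_path.resolve() != target.resolve():
            shutil.copy(image_path, target)
        # line1.png -> line1.gt.txt, containing exactly one line of text.
        gt_file = gt_dir / (image_path.stem + ".gt.txt")
        gt_file.write_text(text.strip() + "\n", encoding="utf-8")
        written.append(gt_file)
    return written
```

With the files in place, fine-tuning from the existing English model is, as far as I understand the tesstrain README, started from the tesstrain checkout with something like `make training MODEL_NAME=punct START_MODEL=eng TESSDATA=<path to your tessdata directory>`. Check the README for the options your version supports.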

J_H
  • Thank you! I updated eng.traineddata, but it looks like this solution doesn't work in my case. Could you please help me with training on images using the tesstrain project? I don't understand how to update the eng model with my own images and text files. – Pablo Jan 23 '22 at 19:29