I have a picture of a plate with the text "991AAA". I binarized it with OpenCV, found the relevant contours, and passed them to Tesseract OCR, but it reads the text as "ЧЧЛААА". (I am using the rus language data; I guess '9' looks a bit like 'Ч', but not that much.) Is there a problem with my Tesseract config ("-l rus+eng --psm 10")? Am I missing something? The image looks clean, and if I pass only the crop containing a single "9", it is read correctly, so I don't think the image is the problem. How can I improve the result? P.S. If I change the config to "-l eng --psm 10", I get "O9-]AAA". Well, at least one "9" is recognized.
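For reference, `--psm 10` tells Tesseract to treat the image as a single character, while `--psm 7` treats it as a single text line and `--psm 8` as a single word. If the whole plate is passed as one image, a line or word mode may be a better fit. The sketch below only assembles a config string of the kind quoted above; the helper name `build_tesseract_config` is hypothetical, and whether `tessedit_char_whitelist` is honored depends on the Tesseract version and engine in use, so treat the whitelist as an assumption to verify.

```python
def build_tesseract_config(langs, psm, whitelist=""):
    """Assemble a Tesseract command-line config string (hypothetical helper).

    langs     -- list of language codes, e.g. ["rus", "eng"]
    psm       -- page segmentation mode (7 = single line, 10 = single character)
    whitelist -- optional set of allowed characters; support for this
                 variable varies between Tesseract versions/engines
    """
    parts = ["-l", "+".join(langs), "--psm", str(psm)]
    if whitelist:
        parts += ["-c", f"tessedit_char_whitelist={whitelist}"]
    return " ".join(parts)


# Config for a whole plate treated as one text line, restricted to
# digits and Latin capitals (an assumption about the plate alphabet).
config = build_tesseract_config(
    ["eng"], 7, "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ"
)
print(config)
```

The resulting string can be passed wherever the original config string was used.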

  • Can you add a sample starting image to your question? – user3169 Dec 11 '18 at 21:47
  • In general, I don't think there is an easy way to do that. If you switch to the C++ API, you can get a RIL_SYMBOL iterator, go through the alternatives for each symbol, and make sure they follow the pattern. With earlier versions of Tesseract (without LSTM) you could specify a user pattern, but that alone wouldn't have solved your issue anyway, because it only slightly increases the probability of getting what you expect. I think that for plate recognition you would be better off with a neural network, because you have a single font and plenty of training data online. – Dmitrii Z. Dec 12 '18 at 21:23
  • @DmitriiZ. I have zero experience with neural networks and don't have much time to finish the project. Should I try to train Tesseract, or is it better to find another solution for recognition? –  Dec 16 '18 at 13:33
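
The idea from the comments (walk through the alternatives for each symbol and keep only those that fit the plate pattern) can be sketched independently of the OCR engine. This is a minimal sketch, assuming the per-symbol alternatives have already been collected as (character, confidence) pairs; the function name `pick_by_pattern`, the pattern letters `D`/`L`, and the sample confidences are all made up for illustration and are not Tesseract output.

```python
DIGITS = set("0123456789")
LETTERS = set("ABCDEFGHIJKLMNOPQRSTUVWXYZ")  # Latin capitals only (assumption)


def pick_by_pattern(alternatives, pattern):
    """Choose, per position, the most confident candidate that fits the pattern.

    alternatives -- one list per symbol of (character, confidence) pairs
    pattern      -- string of 'D' (digit) and 'L' (letter), one per symbol
    Positions with no fitting candidate are returned as '?'.
    """
    if len(alternatives) != len(pattern):
        raise ValueError("pattern length must match the number of symbols")
    allowed = {"D": DIGITS, "L": LETTERS}
    result = []
    for candidates, kind in zip(alternatives, pattern):
        # Drop candidates of the wrong class, e.g. 'Ч' where a digit is expected.
        valid = [(ch, conf) for ch, conf in candidates if ch in allowed[kind]]
        if valid:
            result.append(max(valid, key=lambda c: c[1])[0])
        else:
            result.append("?")
    return "".join(result)


# Hypothetical alternatives for a plate laid out as three digits + three letters.
sample = [
    [("Ч", 0.61), ("9", 0.58)],
    [("Ч", 0.60), ("9", 0.55)],
    [("Л", 0.50), ("1", 0.47)],
    [("A", 0.90)],
    [("A", 0.88)],
    [("A", 0.91)],
]
print(pick_by_pattern(sample, "DDDLLL"))  # prints 991AAA
```

Filtering by character class rather than taking the top candidate is what lets a slightly less confident '9' win over 'Ч' in a digit position.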

0 Answers