
I am using pytesseract for license plate recognition, and I want to improve accuracy by giving Tesseract a whitelist of words so that it can only output entries from that whitelist. At the moment I am using this command:

text = pytesseract.image_to_string(img, lang="eng", config="--psm 7 -c tessedit_char_whitelist=0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ ")

I tried adding --user-words /absolute/path/to/eng.user-words, but it does not appear to change anything.

My eng.user-words is just a text file with one word per line, so it should be fine.
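For reference, a minimal sketch of how such a file and the matching config string could be put together. The helper names `write_user_words` and `build_config` and the sample plate strings are hypothetical, and the space is left out of the character whitelist here. `--user-words` is a real Tesseract command-line option, but whether the recognition engine in use actually honours the word list is an open question here, not something this sketch confirms.

```python
import shlex
from pathlib import Path

# Characters allowed in the output (taken from the command above, minus the space).
WHITELIST = "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ"


def write_user_words(words, path):
    """Write one word per line, UTF-8 encoded, ending with a newline."""
    path = Path(path)
    path.write_text("\n".join(words) + "\n", encoding="utf-8")
    return path


def build_config(user_words_path):
    """Assemble a config string for pytesseract.image_to_string."""
    return " ".join([
        "--psm 7",  # treat the image as a single text line
        # Quote the path so that spaces in it survive argument splitting.
        f"--user-words {shlex.quote(str(user_words_path))}",
        f"-c tessedit_char_whitelist={WHITELIST}",
    ])


# Hypothetical usage (requires pytesseract and an image `img`):
# words_file = write_user_words(["WX12345", "KR7A123"], "/absolute/path/to/eng.user-words")
# text = pytesseract.image_to_string(img, lang="eng", config=build_config(words_file))
```

Building the string in one place makes it easy to print the exact arguments and check that the path and the whitelist arrive as intended.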

I also tried adding the bazaar config, as described here, but that changed nothing either.

I would appreciate help with this particular problem, or any other tips on how I can use pytesseract or another OCR library to recognize a single line of text and provide it with a whitelist, as that would improve accuracy dramatically in my use case.
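To make the goal concrete, here is a rough sketch of one way a whitelist could be enforced outside the OCR engine: take whatever text comes back and snap it to the closest allowed entry. This is plain fuzzy string matching with Python's standard `difflib`, not a Tesseract feature. The function name `snap_to_whitelist`, the sample plates and the `cutoff` value are assumptions made for illustration.

```python
from difflib import get_close_matches


def snap_to_whitelist(raw_text, allowed, cutoff=0.6):
    """Return the allowed entry closest to the OCR output, or None.

    raw_text: string as returned by the OCR step (may contain noise).
    allowed:  list of valid strings (the whitelist).
    cutoff:   minimum similarity in [0, 1]; 0.6 is an arbitrary starting point.
    """
    # Normalise: upper-case and drop everything that is not a letter or digit.
    cleaned = "".join(ch for ch in raw_text.upper() if ch.isalnum())
    if cleaned in allowed:
        return cleaned
    matches = get_close_matches(cleaned, allowed, n=1, cutoff=cutoff)
    return matches[0] if matches else None


if __name__ == "__main__":
    plates = ["WX12345", "KR7A123", "PO99ZZZ"]  # hypothetical whitelist
    print(snap_to_whitelist("wx1234S\n", plates))
```

A stricter `cutoff` lowers the risk of mapping an unknown plate onto a known one, at the cost of rejecting noisier reads.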

Dolidod Teethtard
  • Does this solve your question? [custom-dictionary-for-tesseract](https://stackoverflow.com/a/13556952/20851944) – Hermann12 Apr 09 '23 at 07:18
