There seem to be two ways to go about this, but neither seems to work.

First, you can pass tessedit_char_whitelist, but that seems to work only with characters, not patterns:

import pytesseract
pytesseract.pytesseract.tesseract_cmd = r'C:\Users\.../tesseract.exe'

pytesseract.image_to_string(img, config="-c tessedit_char_whitelist=.{5,15}\d{4,8}")

That doesn't work, although whitelisting literal characters such as 'abcdefgh' does.

The second way is through a user-patterns file. I found the eng.user-patterns file and entered my regex pattern, but I don't know how to activate it. I guess it would be something like:

pytesseract.image_to_string(img, configfile="eng.user-patterns")

However, pytesseract accepts no such argument.

Nicolas Gervais

1 Answer

Tesseract does not support regular expressions, and pytesseract can do nothing about that. tessedit_char_whitelist and user-patterns are different parameters with different effects.
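Since the regex cannot be applied inside Tesseract, one possible workaround is to restrict the character set with tessedit_char_whitelist and then apply the regex to the returned text in Python. The following is only a minimal sketch of that idea: the helper names (build_whitelist_config, filter_ocr_text, ocr_then_filter) are hypothetical and not part of pytesseract, and it assumes your Tesseract build honors the whitelist for the engine mode in use.

```python
import re


def build_whitelist_config(chars):
    """Build a config string that limits recognition to literal characters.

    Hypothetical helper: `chars` must be plain characters, not a pattern.
    """
    return f"-c tessedit_char_whitelist={chars}"


def filter_ocr_text(text, pattern):
    """Return every substring of the OCR output that matches the regex.

    The regex is applied by Python's re module after OCR, not by Tesseract.
    """
    return [m.group(0) for m in re.finditer(pattern, text)]


def ocr_then_filter(img, pattern, whitelist=None):
    """Run OCR (optionally with a whitelist), then keep only regex matches."""
    # Imported lazily so the pure-Python helpers above work without pytesseract.
    import pytesseract

    config = build_whitelist_config(whitelist) if whitelist else ""
    text = pytesseract.image_to_string(img, config=config)
    return filter_ocr_text(text, pattern)
```

For example, ocr_then_filter(img, r"[A-Z]{5}\d{4}", whitelist="ABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789") would keep only tokens made of five capital letters followed by four digits. This filters the result rather than guiding recognition, so it cannot correct characters that Tesseract misread.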

user898678