There seem to be two ways to go about this, but neither seems to work.
First, you can pass tessedit_char_whitelist, but that seems to work only with literal characters, not patterns:
import pytesseract
pytesseract.pytesseract.tesseract_cmd = r'C:\Users\.../tesseract.exe'
pytesseract.image_to_string(img, config="-c tessedit_char_whitelist=.{5,15}\d{4,8}")
That doesn't work, although whitelisting a literal set of characters such as 'abcdefgh' does.
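To make the distinction concrete, here is a minimal sketch of the part that does work: whitelisting literal characters at OCR time, then applying the regular expression to the returned text in Python. This is a post-processing workaround and does not make Tesseract itself pattern-aware. The helper names (whitelist_config, find_matches, ocr_then_filter) are hypothetical, and img is assumed to be an image object that pytesseract accepts.

```python
import re
from typing import List


def whitelist_config(chars: str) -> str:
    """Build a config string that whitelists literal characters only."""
    return f"-c tessedit_char_whitelist={chars}"


def find_matches(text: str, pattern: str) -> List[str]:
    """Return every substring of the OCR output that matches the regex.

    The pattern is assumed to contain no capture groups, so re.findall
    returns the full matches.
    """
    return re.findall(pattern, text)


def ocr_then_filter(img, chars: str, pattern: str) -> List[str]:
    """Run OCR with a character whitelist, then filter the text by regex."""
    # Imported lazily so the helpers above can be used without Tesseract installed.
    import pytesseract

    text = pytesseract.image_to_string(img, config=whitelist_config(chars))
    return find_matches(text, pattern)
```

The regex only filters what Tesseract has already recognised, so it cannot steer recognition the way a real pattern constraint would.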
The second way is a user-patterns file. I have found the eng.user-patterns file and entered my RegEx pattern, but I don't know how to activate it. I guess it would be something like:
pytesseract.image_to_string(img, configfile="eng.user-patterns")
However, pytesseract accepts no such argument.
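One thing that may be worth trying, sketched below, is passing the file through the config string instead of a separate keyword argument. This assumes the installed Tesseract supports the --user-patterns command-line option and that pytesseract forwards the config string to the tesseract command unchanged. The helper names are hypothetical, and whether the patterns take effect may depend on the Tesseract version and OCR engine mode. Note also that, as far as I understand, Tesseract pattern files use their own restricted syntax rather than full regular expressions, so a pattern like .{5,15}\d{4,8} may need rewriting.

```python
def user_patterns_config(patterns_path: str) -> str:
    """Build a config string pointing Tesseract at a user-patterns file."""
    # Paths containing spaces are not handled here (assumption: simple path).
    return f"--user-patterns {patterns_path}"


def ocr_with_patterns(img, patterns_path: str = "eng.user-patterns") -> str:
    """Run OCR with the user-patterns file passed via the config string."""
    # Imported lazily so user_patterns_config can be used without Tesseract installed.
    import pytesseract

    return pytesseract.image_to_string(
        img, config=user_patterns_config(patterns_path)
    )
```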