This might not be the answer you are looking for, but I faced a similar problem with Tesseract a few months ago. You might want to take a look at whitelisting, more specifically whitelisting all digits, like this:
# Tesseract 4+ spells the page segmentation flag --psm; 3.x used -psm
pytesseract.image_to_string(question_img, config="--psm 6 -c tessedit_char_whitelist=0123456789.")
This did not work for me, however, so I ended up using OpenCV's k-nearest neighbours (kNN) classifier. This does mean you need to know where each character is located, though. First I cropped some images of the characters I wanted to recognize and added those flattened detections to a temporary file:
frame[y:y + h, x:x + w].copy().flatten()
After labeling all those detections, I trained on them using the previously mentioned kNN:
import cv2

# data: float32 array with one flattened crop per row; labels: one label per row
network = cv2.ml.KNearest_create()
network.train(data, cv2.ml.ROW_SAMPLE, labels)
network.save('pattern')
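In case it helps, here is a rough sketch of how the `data` and `labels` arrays could be assembled from the labeled crops. The `build_training_set` helper and the `samples` list are made up for illustration, and it assumes every crop has the same width and height so that all flattened vectors have the same length:

```python
import numpy as np

def build_training_set(samples):
    """Stack labeled crops into a (data, labels) pair of float32 arrays.

    `samples` is a hypothetical list of (crop, label) pairs, where every crop
    is a 2-D uint8 array of identical shape and label is an int (the digit).
    """
    # One flattened crop per row, as required by cv2.ml.ROW_SAMPLE.
    data = np.array([crop.flatten() for crop, _ in samples], dtype=np.float32)
    # One label per row, stored as a column vector.
    labels = np.array([[label] for _, label in samples], dtype=np.float32)
    return data, labels

# Two fake 4x3 "crops" standing in for real character images.
samples = [
    (np.zeros((4, 3), dtype=np.uint8), 0),
    (np.full((4, 3), 255, dtype=np.uint8), 1),
]
data, labels = build_training_set(samples)
```

If your crops differ in size, you would need to resize them to a common shape before flattening.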
Now all characters can be classified as follows:
chars = [
    frame[y1:y1 + h, x1:x1 + w].copy().flatten(),  # char 1
    frame[y2:y2 + h, x2:x2 + w].copy().flatten(),  # char 2
    frame[yn:yn + h, xn:xn + w].copy().flatten(),  # char n
]
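import numpy as np

output = ''
# KNearest_load exists in recent OpenCV releases; on older builds, retrain instead of loading
network = cv2.ml.KNearest_load('pattern')
for char in chars:
    # findNearest expects a 2-D float32 array with one sample per row
    sample = char.astype(np.float32).reshape(1, -1)
    ret, results, neighbours, dist = network.findNearest(sample, 3)
    # append the predicted label (assumed to be the digit value) instead of overwriting
    output += '{0}'.format(int(results[0][0]))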
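If it is unclear what the `3` in `findNearest` does: it is the number of neighbours that vote on the label. Below is a small pure-Python sketch of that idea. It is only meant to illustrate the voting, it is not how OpenCV implements it, and the `nearest_label` function and the toy vectors are invented for the example:

```python
from collections import Counter

def nearest_label(train, sample, k=3):
    """Return the majority label among the k training vectors closest to sample.

    `train` is a list of (vector, label) pairs; distance is squared Euclidean.
    """
    def dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))

    # Keep the k training pairs with the smallest distance to the sample.
    neighbours = sorted(train, key=lambda pair: dist(pair[0], sample))[:k]
    # Let those neighbours vote and return the most common label.
    votes = Counter(label for _, label in neighbours)
    return votes.most_common(1)[0][0]

# Toy 4-pixel "images": dark ones labeled 0, bright ones labeled 1.
train = [
    ([0, 0, 0, 0], 0), ([10, 0, 5, 0], 0),
    ([255, 255, 255, 255], 1), ([250, 240, 255, 245], 1), ([245, 255, 250, 250], 1),
]
digit = nearest_label(train, [240, 250, 245, 255])
```

A larger k makes the result less sensitive to a single badly labeled sample, at the cost of needing more training samples per character.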
After this you can just run your regex on the resulting string. Training and labeling took me only about 2 hours in total, so it should be quite doable.
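I don't know what your regex looks like, so purely as an illustration: assuming the recognized string contains whitespace-separated numbers with an optional decimal part, extracting them could look roughly like this. Both the `output` value and the pattern are placeholders:

```python
import re

# Hypothetical recognized string; in practice this comes from the kNN loop.
output = "12.5 7 3.14"

# Match runs of digits, optionally followed by a dot and more digits.
numbers = re.findall(r"\d+(?:\.\d+)?", output)
```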