As a business rule, △ means minus ('-'). How can I get Tesseract to read the following images with the expected values?

  • Input image 1 (expected value is -74,523)

enter image description here

  • Input image 2 (expected value is -1,794,306)

enter image description here

  • Actual result
$ tesseract 1.png stdout -l eng --psm 4
£74 523

$ tesseract 2.png stdout -l eng --psm 4
a 1,794,306
  • Version
$ tesseract -v
tesseract 4.1.1-rc2-22-g08899

Currently, I convert the leading non-numeric character to '-' programmatically. However, this does not always work, because △ is sometimes read as a digit, as shown below.

// Example. △ is read as '4'
$ tesseract x.png stdout -l eng --psm 4
474 523
zono
  • With the data string returned from pytesseract, you can perform string replacement. So `replace()` the character in the returned string – nathancy Feb 25 '20 at 22:13
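
The replacement suggested in the comment above might look like the following minimal sketch. `normalize_sign` is a hypothetical helper, not part of pytesseract. It assumes that anything before the first digit is a misread △ and that a space between digits stands for a thousands separator. The last input shows the limitation described in the question: when △ is read as a digit, there is nothing left to replace.

```python
import re

def normalize_sign(ocr_line: str) -> str:
    """Replace a leading run of non-digit characters with '-' (hypothetical helper)."""
    text = ocr_line.strip()
    # Assumption: anything before the first digit is a misread triangle.
    match = re.match(r"^\D+(?=\d)", text)
    if match:
        text = "-" + text[match.end():]
    # Assumption: a space between two digits is a lost thousands separator.
    return re.sub(r"(?<=\d) (?=\d)", ",", text)

# The three OCR outputs quoted in the question.
for raw in ["£74 523", "a 1,794,306", "474 523"]:
    print(repr(raw), "->", normalize_sign(raw))
```

The first two lines come out as `-74,523` and `-1,794,306`, but the third becomes `474,523`, so string replacement alone cannot recover the sign in that case.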

1 Answer

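OpenCV's template matching can find every occurrence of a search image (here, the △ glyph) inside the input image. Each match can then be painted over and replaced with a drawn minus sign before the image is passed to Tesseract, as shown below.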

  • Search image

enter image description here

  • Input

enter image description here

  • Output

enter image description here

  • Code
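import cv2 as cv
import numpy as np

# Load the input in color (for editing) and in grayscale (for matching).
img_rgb = cv.imread('images/input.png')
img_gray = cv.cvtColor(img_rgb, cv.COLOR_BGR2GRAY)
# The search image is loaded directly as grayscale.
template = cv.imread('images/minus.png', cv.IMREAD_GRAYSCALE)
h, w = template.shape
# Normalized correlation score for every possible template position.
res = cv.matchTemplate(img_gray, template, cv.TM_CCOEFF_NORMED)
threshold = 0.8
loc = np.where(res >= threshold)
# np.where returns (rows, cols); reverse the order to iterate over (x, y) points.
for pt in zip(*loc[::-1]):
    # Paint the match white, then draw a black horizontal line as the minus sign.
    cv.rectangle(img_rgb, pt, (pt[0] + w, pt[1] + h), (255, 255, 255), -1)
    cv.line(img_rgb, (pt[0], pt[1] + h // 2),
            (pt[0] + w, pt[1] + h // 2), (0, 0, 0), 2)
cv.imwrite('out/detected-minus.png', img_rgb)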
  • Run
$ tesseract out/detected-minus.png stdout -l eng --psm 4
— 2,196,193
  • Reference

https://docs.opencv.org/master/d4/dc6/tutorial_py_template_matching.html
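
Tesseract may return a dash variant for the drawn line (an em dash in the run above) rather than an ASCII hyphen. A minimal sketch of converting such a line to a signed integer follows. `parse_signed_amount` and the `DASHES` list are illustrative assumptions, not part of OpenCV or Tesseract, and the list may need extending for other outputs.

```python
import re

# Assumed set of dash-like characters OCR might return for the drawn line:
# ASCII hyphen, Unicode hyphens and dashes (U+2010 to U+2015), minus sign (U+2212).
DASHES = "-\u2010\u2011\u2012\u2013\u2014\u2015\u2212"

def parse_signed_amount(ocr_line: str) -> int:
    """Convert an OCR line such as '\u2014 2,196,193' to a signed int (hypothetical helper)."""
    text = ocr_line.strip()
    # The value is negative only if the line starts with a dash-like character.
    negative = bool(text) and text[0] in DASHES
    # Drop separators, spaces and the sign; keep the digits only.
    digits = re.sub(r"\D", "", text)
    if not digits:
        raise ValueError(f"no digits found in {ocr_line!r}")
    value = int(digits)
    return -value if negative else value

print(parse_signed_amount("\u2014 2,196,193"))
print(parse_signed_amount("8,366"))
```

Because the sign is decided from the first character only, a △ that was missed by template matching and read as a digit would still produce a positive value, so the match threshold is worth checking against real samples.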

zono