
I'm trying to detect handwritten dates using the Google Vision API. Do you know if it is possible to force it to detect dates (DD/MM/YYYY), or at least numbers only, to increase reliability?

The function I use takes an image as an np.array as input:

    import cv2 as cv
    from google.cloud import vision_v1p3beta1 as vision


    def detect_handwritten_text(img):
        """Recognizes characters using the Google Cloud Vision API.

        Args:
            img (np.array): The image on which to apply the OCR.

        Returns:
            The recognized content of img as a string.
        """
        client = vision.ImageAnnotatorClient()

        # Transform np.array image format into Vision API readable byte format
        success, encoded_image = cv.imencode('.png', img)
        content = encoded_image.tobytes()

        # Configure client to detect handwriting and load picture
        image = vision.types.Image(content=content)
        image_context = vision.types.ImageContext(language_hints=['en-t-i0-handwrit'])

        response = client.document_text_detection(image=image, image_context=image_context)
        return response.full_text_annotation.text

Slyder

1 Answer


After calling ImageAnnotatorClient.DetectDocumentText on your image, you can iterate over the blocks and the words inside each block, and match a regular expression against each word to find dates and numbers.
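A minimal Python sketch of that idea, applied to the `full_text_annotation` returned by `document_text_detection` in the question. The helper names `iter_words` and `find_dates` are hypothetical, and the sketch assumes each date arrives as a single word; if the API splits a date at the slashes, matching against the full text instead may work better.

```python
import re

# Matches DD/MM/YYYY with a one- or two-digit day and month.
DATE_RE = re.compile(r"^\d{1,2}/\d{1,2}/\d{4}$")


def iter_words(annotation):
    """Yield each word of a full_text_annotation as a plain string."""
    for page in annotation.pages:
        for block in page.blocks:
            for paragraph in block.paragraphs:
                for word in paragraph.words:
                    # A word is a sequence of symbols, each holding its text.
                    yield "".join(symbol.text for symbol in word.symbols)


def find_dates(annotation):
    """Return the words that look like DD/MM/YYYY dates."""
    return [w for w in iter_words(annotation) if DATE_RE.match(w)]
```

With the function from the question, this would be called as `find_dates(response.full_text_annotation)` instead of returning the raw text.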

  • The question is about forcing Google Vision to find digits. For example, if the normal Google Vision OCR result is `1b/O8/201B`, then after forcing digits it should return `16/08/2018`. – Arun Gowda Jun 04 '19 at 02:50
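One possible workaround for the case in this comment is to correct the OCR output afterwards rather than constrain the API itself. The sketch below is plain post-processing, not a Vision API feature. The confusion table and the name `normalize_date` are assumptions chosen for illustration and would need tuning against real handwriting samples.

```python
import re
from datetime import datetime

# Assumed mapping of letters that OCR commonly confuses with digits.
CONFUSIONS = str.maketrans({
    "b": "6", "B": "8", "O": "0", "o": "0",
    "l": "1", "I": "1", "S": "5", "Z": "2",
})

DATE_RE = re.compile(r"^\d{1,2}/\d{1,2}/\d{4}$")


def normalize_date(token):
    """Map look-alike letters to digits; return a DD/MM/YYYY string or None."""
    candidate = token.translate(CONFUSIONS)
    if not DATE_RE.match(candidate):
        return None
    try:
        # Reject impossible dates such as 99/99/2018.
        datetime.strptime(candidate, "%d/%m/%Y")
    except ValueError:
        return None
    return candidate


print(normalize_date("1b/O8/201B"))  # 16/08/2018
```

Validating with `datetime.strptime` after the substitution keeps the mapping from turning arbitrary text into a plausible-looking but invalid date.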