Google Vision OCR returning similar character with different ascii values

Question

I have recently been working on extracting text from images using the Google Cloud Vision API.

The results are very good, but I am stuck at one step: I need to compare the extracted text against expected strings, and some extracted characters have different code points than the ones they visually resemble, so the comparison fails.

For example:

for i in [77, 924, 1018, 1052]:
    print(chr(i))

The above code displays characters that all look like the English letter 'M'. However, they are different characters with different code points (only 77 is within ASCII; the others are Unicode), so comparing any of them with 'M' returns False. The same issue occurs with several other characters.
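To make the mismatch concrete, here is a minimal sketch that prints the Unicode name of each code point and then folds the look-alikes to the Latin letter before comparing. The `HOMOGLYPHS` table and the `to_latin` helper are hypothetical names introduced for illustration, and the table covers only the three characters from the example; a real table would need many more entries. Note that `unicodedata.normalize` (NFKC) does not help here, since these are distinct letters from other scripts rather than compatibility variants.

```python
import unicodedata

# Hypothetical, hand-picked mapping of look-alike code points to Latin letters.
# Only the characters from the example above are covered.
HOMOGLYPHS = {
    0x039C: "M",  # GREEK CAPITAL LETTER MU (924)
    0x03FA: "M",  # GREEK CAPITAL LETTER SAN (1018)
    0x041C: "M",  # CYRILLIC CAPITAL LETTER EM (1052)
}


def to_latin(text):
    """Replace known look-alike characters with their Latin counterparts."""
    # str.translate accepts a dict keyed by code point.
    return text.translate(HOMOGLYPHS)


if __name__ == "__main__":
    for cp in [77, 924, 1018, 1052]:
        ch = chr(cp)
        # Show which script each character belongs to, then compare
        # both the raw and the folded character against Latin 'M'.
        print(cp, unicodedata.name(ch), ch == "M", to_latin(ch) == "M")
```

Applying `to_latin` to both the OCR output and the reference string before comparing keeps the comparison symmetric. Characters that are not in the table pass through unchanged.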

Any help or suggestions on how to deal with this would be appreciated.

Google Vision Code for text extraction:

import cv2
import numpy as np
from google.cloud import vision


def detect_text(img_path, x, y, w, h):
    """Run Vision text detection on the (x, y, w, h) crop of the image."""
    client = vision.ImageAnnotatorClient()
    # Read and decode the image file, then crop the region of interest.
    im = cv2.imdecode(np.fromfile(img_path, dtype=np.uint8), cv2.IMREAD_UNCHANGED)[y:y+h, x:x+w]
    # Re-encode the crop as JPEG bytes for the API request.
    _, im_buf_arr = cv2.imencode(".jpg", im)
    content = im_buf_arr.tobytes()
    image = vision.Image(content=content)
    response = client.text_detection(image=image)
    # Check for an API error before using the annotations.
    if response.error.message:
        raise Exception('{}\nFor more info on error messages, check: '
                        'https://cloud.google.com/apis/design/errors'.format(
                            response.error.message))
    data = []
    for text in response.text_annotations:
        data.append('\n"{}"'.format(text.description))
    return data

this will help you -> https://stackoverflow.com/questions/55364164/where-to-use-language-hints-in-google-vision-text-detection-api — InUser, Apr 18 '21 at 11:06
