OCR confidence score from Google Vision API

Question

I am using Google Vision OCR to extract text from images in Python, using the code snippet below.
However, the confidence score is always 0.0, which is definitely incorrect.

How can I extract the OCR confidence score for an individual character or word from the Google response?

 import cv2
 from google.cloud import vision
 from google.cloud.vision import types

 client = vision.ImageAnnotatorClient()

 # Re-encode the image as JPEG bytes; tobytes() replaces the deprecated tostring()
 content = cv2.imencode('.jpg', cv2.imread(file_name))[1].tobytes()
 img = types.Image(content=content)
 response1 = client.text_detection(image=img, image_context={"language_hints": ["en"]})
 for x in response1.text_annotations:
     print(x)
     print(f'confidence:{x.confidence}')

Example output for one iteration:

description: "Date:"
bounding_poly {
  vertices {
    x: 127
    y: 11
  }
  vertices {
    x: 181
    y: 10
  }
  vertices {
    x: 181
    y: 29
  }
  vertices {
    x: 127
    y: 30
  }
}

confidence:0.0

Have you tried posting the image to the demo API? Do you get different results? Removing the language hint might also have some impact. — InUser, Jul 02 '20 at 03:41
Demo API? Can you elaborate? It OCRed perfectly; even the number of spaces and the case of each character were correct. It's the confidence of zero that doesn't add up. — letsBeePolite, Jul 02 '20 at 04:03
Try it here: https://cloud.google.com/vision. Do you get the same confidence? — InUser, Jul 02 '20 at 04:11

score 2 · Answer 1 · answered Jul 16 '20 at 12:12

I managed to reproduce your issue. I used the following function and obtained confidence 0.0 for all items.

from google.cloud import vision

def detect_text_uri(uri):
    client = vision.ImageAnnotatorClient()
    image = vision.types.Image()
    image.source.image_uri = uri

    response = client.text_detection(image=image)
    texts = response.text_annotations
    print('Texts:')

    for text in texts:
        print('\n"{}"'.format(text.description))

        vertices = (['({},{})'.format(vertex.x, vertex.y)
                    for vertex in text.bounding_poly.vertices])

        print('bounds: {}'.format(','.join(vertices)))
        print("confidence: {}".format(text.confidence))

    if response.error.message:
        raise Exception(
            '{}\nFor more info on error messages, check: '
            'https://cloud.google.com/apis/design/errors'.format(
                response.error.message))

However, when I used the same image with the "Try the API" option in the documentation, I obtained nonzero confidences. The same thing happened when detecting text from a local image.

One would expect the confidences to have the same values with both methods. I've opened an issue in the issue tracker; check it here.
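To check quickly whether a response carries any usable confidence values, as in the reproduction above, a small helper can count how many annotations report a nonzero confidence. This is only a sketch: `summarize_confidences` and the `fake_annotations` stand-ins are hypothetical names introduced for illustration, and the helper assumes only that each annotation exposes a numeric `confidence` attribute, as `response.text_annotations` does in the code above.

```python
from types import SimpleNamespace


def summarize_confidences(annotations):
    """Count the annotations and how many of them report a nonzero confidence."""
    total = 0
    nonzero = 0
    for annotation in annotations:
        total += 1
        if annotation.confidence > 0.0:
            nonzero += 1
    return {"total": total, "nonzero": nonzero}


# Hypothetical stand-ins for response.text_annotations, for illustration only.
fake_annotations = [
    SimpleNamespace(description="Date:", confidence=0.0),
    SimpleNamespace(description="2020", confidence=0.0),
]

summary = summarize_confidences(fake_annotations)
print(summary)
```

With a real response, passing `response.text_annotations` in place of `fake_annotations` would show at a glance whether every confidence came back as zero.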

Note, ["Try the API"](https://cloud.google.com/vision/docs/drag-and-drop) seems to use the `DOCUMENT_TEXT_DETECTION` feature, not `TEXT_DETECTION`. Using [`document_text_detection()`](https://googleapis.dev/python/vision/1.0.0/gapic/v1/api.html#google.cloud.vision_v1.ImageAnnotatorClient.document_text_detection) instead of the [`text_detection()`](https://googleapis.dev/python/vision/1.0.0/gapic/v1/api.html#google.cloud.vision_v1.ImageAnnotatorClient.text_detection) seems to keep the confidence when calling from code. — Klesun, Aug 02 '21 at 09:49

score 2 · Answer 2 · edited Aug 02 '21 at 09:47

Working code that retrieves the correct confidence values from the Google OCR response.

(using document_text_detection() instead of text_detection())

def detect_document(path):
    """Detects document features in an image."""
    from google.cloud import vision
    import io
    client = vision.ImageAnnotatorClient()

    # [START vision_python_migration_document_text_detection]
    with io.open(path, 'rb') as image_file:
        content = image_file.read()

    image = vision.types.Image(content=content)

    response = client.document_text_detection(image=image)

    for page in response.full_text_annotation.pages:
        for block in page.blocks:
            print('\nBlock confidence: {}\n'.format(block.confidence))

            for paragraph in block.paragraphs:
                print('Paragraph confidence: {}'.format(
                    paragraph.confidence))

                for word in paragraph.words:
                    word_text = ''.join([
                        symbol.text for symbol in word.symbols
                    ])
                    print('Word text: {} (confidence: {})'.format(
                        word_text, word.confidence))

                    for symbol in word.symbols:
                        print('\tSymbol: {} (confidence: {})'.format(
                            symbol.text, symbol.confidence))

    if response.error.message:
        raise Exception(
            '{}\nFor more info on error messages, check: '
            'https://cloud.google.com/apis/design/errors'.format(
                response.error.message))
    # [END vision_python_migration_document_text_detection]

# add your own path
path = "gocr_vision.png"
detect_document(path)
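Since the question asks for per-word scores, the same traversal can return the values instead of printing them. The sketch below is not part of the original answer: `collect_word_confidences`, `low_confidence_words`, the 0.8 threshold, and the `fake_annotation` stand-in are all hypothetical and chosen for illustration. It assumes only the pages > blocks > paragraphs > words > symbols layout that the code above iterates over.

```python
from types import SimpleNamespace


def collect_word_confidences(full_text_annotation):
    """Return (word_text, confidence) pairs from a full_text_annotation-like object."""
    pairs = []
    for page in full_text_annotation.pages:
        for block in page.blocks:
            for paragraph in block.paragraphs:
                for word in paragraph.words:
                    # A word's text is the concatenation of its symbols.
                    text = ''.join(symbol.text for symbol in word.symbols)
                    pairs.append((text, word.confidence))
    return pairs


def low_confidence_words(pairs, threshold=0.8):
    """Keep only the words whose confidence is below the (arbitrary) threshold."""
    return [(text, conf) for text, conf in pairs if conf < threshold]


# Hypothetical stand-in for response.full_text_annotation, for illustration only.
def _word(text, confidence):
    symbols = [SimpleNamespace(text=ch, confidence=confidence) for ch in text]
    return SimpleNamespace(symbols=symbols, confidence=confidence)


fake_annotation = SimpleNamespace(pages=[SimpleNamespace(blocks=[SimpleNamespace(
    paragraphs=[SimpleNamespace(words=[_word('Date:', 0.99), _word('O2', 0.41)])])])])

pairs = collect_word_confidences(fake_annotation)
print(pairs)
print(low_confidence_words(pairs))
```

With a real response, `collect_word_confidences(response.full_text_annotation)` would take the place of the stand-in; the threshold would need tuning for the images at hand.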

Changing the method is not a solution. document_text_detection doesn't have the same goal as text_detection. — hzitoun, Sep 11 '20 at 12:43
