OCR doesn't work well on large images (with much text) - Google Cloud Vision API

Question

We noticed that Google Vision API doesn't work well if an image has a lot of text. It returns 'strange' results.

Here is an example:

https://www.dropbox.com/s/vhqxxwgj4stvfc9/screenwithproblem.jpg?dl=0 returns something like this: https://www.dropbox.com/s/r3gkn38rw36agvs/Screenshot%202016-11-30%2011.26.20.jpg?dl=0

If we send just a part of that image, the results are fine. This can also be checked on the API demo page (cloud.google.com/vision).

We tried different images and got the same problem.

Can you advise us whether we are doing something wrong or whether this is a problem on Google's side?

Thank you in advance!

This does not sound like a programming question. Why don't you just write to Google support? — Tomato, Nov 30 '16 at 17:52
I thought I missed something in the API code. I will try to contact Google Support. Thank you. — Nimbus Web, Nov 30 '16 at 22:41
While I understand that this doesn't change StackOverflow's policies, the very first "support method" listed on the Google Vision site is, "Ask a question about Google Cloud Vision API on Stack Overflow. Please use the tag google-cloud-vision for questions about Cloud Vision API. This tag not only receives responses from the Stack Overflow community, but also from Google engineers, who monitor the tag and offer unofficial support." — MustModify, Dec 02 '16 at 23:01

score 0 · Answer 1 · answered Oct 31 '18 at 09:07

I have noticed some of these same "strange results" in documents, particularly in lower-quality areas where the print is faded or blurred. In some of these cases, the API seems to be guessing an incorrect language for the text.

Each page of your result should tell you what percent of the page was detected as certain languages.

"property": {
  "detectedLanguages": [
    {
      "languageCode": "en",
      "confidence": 0.82
    },
    {
      "languageCode": "it",
      "confidence": 0.08
    },
    {
      "languageCode": "es",
      "confidence": 0.07
    }
  ]
}
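
To check this on your own results, here is a minimal Python sketch that reads the per-page language breakdown from a parsed JSON response. It assumes `response` is one entry of the `responses` array returned for a DOCUMENT_TEXT_DETECTION request, with pages under `fullTextAnnotation.pages`. The helper names `page_languages` and `unexpected_languages` and the 0.05 threshold are my own choices for illustration, not part of the API.

```python
from typing import Dict, List, Set, Tuple


def page_languages(response: Dict) -> List[List[Tuple[str, float]]]:
    """For each page, return (languageCode, confidence) pairs, highest confidence first."""
    pages = response.get("fullTextAnnotation", {}).get("pages", [])
    result = []
    for page in pages:
        langs = page.get("property", {}).get("detectedLanguages", [])
        pairs = [(lang["languageCode"], lang.get("confidence", 0.0)) for lang in langs]
        result.append(sorted(pairs, key=lambda pair: pair[1], reverse=True))
    return result


def unexpected_languages(
    response: Dict, expected: Set[str], threshold: float = 0.05
) -> List[Tuple[int, str, float]]:
    """List (page index, languageCode, confidence) for languages you did not expect.

    The threshold is an arbitrary cut-off (an assumption) to ignore tiny fractions.
    """
    hits = []
    for index, langs in enumerate(page_languages(response)):
        for code, confidence in langs:
            if code not in expected and confidence >= threshold:
                hits.append((index, code, confidence))
    return hits


# Example using the values from the fragment above.
sample = {
    "fullTextAnnotation": {
        "pages": [
            {
                "property": {
                    "detectedLanguages": [
                        {"languageCode": "en", "confidence": 0.82},
                        {"languageCode": "it", "confidence": 0.08},
                        {"languageCode": "es", "confidence": 0.07},
                    ]
                }
            }
        ]
    }
}
print(unexpected_languages(sample, {"en"}))
```

For a document you know to be English only, any entries this prints (here `it` and `es`) point to pages where the language guess may be hurting the OCR output.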

If this is the case here, you may want to pass a predefined list of languages (or a single language, if known) in the languageHints field of the request's imageContext, to reduce the number of erroneous languages detected. (https://cloud.google.com/nodejs/docs/reference/vision/0.22.x/google.cloud.vision.v1p1beta1#.AnnotateImageRequest)
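
As a rough sketch of what that looks like, the following Python builds the JSON body for the REST `images:annotate` method with a language hint set. The function name `build_ocr_request` is hypothetical, and I am assuming you send the image inline as base64 and use DOCUMENT_TEXT_DETECTION; adjust both to match your setup.

```python
import base64
import json
from typing import Dict, List


def build_ocr_request(image_bytes: bytes, language_hints: List[str]) -> Dict:
    """Build an images:annotate request body that restricts OCR language guessing."""
    return {
        "requests": [
            {
                # Inline image content must be base64-encoded in the JSON body.
                "image": {"content": base64.b64encode(image_bytes).decode("ascii")},
                "features": [{"type": "DOCUMENT_TEXT_DETECTION"}],
                # Language hints go in imageContext, not in the feature itself.
                "imageContext": {"languageHints": list(language_hints)},
            }
        ]
    }


# Placeholder bytes stand in for the real file contents (assumption for the demo).
body = build_ocr_request(b"placeholder image bytes", ["en"])
payload = json.dumps(body)
```

The resulting `payload` would be sent as the body of a POST to https://vision.googleapis.com/v1/images:annotate with your usual credentials. Comparing the detectedLanguages output with and without the hint should show whether the language guess was the cause.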
