10

I just tested the Google Cloud Vision API to read the text, if any exists, in an image.

So far I have installed Maven and the Redis server, following the instructions on this page:

https://github.com/GoogleCloudPlatform/cloud-vision/tree/master/java/text

So far I have been able to test it with .jpg files. Is it possible to do the same with TIFF or PDF files?

I am using the following command:

java -cp target/text-1.0-SNAPSHOT-jar-with-dependencies.jar com.google.cloud.vision.samples.text.TextApp ../../data/text/

Inside the text directory, I have the files in jpg format.

I don't know how to read the converted file, so I just run the following command:

java -cp target/text-1.0-SNAPSHOT-jar-with-dependencies.jar com.google.cloud.vision.samples.text.TextApp

I then get a prompt to enter a word or phrase to search for in the converted files. Is there a way to see the whole transformed document?

Thanks!

Christian Salvador
    According to the docs, only image format types are allowed https://cloud.google.com/vision/docs/image-best-practices#image_types – luchosrock May 23 '16 at 18:54
    @luchosrock The link is now https://cloud.google.com/vision/docs/supported-files – jtlz2 Jul 28 '17 at 07:40
  • Is there any other service which does this for PDF files? I looked into ocr.space but their team isn't very responsive. – d_void Apr 05 '18 at 20:31

4 Answers

20

On April 6, 2018, support for PDF and TIFF files in document text detection was added to Google Cloud Vision API (see Release Notes).

According to the documentation:

  • The Vision API can detect and transcribe text from PDF and TIFF files stored in Google Cloud Storage.

  • Document text detection from PDF and TIFF must be requested using the asyncBatchAnnotate function, which performs an asynchronous request and provides its status using the operations resources.

  • Output from a PDF/TIFF request is written to a JSON file created in the specified Google Cloud Storage bucket.


Example:

1) Upload a file to your Google Cloud Storage bucket


2) Make a POST request to perform PDF/TIFF document text detection

Request:

POST https://vision.googleapis.com/v1p2beta1/files:asyncBatchAnnotate
Authorization: Bearer <your access token>

{
  "requests":[
    {
      "inputConfig": {
        "gcsSource": {
          "uri": "gs://<your bucket name>/input.pdf"
        },
        "mimeType": "application/pdf"
      },
      "features": [
        {
          "type": "DOCUMENT_TEXT_DETECTION"
        }
      ],
      "outputConfig": {
        "gcsDestination": {
          "uri": "gs://<your bucket name>/output/"
        },
        "batchSize": 1
      }
    }
  ]
}

Response:

{
  "name": "operations/9b1f9d773d216406"
}
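
The request in step 2 could be assembled in Python roughly as follows. This is a minimal sketch using only the standard library; the function name `build_async_request` and its parameters are hypothetical, and the bucket name and access token are placeholders you would have to supply. It only builds the request, so nothing is sent until you call `urlopen` yourself.

```python
import json
import urllib.request

# Endpoint copied from the request shown above.
ENDPOINT = "https://vision.googleapis.com/v1p2beta1/files:asyncBatchAnnotate"


def build_async_request(bucket, input_name, output_prefix, access_token,
                        mime_type="application/pdf", batch_size=1):
    """Build (but do not send) the POST request for PDF/TIFF text detection.

    All argument names are illustrative; they are not part of the API.
    """
    body = {
        "requests": [
            {
                "inputConfig": {
                    "gcsSource": {"uri": f"gs://{bucket}/{input_name}"},
                    "mimeType": mime_type,
                },
                "features": [{"type": "DOCUMENT_TEXT_DETECTION"}],
                "outputConfig": {
                    "gcsDestination": {"uri": f"gs://{bucket}/{output_prefix}"},
                    "batchSize": batch_size,
                },
            }
        ]
    }
    return urllib.request.Request(
        ENDPOINT,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {access_token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


if __name__ == "__main__":
    req = build_async_request("my-bucket", "input.pdf", "output/", "ACCESS_TOKEN")
    print(req.get_method(), req.full_url)
    print(req.data.decode("utf-8"))
    # Sending requires a valid token and an existing bucket:
    # with urllib.request.urlopen(req) as resp:
    #     print(json.load(resp)["name"])  # the operation name
```

For a TIFF input, passing `mime_type="image/tiff"` should be the only change needed.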

3) Make a GET request to check if document text detection is done

Request:

GET https://vision.googleapis.com/v1/operations/9b1f9d773d216406
Authorization: Bearer <your access token>

Response:

{
    "name": "operations/9b1f9d773d216406",
    "metadata": {
        "@type": "type.googleapis.com/google.cloud.vision.v1p2beta1.OperationMetadata",
        "state": "RUNNING",
        "updateTime": "2018-06-17T20:18:09.117787733Z"
    },
    "done": true,
    "response": {
        "@type": "type.googleapis.com/google.cloud.vision.v1p2beta1.AsyncBatchAnnotateFilesResponse",
        "responses": [
            {
                "outputConfig": {
                    "gcsDestination": {
                        "uri": "gs://<your bucket name>/output/"
                    },
                    "batchSize": 1
                }
            }
        ]
    }
}
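
Checking the operation can be scripted along these lines. This is a sketch, not an official client: `operation_output_uris` and `wait_for_operation` are hypothetical helper names, and `fetch` stands for whatever callable you write to perform the GET above and return the response body as a string. The sketch relies on the `done` field rather than on `metadata.state`.

```python
import json
import time


def operation_output_uris(operation_json):
    """Return the output URIs once the operation is done, else None."""
    op = json.loads(operation_json)
    if not op.get("done"):
        return None  # not finished yet; the caller should poll again
    if "error" in op:
        # A finished long-running operation carries either "error" or "response".
        raise RuntimeError(f"operation failed: {op['error']}")
    responses = op.get("response", {}).get("responses", [])
    return [r["outputConfig"]["gcsDestination"]["uri"] for r in responses]


def wait_for_operation(fetch, interval_seconds=5.0, max_attempts=60):
    """Call fetch() repeatedly until the operation reports done.

    fetch is an assumed callable that returns the GET response body as text.
    """
    for _ in range(max_attempts):
        uris = operation_output_uris(fetch())
        if uris is not None:
            return uris
        time.sleep(interval_seconds)
    raise TimeoutError("operation did not finish in time")


if __name__ == "__main__":
    sample = json.dumps({
        "name": "operations/9b1f9d773d216406",
        "done": True,
        "response": {"responses": [{"outputConfig": {
            "gcsDestination": {"uri": "gs://my-bucket/output/"},
            "batchSize": 1}}]},
    })
    print(wait_for_operation(lambda: sample, interval_seconds=0))
```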

4) Check the results in the specified Google Cloud Storage folder
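
Once an output JSON file has been downloaded from the bucket, the recognized text can be pulled out with a few lines of Python. The sketch below assumes the file follows the documented layout, with one entry per page under `responses`, the text in `fullTextAnnotation.text`, and the page number in `context.pageNumber`; the function name `pages_text` is hypothetical.

```python
import json


def pages_text(output_json):
    """Return a list of (page_number, text) tuples from one output file.

    Assumes the layout described in the lead-in; missing fields are tolerated.
    """
    doc = json.loads(output_json)
    pages = []
    for resp in doc.get("responses", []):
        page_number = resp.get("context", {}).get("pageNumber")
        text = resp.get("fullTextAnnotation", {}).get("text", "")
        pages.append((page_number, text))
    return pages


if __name__ == "__main__":
    # A tiny hand-written stand-in for a real output file.
    sample = json.dumps({
        "responses": [
            {
                "fullTextAnnotation": {"text": "Hello from page one\n"},
                "context": {"uri": "gs://my-bucket/input.pdf", "pageNumber": 1},
            }
        ]
    })
    for number, text in pages_text(sample):
        print(f"--- page {number} ---")
        print(text)
```

With `batchSize` set to 1 as in the request above, each output file should hold a single page, so the whole document is the concatenation of the texts across all files in the output folder.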


Milan Hlinák
  • Is it only available to use with GCS (Google Cloud Storage)? The documents (PDF files) that I need to convert to OCR are in Amazon S3... – roadev Oct 01 '18 at 21:48
    I think there is no other way than Google Cloud Storage. – Milan Hlinák Oct 02 '18 at 05:08
  • How to pass the request in reactjs? because I don't think this works with API's – Harry Oct 30 '19 at 09:06
  • How do I read the detected text from the JSON file that is created in Google Cloud Storage? – Yash Dec 10 '19 at 17:34
  • Your JSON file will look like this one: https://jsonblob.com/0e32ae5e-e7c4-11ea-bb30-b16bcb127b1d. For more information please see https://cloud.google.com/vision/docs/pdf and https://cloud.google.com/vision/docs/reference/rest. – Milan Hlinák Aug 26 '20 at 17:47
11

In 2016, the PDF and TIFF formats were not supported by Cloud Vision.

The accepted formats were (taken from the docs):

  • JPEG
  • PNG8
  • PNG24
  • GIF
  • Animated GIF (first frame only)
  • BMP
  • WEBP
  • RAW
  • ICO

Support for PDF and TIFF has since been added.

Docs for JPG:

https://cloud.google.com/vision/docs/ocr

Docs for PDF:

https://cloud.google.com/vision/docs/pdf

Daniel
uzerzero
8

https://cloud.google.com/vision/docs/pdf

I know this question is old, but Google Vision has now released support for PDF!

vokuheila
1

Google Cloud Vision text detection is now also available for PDF files in a synchronous mode. It detects the text immediately and doesn't require the file to be in Google Cloud Storage; the file can be sent as base64-encoded content in the request.

HTTP method and URL:

POST https://vision.googleapis.com/v1/files:annotate

Request JSON body:

{
  "requests": [
    {
      "inputConfig": {
        "content": "base64-encoded-file",
        "mimeType": "application/pdf"
      },
      "features": [
        {
          "type": "DOCUMENT_TEXT_DETECTION"
        }
      ],
      "pages": [
        1,2,3,4,5
      ]
    }
  ]
}
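
The `base64-encoded-file` placeholder above can be produced from a local file as in the following sketch. It uses only the standard library; `build_sync_request_body` is a hypothetical helper name and `input.pdf` is a placeholder path. As far as I know the synchronous endpoint only processes a handful of pages per request, so the `pages` list is kept short here.

```python
import base64
import json


def build_sync_request_body(pdf_bytes, pages=(1, 2, 3, 4, 5)):
    """Build the JSON body for files:annotate from raw PDF bytes."""
    return {
        "requests": [
            {
                "inputConfig": {
                    # The API expects the file content as a base64 string.
                    "content": base64.b64encode(pdf_bytes).decode("ascii"),
                    "mimeType": "application/pdf",
                },
                "features": [{"type": "DOCUMENT_TEXT_DETECTION"}],
                "pages": list(pages),
            }
        ]
    }


if __name__ == "__main__":
    # Placeholder bytes; in practice: open("input.pdf", "rb").read()
    body = build_sync_request_body(b"%PDF-1.4 placeholder", pages=(1, 2))
    print(json.dumps(body, indent=2))
```

The resulting body would be sent with the same `Authorization: Bearer <your access token>` header as in the asynchronous example.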

For more information, see https://cloud.google.com/vision/docs/file-small-batch