Cloud Vision API - PDF OCR

Question

I just tested the Google Cloud Vision API to read the text, if any exists, in an image.

So far I have installed Maven and the Redis server, following the instructions on this page:

https://github.com/GoogleCloudPlatform/cloud-vision/tree/master/java/text

So far I have been able to test it with .jpg files. Is it possible to do the same with TIFF or PDF files?

I am using the following command:

java -cp target/text-1.0-SNAPSHOT-jar-with-dependencies.jar com.google.cloud.vision.samples.text.TextApp ../../data/text/

Inside the text directory, I have the files in jpg format.

I don't know how to read the converted files afterwards, so I just run the following command:

java -cp target/text-1.0-SNAPSHOT-jar-with-dependencies.jar com.google.cloud.vision.samples.text.TextApp

I then get a prompt to enter a word or phrase to search for in the converted files. Is there a way to see the whole converted document?

Thanks!

According to the docs, only image format types are allowed https://cloud.google.com/vision/docs/image-best-practices#image_types — luchosrock, May 23 '16 at 18:54
@luchosrock The link is now https://cloud.google.com/vision/docs/supported-files — jtlz2, Jul 28 '17 at 07:40
is there any other service which does this for PDF files ? I looked into ocr.space but their team isn't very responsive. — d_void, Apr 05 '18 at 20:31

Milan Hlinák · Answer 1 · 2020-10-27T07:25:18.770

On April 6, 2018, support for PDF and TIFF files in document text detection was added to Google Cloud Vision API (see Release Notes).

According to the documentation:

The Vision API can detect and transcribe text from PDF and TIFF files stored in Google Cloud Storage.
Document text detection from PDF and TIFF must be requested using the asyncBatchAnnotate function, which performs an asynchronous request and provides its status using the operations resources.
Output from a PDF/TIFF request is written to a JSON file created in the specified Google Cloud Storage bucket.

Example:

1) Upload a file to your Google Cloud Storage

2) Make a POST request to perform PDF/TIFF document text detection

Request:

POST https://vision.googleapis.com/v1p2beta1/files:asyncBatchAnnotate
Authorization: Bearer <your access token>

{
  "requests":[
    {
      "inputConfig": {
        "gcsSource": {
          "uri": "gs://<your bucket name>/input.pdf"
        },
        "mimeType": "application/pdf"
      },
      "features": [
        {
          "type": "DOCUMENT_TEXT_DETECTION"
        }
      ],
      "outputConfig": {
        "gcsDestination": {
          "uri": "gs://<your bucket name>/output/"
        },
        "batchSize": 1
      }
    }
  ]
}

Response:

{
  "name": "operations/9b1f9d773d216406"
}
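The request in step 2 can also be built and sent from code. The following is a minimal sketch using only the Python standard library; the function names `build_async_request` and `start_ocr` are made up for illustration, and the access token is assumed to have been obtained separately.

```python
import json
import urllib.request

# Endpoint taken from the request shown above.
ASYNC_URL = "https://vision.googleapis.com/v1p2beta1/files:asyncBatchAnnotate"


def build_async_request(input_uri, output_uri, mime_type="application/pdf", batch_size=1):
    """Return the JSON body for one asynchronous PDF/TIFF OCR request.

    Hypothetical helper: it only mirrors the JSON structure shown above.
    Use mime_type="image/tiff" for TIFF input.
    """
    return {
        "requests": [
            {
                "inputConfig": {
                    "gcsSource": {"uri": input_uri},
                    "mimeType": mime_type,
                },
                "features": [{"type": "DOCUMENT_TEXT_DETECTION"}],
                "outputConfig": {
                    "gcsDestination": {"uri": output_uri},
                    "batchSize": batch_size,
                },
            }
        ]
    }


def start_ocr(access_token, body, url=ASYNC_URL):
    """POST the body and return the operation name from the response."""
    request = urllib.request.Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": "Bearer " + access_token,
            "Content-Type": "application/json",
        },
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)["name"]
```

Calling `start_ocr(token, build_async_request("gs://<your bucket name>/input.pdf", "gs://<your bucket name>/output/"))` should return an operation name like the one in the response above, which step 3 then polls.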

3) Make a GET request to check if document text detection is done

Request:

GET https://vision.googleapis.com/v1/operations/9b1f9d773d216406
Authorization: Bearer <your access token>

Response:

{
    "name": "operations/9b1f9d773d216406",
    "metadata": {
        "@type": "type.googleapis.com/google.cloud.vision.v1p2beta1.OperationMetadata",
        "state": "RUNNING",
        "updateTime": "2018-06-17T20:18:09.117787733Z"
    },
    "done": true,
    "response": {
        "@type": "type.googleapis.com/google.cloud.vision.v1p2beta1.AsyncBatchAnnotateFilesResponse",
        "responses": [
            {
                "outputConfig": {
                    "gcsDestination": {
                        "uri": "gs://<your bucket name>/output/"
                    },
                    "batchSize": 1
                }
            }
        ]
    }
}
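Because the request is asynchronous, step 3 usually has to be repeated until `done` is true. A possible polling loop is sketched below; `wait_for_operation` and the `fetch_operation` callable are hypothetical names, and `fetch_operation` is assumed to perform the GET request shown above and return the parsed JSON.

```python
import time


def wait_for_operation(fetch_operation, name, interval_seconds=5.0, timeout_seconds=300.0):
    """Poll until the operation reports done, then return its final JSON.

    fetch_operation is a hypothetical callable supplied by the caller: it
    takes the operation name and returns the parsed JSON body of
    GET https://vision.googleapis.com/v1/<name>.
    """
    deadline = time.monotonic() + timeout_seconds
    while True:
        operation = fetch_operation(name)
        if operation.get("done"):
            # A finished long-running operation carries either "response" or "error".
            if "error" in operation:
                raise RuntimeError("OCR operation failed: %s" % operation["error"])
            return operation
        if time.monotonic() >= deadline:
            raise TimeoutError("operation %s did not finish in time" % name)
        time.sleep(interval_seconds)
```

Passing the HTTP call in as a function keeps the loop independent of whichever HTTP client and authentication method you use.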

4) Check the results in the specified Google Cloud Storage folder
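Once an output file has been downloaded from the bucket, the recognized text can be pulled out of it. The sketch below assumes the file has a top-level `responses` list whose entries carry `fullTextAnnotation.text`, one entry per page, as described in the PDF documentation; `extract_text` is a made-up helper name, and the layout is worth checking against your own output.

```python
import json


def extract_text(output_json):
    """Return the recognized text of every page in one output file, in order.

    Assumes the layout {"responses": [{"fullTextAnnotation": {"text": ...}}, ...]}.
    """
    document = json.loads(output_json)
    pages = []
    for response in document.get("responses", []):
        annotation = response.get("fullTextAnnotation")
        if annotation:  # skip entries that have no text annotation
            pages.append(annotation.get("text", ""))
    return "\n".join(pages)
```

With `batchSize` set to 1, each output file holds a single page, so the whole document is the concatenation of `extract_text` over all output files in page order.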

Is it only available to use with GCS (Google Cloud Storage)? The documents (PDF files) that I need to convert to OCR are in Amazon S3... — roadev, Oct 01 '18 at 21:48
How to pass the request in reactjs? because I don't think this works with API's — Harry, Oct 30 '19 at 09:06
how to detect the text in json file that created over Google cloud storage? — Yash, Dec 10 '19 at 17:34
Your JSON file will look like this one: https://jsonblob.com/0e32ae5e-e7c4-11ea-bb30-b16bcb127b1d. For more information please see https://cloud.google.com/vision/docs/pdf and https://cloud.google.com/vision/docs/reference/rest. — Milan Hlinák, Aug 26 '20 at 17:47

score 11 · Accepted Answer · edited Jun 05 '20 at 09:38

In 2016, the PDF and TIFF formats were not supported by Cloud Vision.

The accepted formats were (taken from the docs):

JPEG
PNG8
PNG24
GIF
Animated GIF (first frame only)
BMP
WEBP
RAW
ICO

Support for PDF and TIFF has since been added.

Docs for jpg:

https://cloud.google.com/vision/docs/ocr

Docs for pdf

https://cloud.google.com/vision/docs/pdf

answered May 16 '17 at 12:30 by uzerzero · edited Jun 05 '20 at 09:38 by Daniel

Is there any way to ingest a TIFF or PDF without going via a temp file? – jtlz2 Jul 23 '17 at 19:59
This answer is now outdated since Cloud Vision supports PDFs and TIFFs. – Philip Jul 19 '18 at 12:23
It now supports the pdf for ocr as well. https://cloud.google.com/vision/docs/pdf#vision_text_detection_pdf_gcs-nodejs – Sanjay Jul 22 '19 at 07:23
See https://stackoverflow.com/a/50900104/2598453 – Milan Hlinák Jun 22 '22 at 09:54

score 8 · Answer 3 · answered Apr 07 '18 at 00:00

https://cloud.google.com/vision/docs/pdf

I know this question is old, but Google Vision has now released support for PDF!

answered Apr 07 '18 at 00:00 by vokuheila

score 1 · Answer 4 · answered May 04 '21 at 06:08

Google Cloud Vision text detection is now also available for PDF files in a synchronous mode: it detects the text in the PDF immediately and doesn't require the file to be in Google Cloud Storage. Instead, the file can be sent base64-encoded in the request.

HTTP method and URL:

POST https://vision.googleapis.com/v1/files:annotate

Request JSON body:

{
  "requests": [
    {
      "inputConfig": {
        "content": "base64-encoded-file",
        "mimeType": "application/pdf"
      },
      "features": [
        {
          "type": "DOCUMENT_TEXT_DETECTION"
        }
      ],
      "pages": [
        1,2,3,4,5
      ]
    }
  ]
}

For more information on it visit https://cloud.google.com/vision/docs/file-small-batch
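The `content` field has to hold the file's bytes as a base64 string. One way to produce the body above from a local PDF is sketched here; `build_sync_request` is a hypothetical helper name, and the default page list simply repeats the pages from the example.

```python
import base64
import json


def build_sync_request(pdf_bytes, pages=(1, 2, 3, 4, 5)):
    """Return the JSON body for a synchronous files:annotate request.

    Hypothetical helper mirroring the JSON shown above; pdf_bytes is the
    raw content of the PDF file.
    """
    content = base64.b64encode(pdf_bytes).decode("ascii")
    return {
        "requests": [
            {
                "inputConfig": {
                    "content": content,
                    "mimeType": "application/pdf",
                },
                "features": [{"type": "DOCUMENT_TEXT_DETECTION"}],
                "pages": list(pages),
            }
        ]
    }


def body_from_file(path, pages=(1, 2, 3, 4, 5)):
    """Read a local PDF and serialize the request body to a JSON string."""
    with open(path, "rb") as handle:
        return json.dumps(build_sync_request(handle.read(), pages))
```

The resulting string can be sent as the body of the POST request above, together with an `Authorization: Bearer <your access token>` header.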
