We want to translate images found in PDF documents from various languages into English. The images are scans, and they often contain tables or other structure. We would like to preserve the structure of the document as much as possible, so a purely text-based translation doesn't suffice.

We saw that the Google Translate app on Android seems to do something similar with photos on a phone. Is there a Google Cloud API that does the same?

To do this on Google Cloud, which API should we use?

double-beep
cuddp

2 Answers

With Google Cloud products, you can achieve this by using OCR to extract the text and then the Translate API to translate it to English.

I suggest using Document AI for the OCR step, since that API is designed to parse forms and tables. See Document AI Table parsing and Document AI Document parsing for examples of how to use the API. You can then pass the extracted text to the Translate API.

High level steps:

  1. Use Document AI to extract data from the PDF files
  2. Use the Translate API to translate the extracted data to English
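
As a rough illustration of how the two steps fit together while keeping table structure, here is a minimal sketch. It assumes you already have the Document AI response in its JSON (dict) form, where each table cell points into the document-level `text` through `textAnchor.textSegments`. The function names `anchor_text` and `translate_tables` are hypothetical, and `translate` is a placeholder callable that you would replace with a wrapper around a Translate API call. Only tables are handled here; paragraphs and lines would need similar treatment.

```python
from typing import Callable, Dict, List


def anchor_text(full_text: str, layout: Dict) -> str:
    """Resolve a layout's textAnchor into the matching slice of the document text."""
    segments = layout.get("textAnchor", {}).get("textSegments", [])
    parts = []
    for seg in segments:
        # In the JSON form, indices are int64 values encoded as strings,
        # and startIndex is omitted when it is 0.
        start = int(seg.get("startIndex", 0))
        end = int(seg.get("endIndex", 0))
        parts.append(full_text[start:end])
    return "".join(parts).strip()


def translate_tables(
    document: Dict, translate: Callable[[str], str]
) -> List[List[List[str]]]:
    """Return every table as a list of rows, each row a list of translated cells.

    `translate` is a hypothetical stand-in for the real translation call.
    """
    full_text = document.get("text", "")
    tables = []
    for page in document.get("pages", []):
        for table in page.get("tables", []):
            # Header rows come first so the row order of the original is kept.
            rows = table.get("headerRows", []) + table.get("bodyRows", [])
            translated_rows = []
            for row in rows:
                cells = []
                for cell in row.get("cells", []):
                    text = anchor_text(full_text, cell.get("layout", {}))
                    # Skip the translation call for empty cells.
                    cells.append(translate(text) if text else "")
                translated_rows.append(cells)
            tables.append(translated_rows)
    return tables
```

Translating cell by cell keeps the row and column positions intact, so the result can be written back into whatever layout you choose (HTML table, spreadsheet, and so on). Rendering the translated table is still up to you.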
Ricco D
  • Thanks. A couple of questions: a) How do I preserve the structure of the original document? If it contains tables, how can I get back the original structure? Do I have to put the structure back together myself, or will these APIs do that, or will they just convert everything to raw text? b) Will the results be the same as those of the Translate app on Android? – cuddp Jul 21 '21 at 11:54
  • The output of the APIs is all text. The Document AI response breaks the document down into objects (pages, paragraphs, lines, tables). A table is divided into rows and then cells. Your data will be inside those cells, ready for extraction. How you use the extracted data is up to you. For reference, you can see the response format [here](https://cloud.google.com/document-ai/docs/reference/rest/v1/Document). – Ricco D Jul 22 '21 at 00:37
  • The quality of the results should be the same as the Android app's. But the API will just return a JSON response, and it will be up to you how you use or print the results. – Ricco D Jul 22 '21 at 00:37
  • FYI, there is an actively monitored tag for Document AI [`[cloud-document-ai]`](https://stackoverflow.com/questions/tagged/cloud-document-ai) – Holt Skinner Mar 28 '23 at 21:16
This answer will work to translate the text of the document, but it won't do it in place.

Here are some newer options:

Holt Skinner