This tag is for the Document AI product within Google Cloud Platform.
Questions tagged [cloud-document-ai]
200 questions
0
votes
1 answer
Document AI: Newly created labels aren't taken into account by Human in the Loop
Human in the Loop (HITL) worked fine until now; I trained a few versions of a custom extractor processor.
But when I added new labels to the schema, all the labels of type RECORD disappeared from the HITL interface even though they were correctly extracted.
I…

Xaillooof
0
votes
1 answer
Train a custom classifier on Document AI through API
I need to train a custom document classifier through the Document AI API, but I couldn't find anything related in the documentation or code samples. In particular, I was able to define a new custom processor but I don't know how to define my training…

pippofranco
0
votes
1 answer
How to recognize International Phonetic Alphabet characters during OCR
When running OCR on a dictionary PDF with Document AI, the text often includes IPA characters, e.g. ʷ and ə. Is there a way to recognize them correctly, such as by setting a certain language hint? Currently ʷ is recognized as w and ə as a.

jonah_w
0
votes
1 answer
Google Document AI model not reading document in JSON
I have been trying out the various processors (form parser, document OCR and the specialized ones). I am testing it on some purchase order PDFs and therefore using the "purchase order" processor. For some reason, the PDF is scanned and parsed…

figuringitout
0
votes
1 answer
OCR multi column text with Google Document AI
I have a document with text in two columns per page. When I uploaded a test file with this formatting, I noticed that the space between the columns was ignored and the text was recognized as if the page were a single column.
The data looks like…

medic17
0
votes
1 answer
How to encourage text detection in Google Cloud Document AI
While using the Google Cloud GUI to label some documents for Document AI training, I received the error message "Cannot create labels with empty values".
The image very clearly shows a well-contrasted, printed 1929. The OCR itself is…

Stephen
0
votes
1 answer
When training and testing a Document AI project, what influences the F1 score?
Using the cloud console I trained a model using only one field (to avoid the UI bug that was stopping training altogether) on one set of data. The model achieved an F1 score of 0.306 with 50 training images and 50 test images.
I added 150 training images, which…

Stephen
0
votes
1 answer
Is Using Two Custom Processors for Document AI Efficient?
What are the best practices for extracting data from multi-sided documents?
I need to extract data from both sides of ID Cards. My current approach is to use two separate custom-trained processors, one for the front side and another for the back…

Daniel Klimek
0
votes
1 answer
How to set language_hints when using Google's Document AI to detect text
I'm using Google's Document AI to detect text in an image file. While the general output is pretty good, if the image contains a mix of Chinese and English text, some of the Chinese characters can't be detected and are omitted, so I was…

jonah_w
0
votes
0 answers
document.pages is empty when sending a Document AI batch process documents request
I'm using the following Google sample code to batch-process documents with Document AI OCR:
https://github.com/GoogleCloudPlatform/python-docs-samples/blob/HEAD/documentai/snippets/batch_process_documents_sample.py
Instead of print(document.text), I was…

jonah_w
0
votes
1 answer
Google Document AI does not output blocks in the right order
I'm trying to OCR this image using Google's Document AI, but it seems to output the text in completely the wrong order.
Here's the image:
The output is as follows:
piece of clothing,
= do up
zip up something
with difficulty.
zip something up
you…

jonah_w
0
votes
0 answers
"Certificate Verify Failed" error when using Google's Document AI
I am using Google's Document AI to do OCR on PDFs. I have been using this script for close to a year without issue, but now it gives me a "certificate verify failed" error that I am unable to find any support on either in Google's…
0
votes
1 answer
How can I resolve a 'Permission Denied' error while using the Document AI API with OAuth 2.0?
I am having an issue with Document AI authorization. When I create an access token and send it to the Document AI API, I get a Permission Denied error. OAuth 2.0 only works for users who are added as an owner in IAM…
0
votes
1 answer
Google Document AI training fails with an error for fields that don't exist
I am currently training a new document processor with Google's Document AI. I have 16 training documents and 10 testing documents, which is easily within the minimums specified by Google. However, when I attempt to train the processor,…

Doug Niccum
0
votes
1 answer
Convert ProcessResponse to JSON and then back to a ProcessResponse object again
I'm working on a Document AI implementation where we send multiple related requests, gather the responses, and recombine them. This is all done in C#. As part of this I need to convert a ProcessResponse to a JSON string and then…

mpmarven