Google DocumentAI -> ValueError: Protocol message Document has no "file" field

Question

In my script, I have the following:

response = requests.get(list_url[0], allow_redirects=True)
s = io.BytesIO()
s.write(response.content)
s.seek(0)
mimetype="application/octet-stream"
document = {'file': s.read(), 'mime': mimetype}
request = {"name": name, "document": document}

However, when I send a request to the server:

result = client.process_document(request=request)

I get ValueError: Protocol message Document has no "file" field.

Is this because Google Document AI doesn't accept octet-stream?

Betjens · Accepted Answer · 2022-04-01T12:34:14.613

I checked the latest code of the Document AI Python client, DocumentProcessorServiceClient, and found that process_document takes a ProcessRequest object in its request field. You can check the details of that function on the process_document GitHub code page.

ProcessRequest accepts either an inline_document or a raw_document (the two are mutually exclusive). Based on your code, it looks like you mean to pass a raw_document, which accepts only the fields content and mime_type. Use these instead of file and mime, and pass the dict under the raw_document key.

If you check the sample for the Document AI Python client library, you will find these lines, which show how it should be implemented:

...
    document = {"content": image_content, "mime_type": "application/pdf"}

    # Configure the process request
    request = {"name": name, "raw_document": document}

    result = client.process_document(request=request)
...
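
Applied to the code in the question, the fix could look like the sketch below. It is a minimal illustration rather than the official sample: build_process_request is a hypothetical helper, the bytes and the processor name are placeholders, and application/pdf is an assumption about the downloaded file. The call to process_document is left as a comment because it needs a configured client and credentials.

```python
def build_process_request(name: str, content: bytes, mime_type: str) -> dict:
    """Build a dict shaped like a ProcessRequest that carries a raw_document.

    Hypothetical helper for illustration; it is not part of the client library.
    """
    # raw_document uses "content" and "mime_type", not "file" and "mime".
    raw_document = {"content": content, "mime_type": mime_type}
    return {"name": name, "raw_document": raw_document}


# Stand-in for response.content from the question's requests.get(...) call.
downloaded_bytes = b"%PDF-1.4 placeholder bytes"

# Placeholder processor resource name; substitute your own values.
name = "projects/PROJECT_ID/locations/LOCATION/processors/PROCESSOR_ID"

# Assumption: the downloaded file is a PDF. Use the MIME type of the real file.
request = build_process_request(name, downloaded_bytes, "application/pdf")

# With a configured client, the request would then be sent as in the sample:
# result = client.process_document(request=request)
```

The ValueError is raised while the dict is being converted into the protobuf message, before anything is sent, which suggests the field names are the cause rather than the MIME type. Whether application/octet-stream is accepted is a separate question that this sketch does not settle.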

For additional details, you can check the official GitHub project for Document AI and the official Google page for the Python client library.

Hello @An old man in the sea, were you able to address your issue? — Betjens, Apr 01 '22 at 15:16
Hi Betjens, yes I was, but in a roundabout way. Regardless, I decided to accept your answer. Thanks for the help. ;) — An old man in the sea., Apr 05 '22 at 17:43
